Implementing RITA using KQL
RITA is an open-source network traffic analysis tool for detecting C2. In this post, I'll explain how its beacon analyzer works and implement the same algorithm in Azure Sentinel using KQL.
RITA analyzes the network traffic between each source-destination pair and scores it. The scoring algorithm computes two scores, one for connection timing (ConnectionScore) and one for data sizes (DataSizeScore), and then a final score that is calculated from those two.
Beacons have symmetric time delta and data size distributions. The Bowley skewness formula is used to measure the symmetry of the time deltas (tsSkew), and a score, tsSkewScore, is derived from it. The score is calculated as 1 - |tsSkew|, so a perfectly symmetric distribution scores 1. Here is a short video that explains the Bowley skewness calculation:
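
To make the skewness score concrete, here is a minimal Python sketch (not the RITA source and not the KQL query). The function names, the index-based quartile selection, and the choice to return a skew of 0 when the quartile range is 0 are assumptions made for illustration.

```python
def bowley_skew(values):
    """Bowley skewness: (Q1 + Q3 - 2*Q2) / (Q3 - Q1).

    Quartiles are picked by index from the sorted data (an assumption;
    other quartile conventions give slightly different results).
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        return 0.0

    def quartile(p):
        # Nearest index to p * (n - 1), rounding halves up.
        return data[int(p * (n - 1) + 0.5)]

    q1, q2, q3 = quartile(0.25), quartile(0.5), quartile(0.75)
    denominator = q3 - q1
    if denominator == 0:
        # Assumption: treat a zero quartile range as "no skew".
        return 0.0
    return (q1 + q3 - 2 * q2) / denominator


def skew_score(values):
    """1 - |skew|: 1.0 for a symmetric distribution, 0.0 for a fully skewed one."""
    return 1.0 - abs(bowley_skew(values))


# Time deltas (seconds) of a steady 60-second beacon: symmetric, score 1.0
print(skew_score([58, 59, 60, 61, 62]))
# Irregular traffic: Q1=2, Q2=3, Q3=10, skew 0.75, score 0.25
print(skew_score([1, 2, 3, 10, 50]))
```

The same function can be applied to data sizes to get dsSkewScore.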
In addition to low skewness, beacons have very low dispersion around the median of their time deltas. The Median Absolute Deviation about the Median (MADM) is used to measure this dispersion, and a score, tsMADMScore, is calculated from it. The score is calculated as 1 - tsMADM / 30.0. 30 seconds is taken as the threshold for low dispersion: if the MADM is greater than 30 seconds, the score will be 0. Here is a video that explains how MADM is calculated with a scientific calculator:
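
The MADM score can be sketched in a few lines of Python. This is an illustration of the arithmetic only; the function names are hypothetical, and `statistics.median` is used here as a stand-in for however the median is picked in RITA or in the KQL query.

```python
from statistics import median


def madm(values):
    """Median of the absolute deviations from the median."""
    med = median(values)
    return median(abs(v - med) for v in values)


def madm_score(values, threshold):
    """1 - MADM / threshold, floored at 0 once MADM exceeds the threshold."""
    return max(0.0, 1.0 - madm(values) / threshold)


# Tight 60-second beacon: MADM is 1 second, so the score is 1 - 1/30
print(madm_score([58, 59, 60, 61, 62], 30.0))
# Widely spread deltas: MADM is 100 seconds, so the score is 0
print(madm_score([10, 100, 200, 300, 400], 30.0))
```

Passing 30.0 as the threshold gives tsMADMScore; the same function with a 32-byte threshold would give the data size counterpart.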
Lastly, beacons have a relatively high connection count compared to normal traffic. Therefore, a score, ConnectionCountScore, is calculated from the connection count and the total duration of the traffic (the duration is the time between the first and last connection timestamps). The score is calculated as follows:
ConnectionCountScore = ConnectionCount / (duration in seconds / 10)
If the score is greater than 1, it is capped at 1. The idea is that the more frequent the connections, the more likely the traffic is beaconing. The reference frequency is assumed to be one connection every 10 seconds.
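
As a rough Python sketch of the connection count score described above: the connection count is divided by the number of 10-second intervals between the first and last connection, and the result is capped at 1. The function and parameter names are hypothetical, and returning 0 for a zero-length duration is an assumption.

```python
def connection_count_score(first_ts, last_ts, connection_count,
                           reference_interval=10.0):
    """Connections observed per expected 10-second slot, capped at 1.0.

    first_ts and last_ts are epoch timestamps in seconds.
    """
    duration = last_ts - first_ts
    if duration <= 0:
        # Assumption: a single connection (no duration) gets no credit.
        return 0.0
    expected_connections = duration / reference_interval
    return min(1.0, connection_count / expected_connections)


one_day = 24 * 60 * 60
# One connection per minute over a day: 1440 / 8640, about 0.167
print(connection_count_score(0, one_day, 1440))
# One connection every 10 seconds (or more often): capped at 1.0
print(connection_count_score(0, one_day, 8640))
```

Note how a beacon that sleeps for a minute already scores low on this component, since the reference is one connection every 10 seconds.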
The ConnectionScore is calculated as (tsSkewScore + tsMADMScore + ConnectionCountScore) / 3.0. The result is multiplied by 1000, rounded, and divided by 1000 so that the score is represented with 3 digits after the decimal point.
The data size skewness score (dsSkewScore) and MADM score (dsMADMScore) are calculated with the same methods explained above. The dsMADMScore of the data sizes is calculated by taking 32 bytes as the threshold. In addition, a data size smallness score, dsSmallnessScore, is calculated, since beacons generally have small data sizes. dsSmallnessScore is calculated as:
dsSmallnessScore = 1 - Mode / 65535
If the Mode of the data sizes is greater than 65Kb, the score will be 0. The Mode is the value that occurs most often in a data set. Here is a short video that explains the Mode in statistics:
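
The smallness score can be illustrated with a short Python sketch. It assumes that "65Kb" means 65535 bytes and that the score falls linearly from 1 to 0 as the mode grows, mirroring the MADM scores; the function name is hypothetical.

```python
from statistics import mode


def smallness_score(data_sizes, max_size=65535.0):
    """1 - mode / max_size, floored at 0 for modes above max_size.

    max_size=65535 is an assumption for the "65Kb" limit in the text.
    """
    most_common_size = mode(data_sizes)
    return max(0.0, 1.0 - most_common_size / max_size)


# Mostly 512-byte check-ins: the mode is 512, so the score is close to 1
print(smallness_score([512, 512, 512, 600, 4096]))
# Large transfers dominate: the mode is 70000 bytes, so the score is 0
print(smallness_score([70000, 70000, 100]))
```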
The DataSizeScore is calculated as (dsSkewScore + dsMADMScore + dsSmallnessScore) / 3.0. The result is multiplied by 1000, rounded, and divided by 1000 so that the score is represented with 3 digits after the decimal point.
The final score is calculated by taking the average of ConnectionScore and DataSizeScore: (ConnectionScore + DataSizeScore) / 2.0
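
Putting the pieces together, the following Python sketch combines six already-computed sub-scores into ConnectionScore, DataSizeScore, and the final score. The function names are hypothetical, and Python's `round(x, 3)` stands in for the multiply-by-1000/divide-by-1000 step; the exact rounding mode is an assumption.

```python
def connection_score(ts_skew_score, ts_madm_score, connection_count_score):
    """Average of the three timing sub-scores, kept to 3 decimal places."""
    return round((ts_skew_score + ts_madm_score + connection_count_score) / 3.0, 3)


def data_size_score(ds_skew_score, ds_madm_score, ds_smallness_score):
    """Average of the three data size sub-scores, kept to 3 decimal places."""
    return round((ds_skew_score + ds_madm_score + ds_smallness_score) / 3.0, 3)


def final_score(conn_score, size_score):
    """Average of ConnectionScore and DataSizeScore."""
    return round((conn_score + size_score) / 2.0, 3)


# Hypothetical sub-scores for one source-destination pair
conn = connection_score(1.0, 0.5, 0.3)   # 0.6
size = data_size_score(1.0, 1.0, 0.7)    # 0.9
print(conn, size, final_score(conn, size))  # final score 0.75
```

Because every sub-score carries the same weight, a single zeroed component removes a sixth of the maximum final score.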
A more detailed explanation can be found in the RITA beacon analyzer code:
RITA has a potential for false negatives:
If the MADM of the connection intervals is greater than 30 seconds, the tsMADMScore will be 0.
If the MADM of the data sizes is greater than 32 bytes, the dsMADMScore will be 0.
A beacon with a 10-minute sleep and 20% jitter might have a MADM greater than 30 seconds. This lowers the overall score, which can cause you to overlook the beacon. I've already explained this situation in my previous blog:
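
The false negative scenario can be reproduced with a small Python sketch. It assumes the jitter spreads the sleep evenly across 600 seconds plus or minus 20% (480 to 720 seconds); real implants may apply jitter differently, so treat the numbers as illustrative only.

```python
from statistics import median


def madm(values):
    """Median of the absolute deviations from the median."""
    med = median(values)
    return median(abs(v - med) for v in values)


# Assumed time deltas for a 10-minute sleep with 20% jitter:
# evenly spread between 480 and 720 seconds in 10-second steps.
deltas = list(range(480, 721, 10))

ts_madm = madm(deltas)
ts_madm_score = max(0.0, 1.0 - ts_madm / 30.0)

print(ts_madm)        # 60, twice the 30-second threshold
print(ts_madm_score)  # 0.0
```

Under these assumptions the beacon is perfectly regular in a statistical sense, yet its dispersion component contributes nothing to the ConnectionScore.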
KQL doesn't have built-in functions to calculate the MADM, Mode, and Skewness of a data set. Fortunately, it is possible to calculate these values by combining built-in functions and iterating over the same data set. Using the same approach as in my previous beaconing detection, I've developed a KQL query that implements the same scoring algorithm.
Since it's quite difficult to explain how it works here, I've put the explanations in the KQL query itself. I've used the VMConnection table in the so that you can run it there and see how it works easily. I've also added a quick guide on how to customize the query based on the logs you have.
You can find the RITA query in .
Neither the standard deviation and jitter calculation nor RITA seems perfect. I'm working on combining both approaches and making some additions to improve the detection.
Hopefully, I'll make improvements to the RITA query and write a new blog soon.
Happy hunting!