1. Introduction to the DGA Algorithm
The Gcenter embeds an engine capable of detecting domain names generated by a DGA (Domain Generation Algorithm). The resolution of DGA-generated domain names on a network is a strong indicator of compromise.
Indeed, malware can send HTTP requests to automatically generated domain names to contact its command and control servers, also called CnC, C&C, or C2. These domain names have different properties from legitimate domain names. Conventional detection approaches, such as blacklists, are ineffective against continuously renewed domains, and simple entropy calculations produce a large number of false positives.
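To illustrate why entropy alone is a weak signal, the following minimal sketch computes the Shannon entropy of a domain label. It is not part of the Gcenter engine; the function name and the sample labels are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Return the Shannon entropy of a string, in bits per character."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    # Sum -p * log2(p) over the relative frequency p of each character.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical labels: a short legitimate name, a random-looking name,
# and a long but legitimate-looking host name.
for label in ("google", "xkqzvwpjtrhd", "cdn-static-assets-eu1"):
    print(f"{label}: {shannon_entropy(label):.2f} bits/char")
```

With such a measure, a long legitimate label made of many distinct characters can score almost as high as a random one, so a fixed entropy threshold tends to flag legitimate names.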
2. Activation
Menu: Administrators > GCENTER > ML Management > DGA Detection Management > Settings
This feature is disabled by default. It can be enabled on the Machine Learning dashboard.
Once activated, the domain names present in the 'dns' events captured by the GCAP probes are analysed by the machine learning engine. For each such event, the engine calculates the probability that the domain name was generated by a DGA. It uses a pre-trained model whose architecture is a deep neural network of the Long Short-Term Memory (LSTM) type.
The engine only uses domain names. No additional contextual information, such as NXDomain responses, is involved.
3. Exception list
Menu: Administrators > GCENTER > ML Management > DGA Detection Management > White List / Black List
Exception lists can be set up to force the engine to declare domain names as healthy (White List). This eliminates alerts caused by recurring false positives.
Conversely, a black list raises an alert for a domain that would not otherwise have been detected (false negative).
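The effect of the two lists on the final decision can be sketched as follows. This is only an illustration of the override logic described above, not the Gcenter implementation: the function name, the exact-match comparison, the precedence of the white list over the black list, and the 0.5 threshold are all assumptions.

```python
def should_alert(domain, dga_probability, white_list, black_list, threshold=0.5):
    """Decide whether a domain raises an alert (hypothetical logic).

    Assumptions: exact matching on the normalised name, white list checked
    before black list, and an arbitrary probability threshold.
    """
    name = domain.strip().lower().rstrip(".")
    if name in white_list:
        # Forced healthy: suppresses a recurring false positive.
        return False
    if name in black_list:
        # Forced alert: covers a known false negative.
        return True
    # Otherwise rely on the score produced by the model.
    return dga_probability >= threshold


# Hypothetical lists and scores.
white = {"cdn-static-assets-eu1.example"}
black = {"update-check.example"}
print(should_alert("cdn-static-assets-eu1.example", 0.97, white, black))
print(should_alert("update-check.example", 0.03, white, black))
```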
From Add a single domain name, a domain can be added to the Machine Learning whitelist via the Domain name field. A remark about the added domain can be entered in the Comment field.
The changes are saved by clicking the Save button.
From Add a set of domain names, the administrator updates the Machine Learning whitelist via the List of domain names field by selecting a CSV file containing the domains. The elements of the list must be separated by ';'.
The administrator can also delete the previous list by ticking the Clean previous list? box, then record all changes by clicking Save.
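A file for this import could be prepared with a short script such as the one below. It is a minimal sketch: only the ';' separator comes from the description above, while the single-line layout, the normalisation steps, and the function name are assumptions that should be checked against the format the interface actually accepts.

```python
def build_domain_list(domains):
    """Join domain names with ';' after basic clean-up.

    Assumption: the import expects the domains separated by ';';
    lowercasing, trimming, and de-duplication are illustrative only.
    """
    cleaned = []
    for raw in domains:
        name = raw.strip().lower().rstrip(".")
        # Skip empty entries and duplicates while preserving order.
        if name and name not in cleaned:
            cleaned.append(name)
    return ";".join(cleaned)


# Hypothetical input list.
print(build_domain_list([" Example.com ", "example.com.", "intranet.local"]))
```

The resulting string can then be written to a file and selected in the List of domain names field.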
4. Generated events
The machine learning engine enriches the information already provided by the Sigflow module. Thus, for a domain that is not detected as a generated domain, the dga_probability field will be added. A value close to 0 indicates a low probability the domain was generated as in the following example:
On the other hand, a value close to 1 indicates that there is a good chance this domain was the result of a random generation as in this case:
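Separately from the product's own events, the sketch below shows how a downstream script might triage exported events on this field. It assumes the events are available as one JSON object per line carrying a dga_probability field; the query field name, the sample values, and the 0.9 threshold are hypothetical and not taken from the Gcenter event format.

```python
import json


def flag_dga_events(lines, threshold=0.9):
    """Return the events whose dga_probability is at or above the threshold.

    Assumption: each non-empty line is a JSON object; events without a
    numeric dga_probability field are ignored.
    """
    flagged = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        probability = event.get("dga_probability")
        if isinstance(probability, (int, float)) and probability >= threshold:
            flagged.append(event)
    return flagged


# Hypothetical events, for illustration only.
sample = [
    '{"query": "intranet.example", "dga_probability": 0.02}',
    '{"query": "xkqzvwpjtrhd.example", "dga_probability": 0.98}',
    '{"query": "no-score.example"}',
]
for event in flag_dga_events(sample):
    print(event["query"], event["dga_probability"])
```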