In this short blog, I share seven papers that focus on detecting Dictionary Domain Generation Algorithm (DGA) domains, also known as word-based DGAs. Dictionary DGAs are algorithms seen in various malware families (suppobox, matsnu, gozi, rovnix, etc.) that periodically generate large numbers of domain names by pseudo-randomly concatenating words from a dictionary. These domains may appear legitimate at first glance and are often able to evade blacklisting as well as traditional DGA detections based on entropy or on counts of consonants vs. vowels. Below is a small sample of rovnix domains from Unit 42’s blog post.
- kingwhichtotallyadminis[.]biz
- thareplunjudiciary[.]net
- townsunalienable[.]net
- taxeslawsmockhigh[.]net
- transientperfidythe[.]biz
- inhabitantslaindourmock[.]cn
- thworldthesuffer[.]biz
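To make the idea concrete, here is a minimal Python sketch of a toy dictionary DGA alongside the two classic features mentioned above (character entropy and vowel ratio). It is not the algorithm of any real malware family: the word list, the TLD list, the seed format, and the function names `toy_dictionary_dga`, `shannon_entropy`, and `vowel_ratio` are all assumptions made for illustration.

```python
import hashlib
import math
from collections import Counter
from typing import List

# Hypothetical word list and TLDs; real families ship their own dictionaries.
WORDS = ["king", "which", "taxes", "laws", "town",
         "world", "suffer", "judiciary", "transient", "mock"]
TLDS = ["net", "biz"]


def toy_dictionary_dga(seed: str, count: int, words_per_domain: int = 3) -> List[str]:
    """Deterministically derive `count` word-based domains from a seed string."""
    domains = []
    for i in range(count):
        # Hash the seed and a counter to get reproducible pseudo-random bytes.
        digest = hashlib.sha256(f"{seed}:{i}".encode()).digest()
        picked = [WORDS[digest[j] % len(WORDS)] for j in range(words_per_domain)]
        tld = TLDS[digest[words_per_domain] % len(TLDS)]
        domains.append("".join(picked) + "." + tld)
    return domains


def shannon_entropy(label: str) -> float:
    """Character-level Shannon entropy in bits per character."""
    counts = Counter(label)
    total = len(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def vowel_ratio(label: str) -> float:
    """Fraction of alphabetic characters that are vowels."""
    letters = [c for c in label.lower() if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c in "aeiou" for c in letters) / len(letters)


if __name__ == "__main__":
    # Compare toy word-based labels with an arbitrary random-looking label.
    samples = toy_dictionary_dga("2024-01-01", 3) + ["xkq7vz2pw9hd.net"]
    for domain in samples:
        label = domain.split(".")[0]
        print(domain, round(shannon_entropy(label), 2), round(vowel_ratio(label), 2))
```

Because the generated labels are built from English words, their character statistics tend to resemble those of ordinary domain names, which is why simple thresholds on features like these are a weak signal for this class of DGA.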
Papers:
- Real-Time Detection of Dictionary DGA Network Traffic using Deep Learning
- A Word Graph Approach for Dictionary Detection and Extraction in DGA Domain Names
- Dictionary Extraction and Detection of Algorithmically Generated Domain Names in Passive DNS Traffic
- Inline Detection of Domain Generation Algorithms with Context-Sensitive Word Embeddings
- An Evaluation of DGA Classifiers
- A Novel Detection Method for Word-Based DGA
- A Word-Level Analytical Approach for Identifying Malicious Domain Names Caused by Dictionary-Based DGA Malware
In a previous post, I shared details on several models that are also effective at detecting dictionary DGA domains. Please see Auxiliary Loss Optimization for Hypothesis Augmentation for DGA Domain Detection.
Lastly, if you’d like to discover more papers like these, try the method I outlined here.
–Jason
@jason_trost
The “short links” format was inspired by O’Reilly’s Four Short Links series.