Introduction
VMRay Analyzer version 4.5 adds the capability to extract malware configurations. In this blog post we take a deep dive into malware configurations: what they are, how they can be used, and how VMRay Analyzer extracts and presents them.
Figure 1. The malware configuration of an Agent Tesla sample, extracted by VMRay Analyzer 4.5
How Do I Use an Extracted Malware Configuration?
The configuration of a malware sample defines how the malware behaves. Automatically extracting the configuration brings many benefits to defenders.
Malware configurations contain the highest-fidelity IOCs that can be generated automatically. These IOCs can be used for threat hunting or even blocking. The malware’s configuration includes all of its C2 addresses, and often other indicators such as registry keys, mutexes and filenames. If we are defending an organization, scanning the logs for these IOCs can reveal attacks that would otherwise remain hidden, and blocking the extracted URLs can cut the malware off from its C2.
The configuration often completely describes the malware’s behavior: it can show activated features such as keylogging, or that the logged data is exfiltrated every hour over a Telegram bot. During a sandbox execution the observed behavior may be incomplete because the run is short, but the extracted configuration still reveals what the malware would do if it were allowed to run longer on a real host.
Extracted malware configurations provide extremely high-confidence malware family classification. This is because configuration extraction is family-specific and relies on correct family classification. Once the family is identified, the right config extractor can do the necessary de-obfuscation and family-specific data parsing. If the family had been identified incorrectly, the configuration extraction would have failed as well. A successful configuration extraction therefore proves that the underlying family classification was correct, and not just the result of a weak YARA rule causing a false positive.
Extracting malware configurations at scale can reveal connections among samples and give deeper insight into a malware family and its development. Similar configuration values, such as reused cryptographic keys, can be used to group samples and to discover that two seemingly unrelated incidents belong to the same botnet or even the same actor.
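As a rough illustration of the log-scanning idea, the following minimal Python sketch searches log lines for extracted IOCs. The function name, the IOC values and the log lines are all hypothetical and only serve the example; a real deployment would more likely query a SIEM than scan raw text.

```python
from typing import Iterable


def find_ioc_hits(log_lines: Iterable[str], iocs: Iterable[str]) -> list[tuple[int, str, str]]:
    """Return (line_number, ioc, line) for every log line that mentions an IOC."""
    # Compare case-insensitively: hostnames and registry paths vary in case.
    needles = [(ioc, ioc.lower()) for ioc in iocs if ioc]
    hits = []
    for number, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        for ioc, needle in needles:
            if needle in lowered:
                hits.append((number, ioc, line.rstrip("\n")))
    return hits


# Hypothetical IOCs, as they might come out of an extracted configuration.
extracted_iocs = ["c2.example.net", "Global\\ExampleMutex01"]

# Hypothetical proxy and endpoint log lines.
sample_log = [
    "2022-03-01T10:00:01 host-17 GET http://intranet.example.org/index.html",
    "2022-03-01T10:00:05 host-17 CONNECT C2.example.net:443",
    "2022-03-01T10:00:09 host-23 CreateMutex Global\\ExampleMutex01",
]

for number, ioc, line in find_ioc_hits(sample_log, extracted_iocs):
    print(f"line {number}: matched {ioc!r}")
```

Plain substring matching keeps the sketch short, but it can over-match (for example, a short IOC that happens to appear inside a longer, benign hostname), so hits should be treated as leads to review rather than verdicts.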
Figure 2. Years of Ursnif samples clustered based on data extracted from their configuration in a VMRay-SANS webcast last year
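The grouping idea can be sketched in a few lines of Python. This is a simplified illustration, not the method used for the clustering in the figure: the configuration layout, the field name `encryption_key` and all values are hypothetical.

```python
from collections import defaultdict


def group_by_shared_value(configs: dict[str, dict[str, str]], field: str) -> dict[str, list[str]]:
    """Map each value of `field` to the sorted sample IDs whose configuration contains it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for sample_id, config in configs.items():
        value = config.get(field)
        if value is not None:
            groups[value].append(sample_id)
    # Keep only values shared by at least two samples: those suggest a link.
    return {value: sorted(ids) for value, ids in groups.items() if len(ids) > 1}


# Hypothetical, simplified configurations keyed by sample ID.
configs = {
    "sample-a": {"encryption_key": "key-1111", "c2": "a.example.net"},
    "sample-b": {"encryption_key": "key-2222", "c2": "b.example.net"},
    "sample-c": {"encryption_key": "key-1111", "c2": "c.example.net"},
}

clusters = group_by_shared_value(configs, "encryption_key")
print(clusters)
```

Here `sample-a` and `sample-c` end up in one group because they share a key, even though their C2 addresses differ.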
What are Malware Configurations?
To understand malware configurations, we should first look into how malware is typically generated with malware builders.
Countless different malware samples are used in the wild every day, but they are not as different as they might first seem. Malware development is a long, resource-intensive process, and it’s almost never worth it for a malware developer to create malware just for a single attack. Instead, malware developers create so-called malware builders. People using the builder can conveniently configure and generate new malware that fits their needs. The collection of malware samples generated by the same builder is referred to as a malware family.
A malware builder allows its user to configure options that make the malware sample unique: which C2 URLs to connect to, what malicious behaviors are enabled, how persistence is achieved, how to exfiltrate data, what evasion methods are enabled, and anything else that the malware developer implemented. The builder also adds its own automatically generated data, such as encryption keys. All of this valuable configuration data is stored somewhere within the malware, typically obfuscated. VMRay Analyzer now automatically extracts and parses these configurations for supported malware families.
Figure 3. NanoCore malware builder graphical interface
Results of VMRay Analyzer’s Configuration Extraction
VMRay Analyzer extracts configurations from supported families and presents them in three ways:
As a table in the user interface,
As input to the VMRay IOC feature, which distinguishes actionable IOCs from artifacts,
As downloadable JSON files in a standard format defined by the US Defense Cyber Crime Center in the MWCP project.
Figure 4. Extracted configuration for the malware NanoCore as seen on the analysis overview
As an example, the configuration above shows the data that was extracted from a NanoCore sample. It first includes common data, such as the version of the malware, its mutex and the socket used to connect home, and timings such as how much time to wait between C2 connections. The table also shows family-specific data, such as which of the malicious features are enabled. In the table we can see that the malware is configured to execute on startup, attempts to bypass UAC, clears the Zone identifier, prevents the system from going to sleep, and uses 8.8.8.8 as a DNS server instead of the one configured in Windows.
High-quality automated IOC generation
VMRay Analyzer distinguishes between artifacts and IOCs. In VMRay’s terminology:
Artifacts are all files, filenames, URLs, IPs, registry keys and mutexes that we saw during a dynamic analysis, or extracted statically.
IOCs are a subset of these artifacts that our rules found to be important, and can be useful in describing the threat.
This allows filtering out useless sandbox artifacts, and having only a list of IOCs that matter.
The configuration extractors also add new artifacts and IOCs based on the information they find within the malware. For example, in the extraction above we found two IOCs: a URL and a mutex. The extractor has added both of them as malicious IOCs.
This means that when the family is supported by a configuration extractor, we will have all IOCs that were part of the configuration, such as URLs, even if there was not enough time to reach them during the sandbox execution. The differentiation between IOCs and artifacts also becomes much more accurate: for example, the extractors allow us to tell benign network connections apart from callbacks to C2 URLs found in the configuration.
Figure 5. URL IOC added by the configuration extractor
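The effect of configuration data on the artifact/IOC split can be illustrated with a small Python sketch. It is a deliberately simplified model, not VMRay's actual rule set: here the configuration is the only source of IOCs, and the `Artifact` type, the `classify` function and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    kind: str   # e.g. "url", "mutex", "filename"
    value: str


def classify(observed: set[Artifact], from_config: set[Artifact]) -> tuple[set[Artifact], set[Artifact]]:
    """Split artifacts into (iocs, remaining_artifacts).

    Everything found in the configuration counts as an IOC, whether or not
    it was observed during execution. Observed artifacts that are absent
    from the configuration stay plain artifacts in this simplified model.
    """
    iocs = set(from_config)
    remaining = observed - from_config
    return iocs, remaining


# Hypothetical data: what the sandbox saw versus what the configuration contains.
observed = {
    Artifact("url", "http://updates.example.org/check"),   # benign connection
    Artifact("url", "http://c2-one.example.net/gate"),     # C2 reached during the run
}
from_config = {
    Artifact("url", "http://c2-one.example.net/gate"),
    Artifact("url", "http://c2-two.example.net/gate"),     # never reached during the run
    Artifact("mutex", "ExampleMutex01"),
}

iocs, remaining = classify(observed, from_config)
print(sorted(a.value for a in iocs))
print(sorted(a.value for a in remaining))
```

The second C2 URL becomes an IOC although the sample never contacted it, while the benign update check stays an ordinary artifact.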
Supporting malware analysis at scale
When analyzing a huge number of malware samples, we want to receive malware configurations in a well-defined, predictable, industry-standard format that can be easily integrated into a security system. After researching all the options we could find, we settled on the output format defined by the US Defense Cyber Crime Center’s MWCP project.
The format has many advantages that we think make it the best choice:
The file format is not vendor-specific: it was designed to be portable among security products. This makes it easier to integrate VMRay Analyzer into an existing system than if we used a custom file format.
Authors of the file format have a realistic view of malware configurations: they realize that certain types of configuration elements such as C2 URLs and encryption keys appear among different malware families, but they also leave room for adding data that is specific to the malware family using the type “other”.
MWCP is an open-source project with community support. Other formats were typically proprietary to a single sandbox project, or were open-source but abandoned, or just very young.
The format definition is complete and verifiable with an open source JSON schema.
Figure 6. MWCP-style configuration JSON
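For integration, a consumer typically loads the JSON and pulls out the values it cares about. The sketch below does not rely on the MWCP field names, which are defined by the project's schema and not reproduced here. Instead it walks the decoded JSON generically and collects every string that parses as an http(s) URL. The embedded document is a hypothetical, simplified layout used only to exercise the code.

```python
import json
from urllib.parse import urlsplit


def iter_strings(node):
    """Yield every string value found anywhere in a decoded JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def collect_urls(document) -> list[str]:
    """Return the unique http(s) URLs in the document, in order of appearance."""
    seen: list[str] = []
    for text in iter_strings(document):
        try:
            parts = urlsplit(text)
        except ValueError:
            continue  # not parseable as a URL at all
        if parts.scheme in ("http", "https") and parts.netloc and text not in seen:
            seen.append(text)
    return seen


# Hypothetical layout, NOT the real MWCP schema.
raw = """
{
  "family": "ExampleFamily",
  "entries": [
    {"kind": "url", "value": "http://c2.example.net/gate.php"},
    {"kind": "mutex", "value": "ExampleMutex01"},
    {"kind": "other", "value": {"backup": "https://backup.example.net/"}}
  ]
}
"""

urls = collect_urls(json.loads(raw))
print(urls)
```

A production integration would instead validate the file against the published JSON schema and read the typed fields directly; the generic walk is only a way to stay independent of field names in this example.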
VMRay’s Malware Configuration Extraction: The Underlying Data
The secret sauce that makes VMRay’s malware configuration extraction work well is the very high quality of the underlying data, produced by an elaborate monitoring system. Malware developers are aware that the configuration data is valuable, and often try to hide it with layers of obfuscation and evasion. To extract the configuration, some de-obfuscation and parsing steps are done by the sandbox’s monitor, and the final parsing steps must be implemented manually by the VMRay Labs team, family by family. Since the data produced by the sandbox’s monitor is of very high quality, there are fewer steps left to implement manually, and extraction becomes more robust and resistant to changes. VMRay’s monitoring technology helps extractors in three core ways:
Our Smart Memory Dumping feature generates memory dumps that bring the most value with a minimal impact on performance.
VMRay configuration extractors don’t just rely on memory dumps, but also use all other data produced by the malware during its execution. This allows our configuration extractors to handle cases where the configuration is not within the executed malware, such as when an updated configuration is received from the C2 server, or when the configuration is provided as a command line option, as with miners like XMRig.
VMRay’s hypervisor-based monitoring also brings benefits: it’s quite resistant to sandbox evasion techniques, and it provides an accurate log of API calls that extractors can use to get data and memory addresses that would be challenging to find otherwise.
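The command line case can be made concrete with a small Python sketch. It is an illustration only, not VMRay's extractor: it assumes the monitor has logged the full command line as a single string, and it handles just three of XMRig's options (`-o`/`--url`, `-u`/`--user`, `-p`/`--pass`). The function name and the sample command line are hypothetical.

```python
import shlex

# XMRig options of interest, mapped to a normalized name.
SHORT_FLAGS = {"-o": "url", "-u": "user", "-p": "pass"}
LONG_FLAGS = {"--url": "url", "--user": "user", "--pass": "pass"}


def parse_miner_cmdline(cmdline: str) -> dict[str, str]:
    """Extract pool URL, user and password from a logged miner command line."""
    tokens = shlex.split(cmdline)
    config: dict[str, str] = {}
    index = 1  # skip the executable name
    while index < len(tokens):
        token = tokens[index]
        name, sep, inline_value = token.partition("=")
        if name in LONG_FLAGS and sep:
            config[LONG_FLAGS[name]] = inline_value       # --url=VALUE
        elif token in SHORT_FLAGS or token in LONG_FLAGS:
            key = SHORT_FLAGS.get(token) or LONG_FLAGS[token]
            if index + 1 < len(tokens):
                config[key] = tokens[index + 1]           # -o VALUE or --url VALUE
                index += 1                                # skip the consumed value
        index += 1
    return config


# Hypothetical command line, as it might appear in a process creation log.
logged = "xmrig.exe -o pool.example.net:3333 -u EXAMPLE_WALLET --pass=x --donate-level 1"
print(parse_miner_cmdline(logged))
```

Note that `shlex.split` follows POSIX quoting rules and treats backslashes as escape characters, so command lines containing Windows paths would need a Windows-aware splitter instead.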
Conclusion
Based on its unique monitoring technology, VMRay Analyzer extracts malware configurations that provide data that is actionable, reliable, and easy to integrate into existing systems and security automation.
Appendix: Supported Malware Families at Launch
We continuously add new extractors, maintain existing ones and release changes regularly with Signature & Detection updates to both VMRay Cloud and On-premise. With the 4.5 release, we provide configuration extraction for the following malware families:
Agent Tesla
AsyncRAT
Cobalt Strike
Emotet
Formbook / XLoader
Hancitor
HawkEye
Lokibot
NanoCore
njRAT
PredatorPain
Qbot
Raccoon
Redline
Remcos
Smoke Loader
Snake Keylogger
Warzone
XMRig