A Deep Dive into Automated, Customizable Threat Scoring
In this second blog post about what’s new in V 1.10, we drill down into our VMRay Threat Identifier (VTI) engine and its threat scoring. The engine automatically identifies and flags malicious behavior using VTI rules and generates an overall severity score for that behavior. The VTI engine has been part of our analysis system from the start, but with this release we now support custom scoring rules.
We’ll show this first by analyzing a CryptoLocker ransomware variant and taking a close look at the resulting VTI severity score. It’s worth noting (again!) the risks of relying on a single AV vendor for protection. When this sample was run against the AV engines in VirusTotal and OPSWAT Metadefender, the detection rate was quite low: 4/57 for VirusTotal and 12/42 for Metadefender. Typically, AV detection rates rise over time, but in that crucial golden window when new malware variants are propagating, there’s a good chance your AV may still be napping.
Next, to demonstrate custom rules that flag specific behavior we’ll build a rule and run it against pseudo-malcode we wrote as a proof of concept. The malcode attempts to detect and shut down another process we wrote that mimics what might run in a SCADA industrial control environment at a utility.
VTI rules are written in Python and are used to define behavior that should be detected by the VTI engine. The detection mechanism can flag both standalone behavior and dependent sequential behavior. Before we show you an example of a rule, we’ll first go through some key aspects of our VTI engine.
You can see a video here of the default VTI ruleset in action, scoring malicious behavior in a CryptoLocker analysis:
The VTI Engine – Under the Hood
Traditional behavior-based malware detection systems usually depend on creating rules around API calls. However, the VTI engine makes use of a generic object and function call system.
This generic system parses a sequential log of analysis data provided by the VMRay Analyzer and maps known API calls and COM methods to one or more generic function calls (gfncalls). For example, all APIs which can be used to create a process (CreateProcess, CreateProcessInternal, NtCreateProcess, …) are mapped to one generic process creation function call (gfn_proc_create). A rule author therefore doesn’t need to know every API that can be used for process creation.
A single API call can also be mapped to multiple gfncalls, for example, URLDownloadToFile. This API call downloads data from the Internet and saves it to a file. Thus, the API is mapped to gfn_url_download and gfn_file_create.
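To make the mapping idea concrete, here is a minimal sketch in Python. It is not the engine’s actual implementation: the table name, the function name, and the decision to skip unmapped APIs are assumptions made for illustration. Only the API and gfncall names are taken from the text above.

```python
# Hypothetical mapping table. Only the API and gfncall names come from the
# post; the structure itself is an assumption for illustration.
API_TO_GFNCALLS = {
    "CreateProcess": ["gfn_proc_create"],
    "CreateProcessInternal": ["gfn_proc_create"],
    "NtCreateProcess": ["gfn_proc_create"],
    # One API call can map to several generic function calls.
    "URLDownloadToFile": ["gfn_url_download", "gfn_file_create"],
}


def map_log_to_gfncalls(api_log):
    """Translate a sequential list of API names into generic function calls.

    In this sketch, APIs without a mapping are simply skipped (an assumption).
    """
    gfncalls = []
    for api_name in api_log:
        gfncalls.extend(API_TO_GFNCALLS.get(api_name, []))
    return gfncalls


log = ["NtCreateProcess", "URLDownloadToFile", "Sleep"]
print(map_log_to_gfncalls(log))
# prints ['gfn_proc_create', 'gfn_url_download', 'gfn_file_create']
```

A rule written against gfn_proc_create would match the first entry no matter which of the three process creation APIs produced it.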
In addition, a generic system object (gobject) represents an element of the OS like a file, driver, registry key, thread, process, or the like. It holds important values like the file name, driver name, registry key path, or process name and is linked to an operation mode. For example, you can create or open a file (ACCESS), read from a file (READ), write to a file (WRITE), or delete a file (DELETE).
This generic system greatly facilitates the creation of VTI rules because you can use one gfncall or gobject in your rule to detect a specific behavior instead of multiple API calls. That means if you create a VTI rule where you want to check if a certain file is accessed, you can simply check for a gob_file with the ACCESS operation.
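The following Python sketch shows what checking for a gob_file with the ACCESS operation could look like. The class names, field names, and the case-insensitive file name comparison are assumptions for illustration; only gob_file and the four operation modes come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class OperationMode(Enum):
    """The operation modes named in the post."""
    ACCESS = "ACCESS"
    READ = "READ"
    WRITE = "WRITE"
    DELETE = "DELETE"


@dataclass(frozen=True)
class GObject:
    """Hypothetical stand-in for a generic system object."""
    kind: str                 # "gob_file" is from the post
    name: str                 # file name, registry key path, process name, ...
    operation: OperationMode


def file_was_accessed(gobjects, file_name):
    """Return True if any gob_file with ACCESS mode matches file_name.

    The comparison ignores case, which is an assumption of this sketch.
    """
    wanted = file_name.lower()
    return any(
        g.kind == "gob_file"
        and g.operation is OperationMode.ACCESS
        and g.name.lower() == wanted
        for g in gobjects
    )


observed = [
    GObject("gob_file", r"C:\Windows\System32\drivers\etc\hosts", OperationMode.ACCESS),
    GObject("gob_file", r"C:\Temp\dropper.tmp", OperationMode.WRITE),
]
print(file_was_accessed(observed, r"c:\windows\system32\drivers\etc\hosts"))  # prints True
print(file_was_accessed(observed, r"C:\Temp\dropper.tmp"))                    # prints False
```

The second check is False because the file was only written, not opened with the ACCESS mode. The rule never has to mention the API that performed the access.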
Another important aspect is that the generic system is OS-independent and will be used in future releases for other operating systems. The mapping behind this generic system is updated regularly (more APIs and COM methods mapped to gfncalls and corresponding gobjects) to ensure the VTI engine flags as much malicious activity as possible. There’s further reading on this topic in the appendix of this post.
VTI Rule
A VTI rule always consists of three hierarchical components:
- Category
- Operation
- Technique
The Category is a domain, a generic term that groups the underlying operations.
The Operation is a generic description of a behavior which can be achieved via different techniques.
The Technique is the core element of a VTI rule where the behavior to be detected by the VTI engine is described.
The concatenation of the components creates a hierarchical model to establish a clear and useful grouping for VTI rules.
In our example, the “OS” category has two operations: “Modify firewall” and “Disable crucial system tool”. Below the “Modify firewall” operation, we have two Techniques where the first one detects if a new firewall rule has been added and the second one detects if the firewall has been disabled. Below the “Disable crucial system tool” operation, we have three techniques. The first one checks if the command prompt (CMD) has been disabled, the second one checks if the registry editor has been disabled, and the last one checks if the task manager has been disabled.
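The example hierarchy can also be written down as a nested structure. The Python sketch below is only an illustration of the grouping: the variable and function names are hypothetical, and the technique labels are paraphrased from the description above rather than copied from the product.

```python
# Category -> Operation -> list of Techniques.
# Technique labels are paraphrased from the text, not product identifiers.
VTI_HIERARCHY = {
    "OS": {
        "Modify firewall": [
            "Add firewall rule",
            "Disable firewall",
        ],
        "Disable crucial system tool": [
            "Disable command prompt",
            "Disable registry editor",
            "Disable task manager",
        ],
    },
}


def rule_paths(hierarchy):
    """Yield one 'Category / Operation / Technique' string per technique."""
    for category, operations in hierarchy.items():
        for operation, techniques in operations.items():
            for technique in techniques:
                yield f"{category} / {operation} / {technique}"


for path in rule_paths(VTI_HIERARCHY):
    print(path)
```

Each printed path corresponds to one VTI rule, so this example describes five rules under a single category.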
As you can see, all three components (Category, Operation, and Technique) have a clear and useful grouping. Here’s an example of how to implement the “disable taskmgr” VTI rule:
The implementation consists of the three components: Category (lines 3 to 6), Operation (lines 8 to 9), and Technique (lines 11 to 29). The relationship between these components is inherited and follows the described hierarchy model. In the Technique section you can see one possibility to disable the Windows Task Manager via registry.
We provide several helper functions for easy test procedure creation (line 22) for all common actions like file, registry, window, memory, service, and many more. In the transition table (line 27), you can also define multiple entries to chain behavior. Internally, these chains are simple but powerful finite state machines. With this feature, it is possible to detect dependent sequential behavior: for example, a downloader which first downloads a PE file and executes it afterwards.
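As a rough illustration of how a transition table can chain behavior, here is a small finite state machine in Python for the downloader example. This is a sketch under stated assumptions, not the VTI rule syntax: the state names, the table layout, and the function name are hypothetical, and the sketch does not verify that the executed file is the downloaded PE file, which a real rule would need to do.

```python
# Transition table: (current state, observed gfncall) -> next state.
# State names are hypothetical; the gfncall names appear in the post.
TRANSITIONS = {
    ("start", "gfn_url_download"): "downloaded",
    ("downloaded", "gfn_proc_create"): "executed",
}
ACCEPTING_STATE = "executed"


def detects_downloader(gfncalls):
    """Return True if a download is later followed by a process creation."""
    state = "start"
    for call in gfncalls:
        # Calls without a matching transition leave the state unchanged.
        state = TRANSITIONS.get((state, call), state)
        if state == ACCEPTING_STATE:
            return True
    return False


print(detects_downloader(["gfn_url_download", "gfn_file_create", "gfn_proc_create"]))  # prints True
print(detects_downloader(["gfn_proc_create", "gfn_url_download"]))                     # prints False
```

The second sequence is not flagged because the process creation happens before the download, which is the point of detecting dependent sequential behavior rather than two standalone events.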
Once you create your VTI rule, you can start an analysis and check the VTI results tab for behavior detection results.
As mentioned earlier, we created some test malcode that looks to deactivate a process we wrote that mimics what you might see running in a SCADA environment at a utility – we called it ‘TurbineWatchdog’.
Let’s take a look at a video of the custom VTI rule creation and result:
VTI Rule Administration
You can manage each VTI rule (built-in and custom) in the VMRay Analyzer admin console. The properties of all the rules are defined in a VTI configuration. For flexibility, you can create an arbitrary number of configurations which can be selected dynamically for each analysis.
The VTI engine supports the following sample classes:
- Documents score: The score for all samples related to document types like PDF, all Microsoft Office Word documents, Excel documents, PowerPoint documents, and RTF.
- Scripts score: The score for JScript and VBScript samples.
- Browsers score: The score for all browser related samples like URLs.
- Default score: The score for all samples that do not belong to the documents, scripts, or browsers sample classes, such as PE samples.
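One way to picture the four sample classes is as a lookup that picks which score applies to a given sample. The Python sketch below assumes that each rule carries one score per sample class. The sample type labels, the function names, and the score values are all invented for illustration and do not reflect actual defaults.

```python
# Hypothetical sample type labels mapped to the sample classes listed above.
SAMPLE_CLASS_BY_TYPE = {
    "pdf": "documents",
    "word": "documents",
    "excel": "documents",
    "powerpoint": "documents",
    "rtf": "documents",
    "jscript": "scripts",
    "vbscript": "scripts",
    "url": "browsers",
}


def sample_class(sample_type):
    """Return the sample class, falling back to 'default' (e.g. PE samples)."""
    return SAMPLE_CLASS_BY_TYPE.get(sample_type.lower(), "default")


def score_for(rule_scores, sample_type):
    """Pick the score a rule contributes for this sample type."""
    return rule_scores[sample_class(sample_type)]


# Invented score values, for illustration only.
example_rule_scores = {"documents": 5, "scripts": 4, "browsers": 3, "default": 1}

print(score_for(example_rule_scores, "PDF"))  # prints 5
print(score_for(example_rule_scores, "pe"))   # prints 1
```

The idea being illustrated is that the same behavior can be weighted differently depending on the kind of sample that exhibits it.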
You’ll find the VTI Engine documentation, including many VTI rule examples, in your customer portal. Got questions? Feedback? Let us know!
Follow us on Twitter @VMRay to get updates on future blog posts like this.
Additional Links
https://www.vmray.com/analyzing-environment-sensitive-malware/
https://www.virustotal.com/en/file/c1e3bd722646570eef289dc5d15c0a4aae0d1e1e71dce9ff9bbd985a9296a187/analysis/1464732372/
https://www.metadefender.com/?_escaped_fragment_=/results/file/504cb165d92744f6983056bab7591569#!/results/file/504cb165d92744f6983056bab7591569
http://www.bitdefender.com/support/cryptolocker-ransomware-%E2%80%93-details-and-prevention-1204.html
https://en.wikipedia.org/wiki/Finite-state_machine
Appendix – API calls and parameter normalization
Generic function calls and generic objects are only part of the picture: different APIs also have different prototypes, i.e. different parameters and parameter formats. When an API call is translated to a gfncall, all parameters are normalized and generalized. When writing rules, you do not have to care about the specifics of a particular API abstraction layer; you only write ONE rule for ONE resulting gfncall.
For example, consider process creation.
Besides the application to start, you can define several other properties of the new process: how the process should be started (immediately, suspended, …), how the application window should be displayed (shown, hidden, …), and so on. The mapping stores all these properties in the generic function call gfn_proc_create, which has the following layout:
gfn_proc_create:
- process_obj
- thread_obj
- startup_flags
- desired_access
- show_window
- and many more …
Now let’s have a closer look at the “how the window of the application should be displayed” property. If you use the CreateProcess API, the value of this property is stored in the StartupInfo structure and will be mapped to the show_window member of the gfncall gfn_proc_create.
BOOL WINAPI CreateProcess(
    _In_opt_    LPCTSTR               lpApplicationName,
    _Inout_opt_ LPTSTR                lpCommandLine,
    _In_opt_    LPSECURITY_ATTRIBUTES lpProcessAttributes,
    _In_opt_    LPSECURITY_ATTRIBUTES lpThreadAttributes,
    _In_        BOOL                  bInheritHandles,
    _In_        DWORD                 dwCreationFlags,
    _In_opt_    LPVOID                lpEnvironment,
    _In_opt_    LPCTSTR               lpCurrentDirectory,
    _In_        LPSTARTUPINFO         lpStartupInfo,
    _Out_       LPPROCESS_INFORMATION lpProcessInformation
);
This StartupInfo structure has the wShowWindow member.
typedef struct _STARTUPINFO {
    DWORD  cb;
    LPTSTR lpReserved;
    LPTSTR lpDesktop;
    LPTSTR lpTitle;
    DWORD  dwX;
    DWORD  dwY;
    DWORD  dwXSize;
    DWORD  dwYSize;
    DWORD  dwXCountChars;
    DWORD  dwYCountChars;
    DWORD  dwFillAttribute;
    DWORD  dwFlags;
    WORD   wShowWindow;
    WORD   cbReserved2;
    LPBYTE lpReserved2;
    HANDLE hStdInput;
    HANDLE hStdOutput;
    HANDLE hStdError;
} STARTUPINFO, *LPSTARTUPINFO;
This member holds the value that determines how the window of the new process is displayed, for example the constant SW_HIDE (0) to hide the window or SW_SHOW (5) to show it.
Instead of the CreateProcess API, you can also use, for example, the ShellExecute API to create a new process.
HINSTANCE ShellExecute(
    _In_opt_ HWND    hwnd,
    _In_opt_ LPCTSTR lpOperation,
    _In_     LPCTSTR lpFile,
    _In_opt_ LPCTSTR lpParameters,
    _In_opt_ LPCTSTR lpDirectory,
    _In_     INT     nShowCmd
);
This API stores the exact same information in the 6th parameter (nShowCmd).
Each API that creates a process uses a different data structure, but our generic mapping handles these differences and maps the necessary properties to one generic data structure.
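The normalization of the window display property can be sketched in a few lines of Python. This is an illustration under assumptions, not the engine’s code: the function names and the dictionary representation are hypothetical, and the sketch assumes ShellExecute is mapped to gfn_proc_create like the other process creation APIs. It also ignores the fact that Windows only honors wShowWindow when dwFlags includes STARTF_USESHOWWINDOW.

```python
SW_HIDE = 0  # Windows constant: hide the window
SW_SHOW = 5  # Windows constant: show the window


def normalize_create_process(startup_info):
    """Map CreateProcess data to a generic call (hypothetical representation).

    startup_info is a dict holding the STARTUPINFO members of interest.
    """
    return {"gfncall": "gfn_proc_create", "show_window": startup_info["wShowWindow"]}


def normalize_shell_execute(args):
    """Map ShellExecute arguments to the same generic call.

    args is the positional argument list; nShowCmd is the 6th parameter.
    """
    return {"gfncall": "gfn_proc_create", "show_window": args[5]}


def hides_window(gfncall):
    """A single check that works regardless of the originating API."""
    return gfncall["gfncall"] == "gfn_proc_create" and gfncall["show_window"] == SW_HIDE


via_create_process = normalize_create_process({"wShowWindow": SW_HIDE})
via_shell_execute = normalize_shell_execute([None, "open", "calc.exe", None, None, SW_HIDE])

print(hides_window(via_create_process))  # prints True
print(hides_window(via_shell_execute))   # prints True
```

Both calls end up in the same show_window member, so one rule covers both APIs.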