COM Introduction
The Component Object Model (COM) is a rather old technology, introduced by Microsoft in the early 90s. It allows binary software components to be developed and used in a language- and architecture-independent way. To this end, COM classes are provided by COM servers and instantiated as COM objects by COM clients. The serving process may be the same as the instantiating one, or the two may be different processes, even on different computers. Each COM class is uniquely identified by a 128-bit CLSID. Beyond that, the client needs no knowledge of the implementation details, and all communication is handled via well-defined interfaces.
Despite its age, COM still forms an indispensable part of many newer technologies, e.g., .NET, WinRT, and DirectX. Over the years, Microsoft has progressively introduced new COM interfaces for controlling various Windows components, such as audio devices, the firewall, or Internet Explorer. These interfaces coexist with the standard Windows API and are almost as powerful.
For malware, COM offers a valuable way to perform security-related system operations stealthily, because it is hard to monitor and analyze. Behavior-based analyzers (“sandboxes”) in particular struggle with the related COM APIs, and most are incapable of monitoring them at all. This stems from the tremendous inherent complexity of COM:
- COM methods can be executed in various places: in-process, out-of-process, via RPC, and even on remote systems (DCOM)
- COM classes have no easily available export information
- COM server functions may be routed via proxy functions in custom proxy DLLs
- the implementation of COM functionality varies between different Windows versions and architectures
Most malware analyzers monitor the interaction between malware and its environment exclusively at the WinAPI or NTAPI level. However, this abstraction level does not allow for appropriate monitoring of COM communication. This is mainly caused by the huge amount of noise on that layer, which is hard to distinguish from the actual malware behavior. If out-of-process COM objects are used, things are even worse: in order to track the different method invocations, the analyzer additionally needs to monitor the server process and effectively filter out all unrelated calls within it. This is almost impossible to manage in a generic way.
Therefore, to avoid being evaded by malware and to analyze the related COM activity, it is necessary to monitor directly at the COM interface level. This is a difficult and laborious task. The interface of each utilized COM object needs to be known in advance, because it describes the provided methods, their prototypes, and the related structures. During analysis, this information is needed to identify the method actually called and to retrieve all actual parameter values. To obtain this interface description, different sources have to be searched and parsed. Sometimes it is delivered in the form of an architecture-dependent typelib resource (either embedded in the executable or stored as an additional file). In other cases it can be gathered from an architecture-independent Interface Definition Language (IDL) file.
Malware Using COM
The complexity of monitoring COM interaction provides an ideal opportunity for malware to conceal its malicious behavior. By utilizing COM instead of calling common Windows API functions, malware blinds most sandboxes. In fact, none of the dynamic analyzers that we have tested so far dealt correctly with COM malware: either they generated incomplete and noisy analysis results or they failed completely.
Today, a lot of malware in the wild already utilizes COM, mostly for modifying registry keys or changing firewall settings. Nevertheless, there are also more complex use cases, such as instrumenting Internet Explorer through its COM interface. This approach has several advantages for malware:
- No need to care about network or security settings, e.g., proxy or firewall rules. Usually everything is already set up for Internet Explorer.
- IE can be fully automated and instrumented, e.g., to navigate to a certain URL, download a file, or interact with the form fields of an HTML document.
- Everything can be easily hidden from the user. A newly created IE window is invisible by default, and if the browser has already been loaded into memory, one additional instance is hardly suspicious.
VMRay’s Comprehensive Function Log
We now take a brief look at the capabilities of VMRay Analyzer for handling such complex COM malware. In general, VMRay provides analysis data at many different abstraction levels and in many formats. The output ranges from machine-parseable IOC lists in XML or JSON format to aggregated high-level behavior descriptions in HTML. To illustrate the behavior of COM malware, we use its comprehensive function log, which records every single function call at the highest semantic level possible:
- COM methods are monitored on the COM interface layer,
- Win32API calls are monitored on the WinAPI layer,
- Native API calls are monitored on the NTAPI layer,
- direct system calls are monitored on the sysenter/syscall layer,
- and so on …
Only the 3rd generation monitoring approach can preserve the high-level semantics of malicious behavior, which are normally lost completely when monitoring only at the API level. Furthermore, suppressing the low-level sub-calls and recursive function calls yields a high information density in the resulting report, while the noise level is nearly zero. Finally, because malware execution is interrupted far fewer times than with traditional emulation- or hooking-based approaches, the analysis process is tremendously faster.
The IcoScript Malware
In the following, we present excerpts from VMRay’s analysis of the IcoScript malware, which uses COM to establish an invisible connection to the public Yahoo Mail web site in order to retrieve command-and-control commands from it. A static analysis of this remote administration tool was previously published by Paul Rascagnères (G Data): https://www.virusbtn.com/virusbulletin/archive/2014/08/vb201408-IcoScript.
Initialization
First, the COM system is initialized by calling CoInitialize. After that, an Internet Explorer instance (CLSID 0002df01-0000-0000-c000-000000000046) is created with CoCreateInstance. Finally, the malware ensures that the resulting instance is not visible by setting its Visible property to false.
[0018.632] CoInitialize (pvReserved=0x0) returned 0x0
[0028.643] CoCreateInstance (in: rclsid=0x4075f8*(Data1=0x2df01, Data2=0x0, Data3=0x0, Data4=([0]==0xc0, [1]==0x0, [2]==0x0, [3]==0x0, [4]==0x0, [5]==0x0, [6]==0x0, [7]==0x46)), pUnkOuter=0x0, dwClsContext=0x17, riid=0x407608*(Data1=0xd30c1661, Data2=0xcdaf, Data3=0x11d0, Data4=([0]==0x8a, [1]==0x3e, [2]==0x0, [3]==0xc0, [4]==0x4f, [5]==0xc9, [6]==0xe2, [7]==0x6e)), ppv=0x18f20c | out: ppv=0x18f20c*=0x1effb0) returned 0x0
[0030.036] IUnknown:AddRef (This=0x1effb0) returned 0x2
[0030.036] InternetExplorer:IWebBrowserApp:put_Visible (This=0x1effb0, Visible=0) returned 0x0
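The function log prints each GUID as its four structure fields (Data1 to Data4) rather than in the usual hyphenated form. The following Python sketch shows one way to rebuild the textual GUID from such a fragment, so that it can be compared against known CLSIDs and IIDs. The helper name guid_from_log is hypothetical and not part of any VMRay tooling; the sample fragments are copied from the CoCreateInstance line above.

```python
import re
import uuid


def guid_from_log(fragment: str) -> uuid.UUID:
    """Rebuild a GUID from a 'Data1=..., Data2=..., Data3=..., Data4=(...)' fragment.

    Hypothetical helper; it assumes the field layout seen in the log excerpt.
    """
    def field(name: str) -> int:
        match = re.search(rf"{name}=0x([0-9a-fA-F]+)", fragment)
        if match is None:
            raise ValueError(f"{name} not found in fragment")
        return int(match.group(1), 16)

    # Data4 is printed as eight single bytes: [0]==0xc0, [1]==0x0, ...
    data4 = bytes(int(b, 16) for b in re.findall(r"\[\d\]==0x([0-9a-fA-F]+)", fragment))
    if len(data4) != 8:
        raise ValueError("expected exactly 8 Data4 bytes")

    # uuid.UUID takes the two leading Data4 bytes separately and the
    # remaining six bytes as one 48-bit big-endian integer.
    return uuid.UUID(fields=(
        field("Data1"), field("Data2"), field("Data3"),
        data4[0], data4[1], int.from_bytes(data4[2:], "big"),
    ))


RCLSID = ("Data1=0x2df01, Data2=0x0, Data3=0x0, Data4=([0]==0xc0, [1]==0x0, "
          "[2]==0x0, [3]==0x0, [4]==0x0, [5]==0x0, [6]==0x0, [7]==0x46)")
RIID = ("Data1=0xd30c1661, Data2=0xcdaf, Data3=0x11d0, Data4=([0]==0x8a, "
        "[1]==0x3e, [2]==0x0, [3]==0xc0, [4]==0x4f, [5]==0xc9, [6]==0xe2, [7]==0x6e)")

if __name__ == "__main__":
    print(guid_from_log(RCLSID))
    print(guid_from_log(RIID))
```

Note that the log omits leading zeros (Data1=0x2df01), so the fields must be zero-padded when the GUID is formatted; uuid.UUID takes care of that.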
Navigation
After that, the IE COM interface is used to load a custom URL. The caller only needs to initialize some variables (like the URL) and call the Navigate method. In our example, the malware first opens the about:blank page and then navigates to the French Yahoo Mail website, http://mail.yahoo.fr. It calls get_Busy to wait until IE has completed the download operation.
[0030.040] InternetExplorer:IWebBrowser:Navigate (This=0x1effb0, URL="about:blank", Flags=0x18f1ac*(varType=0x3, wReserved1=0x18, wReserved2=0xf29c, wReserved3=0x18, varVal1=0x2, varVal2=0xffffffff), TargetFrameName=0x18f1bc*(varType=0x8, wReserved1=0x18, wReserved2=0xf1f4, wReserved3=0x18, bstrVal="_top"), PostData=0x18f1cc*(varType=0x8, wReserved1=0x74f3, wReserved2=0x58, wReserved3=0x1d3, bstrVal=0x0), Headers=0x18f1cc*(varType=0x8, wReserved1=0x74f3, wReserved2=0x58, wReserved3=0x1d3, bstrVal=0x0)) returned 0x0
[0038.280] InternetExplorer:IWebBrowserApp:put_Visible (This=0x1effb0, Visible=0) returned 0x0
[0046.283] InternetExplorer:IWebBrowserApp:put_Visible (This=0x1effb0, Visible=0) returned 0x0
[0046.284] InternetExplorer:IWebBrowser:get_Busy (in: This=0x1effb0, pBool=0x18f22c | out: pBool=0x18f22c*=0x0) returned 0x0
[0046.291] InternetExplorer:IWebBrowser:Navigate (This=0x1effb0, URL="http://mail.yahoo.fr", Flags=0x18ed58*(varType=0x3, wReserved1=0x18, wReserved2=0xedbc, wReserved3=0x18, varVal1=0x2, varVal2=0xffffffff), TargetFrameName=0x18ed68*(varType=0x8, wReserved1=0x18, wReserved2=0xeda0, wReserved3=0x18, bstrVal="_top"), PostData=0x18ed78*(varType=0x8, wReserved1=0x74f3, wReserved2=0x378, wReserved3=0x1d3, bstrVal=0x0), Headers=0x18ed78*(varType=0x8, wReserved1=0x74f3, wReserved2=0x378, wReserved3=0x1d3, bstrVal=0x0)) returned 0x0
[0054.379] InternetExplorer:IWebBrowserApp:put_Visible (This=0x1effb0, Visible=0) returned 0x0
[0062.402] InternetExplorer:IWebBrowserApp:put_Visible (This=0x1effb0, Visible=0) returned 0x0
[0062.414] InternetExplorer:IWebBrowser:get_Busy (in: This=0x1effb0, pBool=0x18f22c | out: pBool=0x18f22c*=0x0) returned 0x0
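The Navigate arguments in the log are VARIANT structures, so reading them requires knowing the variant type codes and, for the Flags argument, the BrowserNavConstants bit values. The sketch below decodes the values that occur in the excerpt. It assumes that varVal1 holds the integer payload of a VT_I4 variant, which is an interpretation of the log format rather than something the log states; the table and function names are hypothetical.

```python
# Variant type codes that appear in the excerpt (subset of the VARENUM values).
VARTYPE_NAMES = {
    0x0: "VT_EMPTY",
    0x3: "VT_I4",
    0x8: "VT_BSTR",
}

# Bit values of the BrowserNavConstants enumeration (subset).
NAV_FLAGS = {
    0x1: "navOpenInNewWindow",
    0x2: "navNoHistory",
    0x4: "navNoReadFromCache",
    0x8: "navNoWriteToCache",
}


def vartype_name(var_type: int) -> str:
    """Return the symbolic name of a variant type code, if it is in the table."""
    return VARTYPE_NAMES.get(var_type, f"unknown (0x{var_type:x})")


def decode_nav_flags(value: int) -> list:
    """Return the names of all navigation flags set in value."""
    return [name for bit, name in NAV_FLAGS.items() if value & bit]


if __name__ == "__main__":
    # Flags=...*(varType=0x3, ..., varVal1=0x2, ...) from both Navigate calls.
    print(vartype_name(0x3), decode_nav_flags(0x2))
    # TargetFrameName=...*(varType=0x8, ..., bstrVal="_top")
    print(vartype_name(0x8))
```

Under these assumptions, both Navigate calls pass the navNoHistory flag, which would keep the visited pages out of the browser history.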
Interaction with the Website
After navigation has completed, the malware retrieves the resulting document’s IDispatch interface by calling get_Document. This interface is then queried for an HTML document interface (IHTMLDocument2, IID 332c4425-26cb-11d0-b483-00c04fd90119), whose get_all method is used to fetch all HTML elements.
[0062.419] InternetExplorer:IWebBrowser:get_Document (in: This=0x1effb0, ppDisp=0x18fea8 | out: ppDisp=0x18fea8*=0x2cab5c) returned 0x0
[0062.420] IUnknown:QueryInterface (in: This=0x2cab5c, riid=0x407618*(Data1=0x332c4425, Data2=0x26cb, Data3=0x11d0, Data4=([0]==0xb4, [1]==0x83, [2]==0x0, [3]==0xc0, [4]==0x4f, [5]==0xd9, [6]==0x1, [7]==0x19)), ppvObject=0x18fea0 | out: ppvObject=0x18fea0*=0x301e04) returned 0x0
[0062.430] IHTMLDocument2:get_all (in: This=0x301e04, p=0x18fea4 | out: p=0x18fea4*=0x301f24) returned 0x0
[0062.442] IUnknown:Release (This=0x2cab5c) returned 0x1
This method returns the IHTMLElementCollection interface, which is then used to retrieve the dispatch interface of each individual item and, in a second step, the items’ specific interfaces. To traverse all existing elements and their children, the malware queries each item first for IHTMLElementCollection (IID 3050f21f-98b5-11cf-bb82-00aa00bdce0b) and then for IHTMLElement (IID 3050f1ff-98b5-11cf-bb82-00aa00bdce0b). In the example below, the first query fails with 0x80004002 (E_NOINTERFACE) and the second one succeeds: the malware has identified the email address field and now simulates user input. First, it simulates a mouse click, which causes the HTMLFrameSiteEvents::onclick event to fire. After that, the put_value method of IHTMLInputTextElement (IID 3050f2a6-98b5-11cf-bb82-00aa00bdce0b) is used to simulate keyboard input and fill data into the corresponding HTML form element.
[0062.442] IHTMLElementCollection:item (in: This=0x301f24, name=0x18ec38*(varType=0x8, wReserved1=0x0, wReserved2=0x28f, wReserved3=0x0, bstrVal="username"), index=0x18ec48*(varType=0x0, wReserved1=0x2f, wReserved2=0xf328, wReserved3=0x2f, varVal1=0x2ff328, varVal2=0x301d58), pdisp=0x18fea8 | out: pdisp=0x18fea8*=0x30208c) returned 0x0
[0062.445] IUnknown:QueryInterface (in: This=0x30208c, riid=0x409220*(Data1=0x3050f21f, Data2=0x98b5, Data3=0x11cf, Data4=([0]==0xbb, [1]==0x82, [2]==0x0, [3]==0xaa, [4]==0x0, [5]==0xbd, [6]==0xce, [7]==0xb)), ppvObject=0x18edac | out: ppvObject=0x18edac*=0x0) returned 0x80004002
[0062.445] IUnknown:QueryInterface (in: This=0x30208c, riid=0x4091b8*(Data1=0x3050f1ff, Data2=0x98b5, Data3=0x11cf, Data4=([0]==0xbb, [1]==0x82, [2]==0x0, [3]==0xaa, [4]==0x0, [5]==0xbd, [6]==0xce, [7]==0xb)), ppvObject=0x18ec30 | out: ppvObject=0x18ec30*=0x302164) returned 0x0
[0062.450] IHTMLElement:get_parentElement (in: This=0x302164, p=0x18ec38 | out: p=0x18ec38*=0x302284) returned 0x0
[0062.463] IUnknown:Release (This=0x302284) returned 0x0
[0062.463] IUnknown:Release (This=0x302164) returned 0x1
[0062.465] IUnknown:QueryInterface (in: This=0x30208c, riid=0x407638*(Data1=0x3050f1ff, Data2=0x98b5, Data3=0x11cf, Data4=([0]==0xbb, [1]==0x82, [2]==0x0, [3]==0xaa, [4]==0x0, [5]==0xbd, [6]==0xce, [7]==0xb)), ppvObject=0x18edb4 | out: ppvObject=0x18edb4*=0x302164) returned 0x0
[0062.465] IHTMLElement:click (This=0x302164) returned 0x0
[0063.514] IUnknown:QueryInterface (in: This=0x30208c, riid=0x409330*(Data1=0x3050f2a6, Data2=0x98b5, Data3=0x11cf, Data4=([0]==0xbb, [1]==0x82, [2]==0x0, [3]==0xaa, [4]==0x0, [5]==0xbd, [6]==0xce, [7]==0xb)), ppvObject=0x18eda4 | out: ppvObject=0x18eda4*=0x3022cc) returned 0x0
[0063.536] IHTMLInputTextElement:put_value (This=0x3022cc, value="adffret@yahoo.com") returned 0x0
[0063.540] IUnknown:Release (This=0x302164) returned 0x2
[0063.540] IUnknown:Release (This=0x3022cc) returned 0x1
[0063.542] IUnknown:Release (This=0x30208c) returned 0x0
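Every COM call in the log ends with an HRESULT, and telling success from failure is what separates the failed QueryInterface from the successful ones above. The following sketch splits an HRESULT into its severity bit, facility, and code, and names a handful of common values. The lookup table is deliberately small and the function name is hypothetical.

```python
# A few well-known HRESULT values; this table is far from complete.
KNOWN_HRESULTS = {
    0x00000000: "S_OK",
    0x00000001: "S_FALSE",
    0x80004001: "E_NOTIMPL",
    0x80004002: "E_NOINTERFACE",
    0x80004003: "E_POINTER",
    0x80004005: "E_FAIL",
    0x80070005: "E_ACCESSDENIED",
    0x8007000E: "E_OUTOFMEMORY",
    0x80070057: "E_INVALIDARG",
}


def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its parts."""
    value &= 0xFFFFFFFF
    return {
        "name": KNOWN_HRESULTS.get(value, "unknown"),
        "failed": bool(value & 0x80000000),   # severity bit, as tested by FAILED()
        "facility": (value >> 16) & 0x1FFF,   # same mask as HRESULT_FACILITY()
        "code": value & 0xFFFF,               # same mask as HRESULT_CODE()
    }


if __name__ == "__main__":
    # Return value of the first QueryInterface (IHTMLElementCollection) above.
    print(decode_hresult(0x80004002))
    # Return value of the second QueryInterface (IHTMLElement).
    print(decode_hresult(0x0))
```

Applied to the excerpt, 0x80004002 decodes to E_NOINTERFACE: the object behind the "username" item is a single element, not a collection, so the malware falls through to the IHTMLElement query.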
To complete the login process, the same procedure is executed for the password field.
Additionally, the persistent checkbox is unchecked by calling the put_checked method of the IHTMLOptionButtonElement interface with checked=0:
[0064.710] InternetExplorer:IWebBrowser:get_Document (in: This=0x1effb0, ppDisp=0x18fea8 | out: ppDisp=0x18fea8*=0x2cab5c) returned 0x0
[0064.710] IUnknown:QueryInterface (in: This=0x2cab5c, riid=0x407618*(Data1=0x332c4425, Data2=0x26cb, Data3=0x11d0, Data4=([0]==0xb4, [1]==0x83, [2]==0x0, [3]==0xc0, [4]==0x4f, [5]==0xd9, [6]==0x1, [7]==0x19)), ppvObject=0x18fea0 | out: ppvObject=0x18fea0*=0x301e04) returned 0x0
[0064.710] IUnknown:Release (This=0x301e04) returned 0x2
[0064.710] IUnknown:Release (This=0x301f24) returned 0x0
[0064.711] IHTMLDocument2:get_all (in: This=0x301e04, p=0x18fea4 | out: p=0x18fea4*=0x301f24) returned 0x0
[0064.713] IUnknown:Release (This=0x2cab5c) returned 0x1
[0064.713] IHTMLElementCollection:item (in: This=0x301f24, name=0x18ec38*(varType=0x8, wReserved1=0x0, wReserved2=0x0, wReserved3=0x0, bstrVal="persistent"), index=0x18ec48*(varType=0x0, wReserved1=0x0, wReserved2=0x0, wReserved3=0x0, varVal1=0x0, varVal2=0x0), pdisp=0x18fea8 | out: pdisp=0x18fea8*=0x302284) returned 0x0
[0064.715] IUnknown:QueryInterface (in: This=0x302284, riid=0x409220*(Data1=0x3050f21f, Data2=0x98b5, Data3=0x11cf, Data4=([0]==0xbb, [1]==0x82, [2]==0x0, [3]==0xaa, [4]==0x0, [5]==0xbd, [6]==0xce, [7]==0xb)), ppvObject=0x18edac | out: ppvObject=0x18edac*=0x0) returned 0x80004002
[0064.715] IUnknown:QueryInterface (in: This=0x302284, riid=0x4091b8*(Data1=0x3050f1ff, Data2=0x98b5, Data3=0x11cf, Data4=([0]==0xbb, [1]==0x82, [2]==0x0, [3]==0xaa, [4]==0x0, [5]==0xbd, [6]==0xce, [7]==0xb)), ppvObject=0x18ec30 | out: ppvObject=0x18ec30*=0x30235c) returned 0x0
[0064.718] IHTMLElement:get_parentElement (in: This=0x30235c, p=0x18ec38 | out: p=0x18ec38*=0x3023a4) returned 0x0
[0064.724] IUnknown:Release (This=0x3023a4) returned 0x0
[0064.726] IUnknown:Release (This=0x30235c) returned 0x1
[0064.727] IUnknown:QueryInterface (in: This=0x302284, riid=0x409310*(Data1=0x3050f2bc, Data2=0x98b5, Data3=0x11cf, Data4=([0]==0xbb, [1]==0x82, [2]==0x0, [3]==0xaa, [4]==0x0, [5]==0xbd, [6]==0xce, [7]==0xb)), ppvObject=0x18ed9c | out: ppvObject=0x18ed9c*=0x30211c) returned 0x0
[0064.733] IHTMLOptionButtonElement:put_checked (This=0x30211c, checked=0) returned 0x0
[0064.740] IUnknown:Release (This=0x30211c) returned 0x1
[0064.742] IUnknown:Release (This=0x302284) returned 0x0
After unchecking the persistent login option, the sample repeatedly tries to retrieve the .save element. Every attempt comes back empty: the item call itself returns 0x0, but the resulting pdisp pointer is null:
[0064.742] InternetExplorer:IWebBrowser:get_Document (in: This=0x1effb0, ppDisp=0x18fea8 | out: ppDisp=0x18fea8*=0x2cab5c) returned 0x0
[0064.742] IUnknown:QueryInterface (in: This=0x2cab5c, riid=0x407618*(Data1=0x332c4425, Data2=0x26cb, Data3=0x11d0, Data4=([0]==0xb4, [1]==0x83, [2]==0x0, [3]==0xc0, [4]==0x4f, [5]==0xd9, [6]==0x1, [7]==0x19)), ppvObject=0x18fea0 | out: ppvObject=0x18fea0*=0x301e04) returned 0x0
[0064.743] IUnknown:Release (This=0x301e04) returned 0x2
[0064.743] IUnknown:Release (This=0x301f24) returned 0x0
[0064.743] IHTMLDocument2:get_all (in: This=0x301e04, p=0x18fea4 | out: p=0x18fea4*=0x301f24) returned 0x0
[0064.745] IUnknown:Release (This=0x2cab5c) returned 0x1
[0064.745] IHTMLElementCollection:item (in: This=0x301f24, name=0x18ec38*(varType=0x8, wReserved1=0x0, wReserved2=0x0, wReserved3=0x0, bstrVal=".save"), index=0x18ec48*(varType=0x0, wReserved1=0x0, wReserved2=0x0, wReserved3=0x0, varVal1=0x0, varVal2=0x0), pdisp=0x18fea8 | out: pdisp=0x18fea8*=0x0) returned 0x0
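The failed lookup is easy to miss because the call reports success and only the out-parameter reveals that nothing was found. The sketch below parses a function-log line with regular expressions and flags calls that return 0x0 while leaving a given out-pointer null. The line format is inferred from the excerpts in this article and is not a documented log specification; parse_call and returned_null_object are hypothetical names.

```python
import re

# "[time] Interface:method (arguments) returned 0x..."
LINE_RE = re.compile(
    r"^\[(?P<time>\d+\.\d+)\] (?P<func>\S+) \((?P<args>.*)\) "
    r"returned (?P<ret>0x[0-9a-fA-F]+)$"
)
# Out-parameters are printed as "name=0xADDRESS*=0xVALUE".
OUT_RE = re.compile(r"(?P<name>\w+)=0x[0-9a-fA-F]+\*=(?P<value>0x[0-9a-fA-F]+)")


def parse_call(line: str):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    args = match.group("args")
    # Only the part after " | out: " describes values written by the callee.
    out_part = args.split(" | out: ", 1)[1] if " | out: " in args else ""
    outs = {m.group("name"): int(m.group("value"), 16)
            for m in OUT_RE.finditer(out_part)}
    return {
        "time": float(match.group("time")),
        "func": match.group("func"),
        "hresult": int(match.group("ret"), 16),
        "outs": outs,
    }


def returned_null_object(call: dict, out_name: str) -> bool:
    """True if the call reported success but left the named out-pointer null."""
    return call["hresult"] == 0 and call["outs"].get(out_name) == 0


# The ".save" lookup, copied verbatim from the excerpt above.
SAVE_LINE = (
    r'[0064.745] IHTMLElementCollection:item (in: This=0x301f24, '
    r'name=0x18ec38*(varType=0x8, wReserved1=0x0, wReserved2=0x0, wReserved3=0x0, bstrVal=".save"), '
    r'index=0x18ec48*(varType=0x0, wReserved1=0x0, wReserved2=0x0, wReserved3=0x0, varVal1=0x0, varVal2=0x0), '
    r'pdisp=0x18fea8 | out: pdisp=0x18fea8*=0x0) returned 0x0'
)

if __name__ == "__main__":
    call = parse_call(SAVE_LINE)
    print(call["func"], returned_null_object(call, "pdisp"))
```

The out-parameter name is passed explicitly because a zero value is not always an error: for get_Busy, for example, pBool=0x0 simply means that the browser is idle.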
The layout of the web page has changed in the meantime; hence, the malware is unable to continue at this point. As we know from older reports, the intended behavior is to log in and check the existing emails for further commands from its C&C master.
Conclusion and Summary
This short analysis demonstrates how easily VMRay Analyzer can be used to monitor even complex COM malware. Nowadays we see more and more threats that utilize COM to conceal their real behavior and evade dynamic analysis. Since VMRay’s 3rd generation technology always intercepts the interaction between the malware and its environment at the highest level possible, monitoring COM activity is no different from monitoring simple Win32 APIs. This approach has unrivalled advantages over classical 1st and 2nd generation systems:
- the resulting analysis covers all function calls, e.g., COM, Win32 API, NTAPI, or direct syscalls
- the high-level semantics of malware behavior are always preserved and not broken down into fine-grained, but unrelated, API calls
- the information density of the resulting analysis reports is very high and the noise level is close to zero
- the analysis process is very fast and scalable, because malware execution is interrupted only a few times
Update: a few weeks ago Hexacorn already posted an article about malware utilizing COM for analyzer evasion.