Malware authors continuously evolve their tactics to evade detection by security tools, and sandbox evasion techniques are a critical component of this cat-and-mouse game. In this article, we’ll examine the three primary categories of sandbox evasion techniques employed by modern malware and the methods threat actors use to bypass these analysis environments. By understanding these defense evasion tactics, security professionals can stay one step ahead and fortify their defenses against advanced persistent threats.
The use of malware analysis sandboxes as the silver bullet against advanced, persistent threats became popular over a decade ago. Back then, malware authors had already found ways to evade tools based on static analysis (such as traditional antivirus software products) using techniques such as polymorphism, metamorphism, encryption, obfuscation and anti-reversing protection. As a result, advanced malware analysis sandboxes are now considered the last line of defense against advanced threats.
The operating principle of a sandbox is simple – determine if a file is malicious or not based on its observed behavior in a controlled environment. The sandbox allows the malware to perform all of its malicious operations and records the resulting behavior. After some time, the analysis is stopped and the result is examined and scanned for typical malicious behavior patterns. Since detection is not based on signatures, sandboxes can even detect zero-day and targeted malware (which typically has never been seen before by security researchers or analyzed in an antivirus lab).
Obviously, behavior-based malware detection only works if the observed file actually performs malicious operations during its analysis. If – for whatever reason – no harmful operations are executed during the analysis, the sandbox concludes that the file under examination is benign. Malware authors are always looking for new, innovative ways to evade sandbox detection by concealing the real behavior of malware. We’ve grouped these approaches into three categories:
Sandbox Detection: Detecting the presence of a sandbox (and only showing benign behavior patterns on detection)
Exploiting Sandbox Gaps: Exploiting weaknesses or gaps in sandbox technology or in the ecosystem
Context-Aware Malware: Using time/event/environment-based triggers (that are not activated during sandbox analysis)
Sandbox Detection
This first approach detects the presence of a sandbox by looking for small differences between a sandbox environment and a real victim’s system. If a sandbox is detected, malware usually reacts in one of two different ways: it either terminates immediately (which is in itself suspicious) or it shows non-malicious behavior and performs only benign operations. An example of this is shown in Figure 1, where the sample attempts to detect if it is running inside a virtual machine (VM) and checks whether an application sandbox (Sandboxie) is running.
We can see this in the VMRay Threat Identifier (VTI) details showing that VMRay identified the sandbox detection attempts and scored this behavior as highly malicious.
Exploiting Sandbox Gaps
The second category of malware evasion techniques directly attacks and exploits weaknesses in the underlying sandbox technology or in the surrounding ecosystem. For example, we have recently seen a large volume of malware using Microsoft COM internally because most sandboxes cannot correctly analyze such samples.
Other malware will use obscure file formats that cannot be handled by the sandbox, or exploit the sandbox’s inability to process files that exceed a certain size. In Figure 2 we can see an example of malware ‘blinding the monitor’, that is, performing illegitimate API usage. This can be an effective method to hide from malware sandboxes that rely on a hook or driver injected into the target machine. However, since the VMRay Platform does not use hooking, the evasion attempt was detected.
Context-Aware Malware
The third category, context-aware malware, does not try to detect or attack the sandbox at all. Instead, it exploits the natural shortcomings of such automated systems. Because of the high volumes of unique malware seen in most environments, sandbox analysis systems usually only spend a few minutes on each file. Thus, by delaying the execution of a malicious payload by a certain amount of time, malware can remain undetected. Besides time triggers, malware can also use other events that usually do not occur in a sandbox, e.g. a system reboot or user interaction. Additionally, the malware may be looking for specific artifacts present on the intended target machine, such as an application or localization setting.
In Figure 3, we see an analysis where the malware, in addition to attempting to detect a VM environment, engaged in ‘persistence’, installing startup scripts and applications to survive reboot:
Detecting the Presence of a Sandbox
There are a number of techniques to identify the existence of a sandbox. Once detected, the malware can react in different ways. The simplest step is to immediately terminate. This can raise a red flag since this is not the behavior of a normal, benign program. Another action is to show a bogus error message. For example, the malware may display a message that a certain system module is missing or the executable file has been corrupted. More sophisticated malware may perform some benign operations to conceal the real intention. Let’s take a deeper look into the different techniques used by malware in the wild to detect if it is being executed in a sandbox:
Detecting Virtualization/Hypervisor
This is one of the oldest malware evasion techniques. However, it is less relevant today as many production environments (workstations and servers) are virtualized anyway and virtual machines (VMs) are no longer only used by researchers and malware analysts. The earliest approach detected technical artifacts that existed due to the lack of full hardware support for virtualization (paravirtualization). These techniques include:
Detecting artifacts of popular VM hypervisors, e.g. VMware (“port 0x5658”) or VirtualBox via a backdoor (“invalid opcode”).
Detecting generic hypervisor artifacts: the most famous one is redpill (“IDTR could not be virtualized”).
These techniques are not very effective today. With hardware virtualization support, there are very few visible artifacts (if any) inside the VM since most hardware aspects are now virtualized and handled by the CPU itself. Therefore, they do not have to be simulated by the hypervisor.
Another approach that is still relevant today is detecting the implementation artifacts of the hypervisor.
For example: revealing the vendor from the MAC address, device IDs, or the CPU ID, or from the existence of certain processes, files, drivers, registry keys, or strings in memory.
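As a rough illustration of the MAC address artifact, the following Python sketch maps the first three octets of a MAC address (the OUI) to a virtualization vendor. A sandbox operator could run something like it inside an analysis image to see what the image reveals. The prefix table is illustrative and far from exhaustive, and the function names are our own, not part of any product.

```python
import uuid

# OUI prefixes commonly associated with virtualization vendors.
# Illustrative only; real vendor allocations are larger and change over time.
VM_OUI_PREFIXES = {
    "00:05:69": "VMware",
    "00:0C:29": "VMware",
    "00:1C:14": "VMware",
    "00:50:56": "VMware",
    "08:00:27": "VirtualBox",
    "00:15:5D": "Hyper-V",
    "00:16:3E": "Xen",
    "00:1C:42": "Parallels",
}


def vendor_for_mac(mac: str):
    """Return the virtualization vendor implied by a MAC address, or None."""
    normalized = mac.upper().replace("-", ":")
    return VM_OUI_PREFIXES.get(normalized[:8])


def local_mac() -> str:
    """Format the value of uuid.getnode() as aa:bb:cc:dd:ee:ff.

    Note: uuid.getnode() may fall back to a random number when no
    hardware address can be determined.
    """
    node = uuid.getnode()
    return ":".join(f"{(node >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


if __name__ == "__main__":
    mac = local_mac()
    print(mac, "->", vendor_for_mac(mac) or "no known VM vendor prefix")
```

If the audit reports a vendor, the same information is available to any sample running in that image, which is one argument for randomizing or overriding the adapter address.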
We’ve published two analyses demonstrating a couple of types of virtualization detection. In the first, we see an attempt to detect if the malware is running inside VirtualPC:
Another approach is to detect the presence of a VM by looking at registry values. In this example, the malware queried the registry key “HKEY_LOCAL_MACHINE\SOFTWARE\” to look for values associated with common VM implementations like VMware:
Detecting Sandbox Artifacts
In this approach, it is not the hypervisor that the malware is trying to detect, but the sandbox itself. This can be done using either of the following two techniques:
Using vendor-specific knowledge
Common VM products. For example, the existence of certain files, processes, drivers, file system structure, Windows ID, or username
The ecosystem. For example, mechanisms to revert the analysis environment back to a clean state after infection (Deepfreeze, Reborn Cards), or traces of communication with the sandbox controller (additional listening ports and the specific network environment).
Using certain sandboxing technologies for detection
Most sandboxes use hooks (i.e. they inject or modify code and data within the analysis system). The ‘hook’ is essentially a shim layer capturing the communication between processes, drivers, and the OS. A hook can be implemented in many ways such as inline hooks, IAT, EAT, proxy DLL, or filter drivers etc. This makes them detectable by explicitly inspecting certain instructions or pointers or verifying the integrity of the system (verifying hash signatures of relevant system files).
Other malware sandboxes use emulation, which comes with side-effects and small differences compared to a native system. This includes different instruction semantics and cache-based attacks. Emulation gaps can be detected by invoking an obscure CPU instruction that was not included in the emulation. When the call fails, malware will know it is running in an emulated environment.
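To make the hook inspection described above concrete, here is a minimal Python sketch of the two checks mentioned: looking at the first instructions of a function for a redirect, and comparing loaded code against a clean copy. It works on raw bytes supplied by the caller; how those bytes are obtained is outside the sketch. The byte patterns are common x86/x64 trampoline encodings chosen for illustration, not a complete list, and the function names are hypothetical.

```python
def classify_prologue(code: bytes):
    """Return a label if the first bytes look like a common inline-hook
    trampoline, otherwise None. Patterns are x86/x64 and not exhaustive."""
    # E9 xx xx xx xx : jmp rel32
    if len(code) >= 5 and code[0] == 0xE9:
        return "jmp rel32"
    # 68 xx xx xx xx C3 : push imm32; ret
    if len(code) >= 6 and code[0] == 0x68 and code[5] == 0xC3:
        return "push imm32; ret"
    # 48 B8 <8 bytes> FF E0 : mov rax, imm64; jmp rax
    if len(code) >= 12 and code[:2] == b"\x48\xB8" and code[10:12] == b"\xFF\xE0":
        return "mov rax, imm64; jmp rax"
    return None


def patched_offsets(clean: bytes, loaded: bytes):
    """List byte offsets where the loaded copy differs from the clean copy."""
    return [i for i, (a, b) in enumerate(zip(clean, loaded)) if a != b]


if __name__ == "__main__":
    # Example bytes: "mov r10, rcx; mov eax, 0x55" versus the same stub
    # overwritten with a 5-byte jump.
    clean = bytes.fromhex("4c8bd1b855000000")
    hooked = b"\xE9\x10\x20\x30\x40" + clean[5:]
    print(classify_prologue(clean), classify_prologue(hooked))
    print(patched_offsets(clean, hooked))
```

Both checks produce false positives in practice (legitimate code can begin with a jump, and relocations change bytes), so treat the result as a hint rather than proof. The point is that a few lines of inspection are enough to notice an in-guest hook.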
An example of vendor-specific detection can be seen in Figure 3, where the malware looks for the presence of the module ‘SbieDll.dll’, an indicator that it is running under Sandboxie, a common sandboxing environment:
Detecting An Artificial Environment
Sandboxes are usually not production systems but specifically set up for malware analysis. Hence, they are not identical to real computer systems and these differences can be detected by malware. Differences may include:
Hardware properties
Unusually small screen resolution, no USB 3.0 drivers, lack of 3D rendering capabilities, only one (V)CPU, small hard disk and memory sizes
Software properties
Atypical software stack, e.g. no IM, no mail client
System properties
Uptime (“system was restarted 10 seconds ago”), network traffic (“system uptime is days, but only a few MB have been transmitted over the network”), no or only default printers installed
User properties
Clean desktop, clean filesystem, no cookies, no recent files, no user files
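The properties listed above lend themselves to a simple self-audit of an analysis image. The Python sketch below takes a dictionary of observed facts and reports which ones look artificial. All thresholds and key names are hypothetical and should be tuned against what real endpoints in your own fleet look like; only two facts are collected automatically, the rest would come from whatever inventory tooling you already have.

```python
import os
import shutil

# Hypothetical thresholds: (fact name, predicate for "looks artificial", label).
CHECKS = [
    ("cpu_count", lambda v: v < 2, "only one (V)CPU"),
    ("ram_gb", lambda v: v < 4, "small memory size"),
    ("disk_gb", lambda v: v < 80, "small hard disk"),
    ("screen_width", lambda v: v < 1024, "unusually small screen resolution"),
    ("uptime_seconds", lambda v: v < 600, "system was restarted very recently"),
    ("recent_files", lambda v: v == 0, "no recent files"),
    ("printers", lambda v: v == 0, "no printers installed"),
]


def artificial_tells(facts: dict) -> list:
    """Return a label for every supplied fact that looks artificial.

    Facts that are missing from the dictionary are simply not evaluated.
    """
    return [label for key, is_tell, label in CHECKS
            if key in facts and is_tell(facts[key])]


def basic_facts() -> dict:
    """Collect the two facts that the standard library exposes portably."""
    root = os.path.abspath(os.sep)
    return {
        "cpu_count": os.cpu_count() or 1,
        "disk_gb": shutil.disk_usage(root).total / 1e9,
    }


if __name__ == "__main__":
    print(artificial_tells(basic_facts()))
```

An empty result does not mean the environment is undetectable; it only means none of these particular tells fired.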
To demonstrate, we’ll go back to the same analysis we looked at in Figure 2. In addition to checking for VM presence, the malware is looking for the presence of Wine, a software emulator (that is, it emulates Windows functions rather than the CPU). We can see here in the VTI Score that the malware issues a GET_PROC_ADDRESS query and attempts to determine from the returned result whether it is what would be expected in a Wine environment:
Timing Based Detection
Monitoring the behavior of an application comes with a timing penalty, which can be measured by malware to detect the presence of a sandbox. Sandboxes try to prevent this by faking the time. However, malware can bypass this by incorporating external time sources such as NTP.
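The comparison against an external time source boils down to checking whether two clocks agree on how much time has passed. The Python sketch below shows that comparison in isolation. The reference measurement is passed in as a plain number (in practice it would come from something like an NTP query, which is not implemented here), and the 10% tolerance is an arbitrary assumption.

```python
import time


def clocks_disagree(local_elapsed: float, reference_elapsed: float,
                    tolerance: float = 0.10) -> bool:
    """Return True if the local clock's elapsed time differs from the
    reference clock's elapsed time by more than `tolerance` (a fraction
    of the reference value)."""
    if reference_elapsed <= 0:
        raise ValueError("reference_elapsed must be positive")
    deviation = abs(local_elapsed - reference_elapsed) / reference_elapsed
    return deviation > tolerance


def local_elapsed_for(action) -> float:
    """Run `action` and return how long it took according to the local clock."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


if __name__ == "__main__":
    # Local clock claims 10 minutes passed, the reference saw only 30 seconds:
    print(clocks_disagree(600.0, 30.0))
    # Both clocks agree to within a fraction of a second:
    print(clocks_disagree(30.2, 30.0))
```

For a sandbox operator, the takeaway is that faking only the local clock is not enough if the guest can reach any independent time source.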
In Figure 5 you can see an example of timing-based detection. The VMRay Analyzer Report shows that the sample checked for rdtsc, the time-stamp counter.
How to Defeat These Sandbox Evasion Techniques
To avoid being detected by malware in these ways, an analysis environment should follow these guidelines:
Do not rely on modifying the target environment
In particular, a common approach for sandbox analysis is hooking. The presence of a hook (the injected user-mode or kernel-level driver that monitors and intercepts API calls and other malware activity) is a telltale sign for malware. It is virtually impossible to completely hide the presence of a hook.
Either implement full system emulation perfectly or not at all
While a perfectly-implemented emulation environment will be, in theory, difficult to detect, this is a complex undertaking. Just as all software has bugs, it’s a near certainty that any given emulation environment will have flaws that can be detected.
Use a target analysis environment that is ‘real’
If the malware sandbox can run an image copied from actual production endpoints, then the risk of detection falls dramatically. Coupling that with randomization of the environment helps to ensure that there are no telltale signs for malware to identify the target environment as ‘fake’.
VMRay’s technology ensures that there is a minimal attack surface for malware to detect it is running in a sandbox. By not modifying the target environment, not relying on emulation, and allowing real-world images to run as target environments, VMRay gives nothing for malware to flag as a sandbox environment.
Evading Malware Sandboxes: Exploiting Gaps in Analysis Environments
We wrote that the use of malware sandboxes as the silver bullet against advanced, persistent threats became popular over a decade ago. Back then, malware authors had already found ways to evade tools based on static analysis (such as traditional antivirus software products) using techniques such as polymorphism, metamorphism, encryption, obfuscation and anti-reversing protection. Malware analysis sandboxes doing behavior-based detection are now considered the final layer of defense against advanced threats.
Obviously, behavior-based malware detection only works if the observed file actually performs malicious operations during its analysis. If – for whatever reason – no harmful operations are executed during the analysis, the sandbox concludes that the file under examination is benign. In the second part of the series, we did a deep dive into how malware can directly detect the presence of a sandbox environment. Let’s now look at how malware can exploit gaps in the sandbox environment, rather than explicitly detecting the presence of a sandbox.
Exploiting Sandbox Technology Weaknesses
Explicitly searching for the existence of a sandbox can be detected as a suspicious activity during analysis. A more advanced approach for malware, therefore, exploits weaknesses in the sandbox technology to perform operations without being detected. By exploiting these sandbox weaknesses, malware does not have to worry about being detected even if it is being executed in a sandboxed system. Some of the techniques include:
Blinding the Monitor
Most sandboxes do in-guest monitoring (i.e., they place code, processes, and/or hooks inside the analysis environment). If these modifications are undone or circumvented, the sandbox is blinded; in other words, visibility into the analyzed environment is lost. This blinding can take the following forms:
Hook Removal
Hooks can be removed by restoring the original instruction or data.
Hook circumvention
Hooks can be circumvented by using direct system calls instead of APIs, calling private functions (which are not hooked), or performing unaligned function calls (skipping the “hook code”). We can see an example of this in Figure 1 where illegitimate API usage is utilized by the malware. While hooks could solve this problem for these particular internal functions, there are many of these present in the operating system and they vary with each Windows version. Furthermore, the problem of unaligned function calls cannot be adequately solved by hooking.
System file replacement
Hooks usually reside in the system files that are mapped into memory. Some malware will unmap those files and reload them. The newly loaded file version is then “unhooked”.
Kernel code
Many sandboxes are not capable of monitoring kernel code or the boot process of a system.
Obscure file formats
Many sandboxes do not support all file formats. PowerShell, .hta, and .dzip are examples of file formats that may slip by because they simply fail to execute in a sandbox environment.
Many sandboxes do not support all technologies
While the initial infection vector (say, a Word document with a macro) may open and the macro run in the sandbox, the macro will download and run a payload that uses an obscure technology hidden from the analysis. COM, Ruby, ActiveX, and Java are some examples that we’ve analyzed in previous blog posts.
Operating system reboots
Many sandboxes cannot survive a reboot. Some systems try to emulate a reboot by re-logging in the user. This can be detected, however, and not all triggers of a reboot are executed.
Blinding the Ecosystem
Malware can also avoid analysis by simply overwhelming the target analysis environment, a crude but sometimes effective approach. For example:
Some sandboxes only support files up to a certain size, e.g. 10 MB
Others don’t support multiple compression layers
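One way to avoid being blinded by these limits is to make them explicit: instead of silently skipping an oversized or deeply nested submission, flag it for a different analysis path. The Python sketch below counts ZIP nesting depth and applies a size limit. The limits, the function names, and the restriction to ZIP are assumptions made for illustration; a real intake pipeline would cover more container formats.

```python
import io
import zipfile

# Hypothetical cap on how large a single archive member may be before
# we refuse to unpack it in memory.
MAX_MEMBER_BYTES = 50 * 1024 * 1024


def zip_nesting_depth(data: bytes, max_depth: int = 8) -> int:
    """Return the number of nested ZIP layers in `data` (0 = not a ZIP).

    Recursion stops after `max_depth` layers, so the result is capped.
    """
    if max_depth == 0 or not zipfile.is_zipfile(io.BytesIO(data)):
        return 0
    deepest = 0
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir() or info.file_size > MAX_MEMBER_BYTES:
                continue  # skip directories and members too large to unpack
            inner = archive.read(info)
            deepest = max(deepest, zip_nesting_depth(inner, max_depth - 1))
    return 1 + deepest


def triage(data: bytes, size_limit: int = 10 * 1024 * 1024,
           depth_limit: int = 2) -> list:
    """Return the reasons a submission needs special handling, if any."""
    reasons = []
    if len(data) > size_limit:
        reasons.append("exceeds size limit")
    if zip_nesting_depth(data) > depth_limit:
        reasons.append("too many compression layers")
    return reasons
```

A non-empty result from `triage` is a signal to escalate the sample rather than to report it as clean.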
How to Defeat These Sandbox Evasion Techniques
To ensure malware cannot evade analysis by these methods, a sandbox analysis environment should follow these guidelines:
Do not rely on modifying the target environment.
In particular, a common approach for sandbox analysis is hooking. The presence of a hook (the injected user-mode or kernel-level driver that monitors and intercepts API calls and other malware activity) gives malware the opportunity to disable analysis.
Run gold images as target analysis environments.
For efficiency and convenience, many sandboxes have a ‘one size fits all’ approach. A single type of target environment is used for all analyses. A better approach is to use the actual gold images (that is, the standard and server OS and application configurations that your enterprise uses) as the target environment. That way, you can be assured that any malware that is targeting your enterprise and could run on your desktops or servers will also run in the analysis environment.
Monitor all malware-related activity, regardless of application or format.
Some malware sandboxes, particularly those using a hooking-based approach, take shortcuts and compromises for the sake of efficiency in determining what activity is monitored. This can leave blind spots.
VMRay’s technology accommodates all these scenarios. When used in conjunction with real-world VM images as the target analysis machines, VMRay Analyzer will give full visibility into malware activity, regardless of attempts by the malware to obfuscate its intentions.
Context-Aware Evasion: Time, Event and Environment Triggers in Sandbox Analysis
This is our final part in a series on sandbox evasion techniques used by malware today. We started with a primer, and then covered the two main categories of evasion techniques: sandbox detection, and exploiting malware sandbox gaps.
In this part, we will be highlighting the context-aware evasion techniques that use time, event, and environment-based triggers (that are not activated during sandbox analysis).
Context-Aware Evasion Techniques
This category of malware evasion techniques, like exploiting sandbox technology gaps, does not try to detect a sandbox. Nor does it try to conceal malicious behavior by circumventing a sandbox or exploiting a sandbox’s weaknesses.
Instead, it delays or postpones its malicious payload until a certain trigger/event occurs. The trigger that is chosen is very unlikely to be activated inside a sandbox. Triggers can be grouped into four categories:
Time Bombs
One of the most common techniques is to delay execution for a certain amount of time, since sandboxes usually run samples for only a few minutes. As with many other evasion techniques, the use of time bombs is an ongoing cat-and-mouse game: the malware goes to sleep, the sandbox tries to detect the sleep and shorten it, the malware detects the shortened time, the sandbox tries to hide the time forwarding by also updating system timers, and so on. Time bomb techniques include:
Simple to very complex sleeps (e.g. concurrent threads that watch each other or are dependent on each other)
Executing only at a certain time or on a specific date (e.g. Monday at 12 AM or the 12th of March)
Slowing down execution significantly. An example of this is injecting millions of arbitrary system calls that have no effect except to slow down execution, especially when being executed in a monitored or emulated environment.
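The sandbox side of this cat-and-mouse game can be modeled with a virtual clock: long sleeps are cut short in real time, but every time source the guest can query is advanced by the full requested duration so that they stay consistent with each other. The Python sketch below is a toy model of that idea. The class and its methods are hypothetical and do not describe how any particular sandbox implements it.

```python
class VirtualClock:
    """Toy model of sleep shortening with consistent time forwarding."""

    def __init__(self, start_epoch: float, max_real_sleep: float = 1.0):
        self._wall = start_epoch          # what a wall-clock query would return
        self._ticks_ms = 0                # what a tick-counter query would return
        self.max_real_sleep = max_real_sleep
        self.skipped = 0.0                # total real time saved so far

    def sleep(self, seconds: float) -> float:
        """Handle a guest sleep request.

        Returns how long the sandbox would really wait. All virtual time
        sources are advanced by the full requested duration.
        """
        real_wait = min(seconds, self.max_real_sleep)
        self.skipped += seconds - real_wait
        self._wall += seconds
        self._ticks_ms += int(seconds * 1000)
        return real_wait

    def now(self) -> float:
        return self._wall

    def tick_count_ms(self) -> int:
        return self._ticks_ms


if __name__ == "__main__":
    clock = VirtualClock(start_epoch=1_700_000_000.0)
    waited = clock.sleep(600)  # the sample asks for ten minutes
    print(waited, clock.now(), clock.tick_count_ms(), clock.skipped)
```

The weakness of this model is exactly what the text describes: any time source that the sandbox forgets to forward, or that lives outside the guest, exposes the skipped time.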
For example, Figure 1 shows a Pafish test running VM detection of certain artifacts that often exist in analysis environments. Note the timestamp checks. Malware will also run checks like this and if a difference is found in the counters, shut down on the assumption it is running inside an analysis environment.
System Events
The malicious software becomes active only on shutdown, after reboot, or when someone logs on or off. Figure 2 shows an example of this, where a second-stage payload is pulled down only after a reboot. We can see in the VTI score that an executable is installed by the malware (the initial payload) that will run automatically on startup after reboot. It’s this startup process that fetches the second payload.
User Interaction
Waiting for mouse movements (but not ones that are too fast, since that suggests a sandbox) or keyboard input.
Interacting with certain applications, e.g. browser, Email, Slack, or an online banking application.
Fake installers: Malware only becomes active after a user has clicked multiple buttons and checked various checkboxes (See Figure 2).
Office documents with malicious embedded content: The malicious code only becomes active when the user scrolls down (to see it) or clicks on it.
Detect a Specific Target System
Sophisticated targeted malware only works on the intended target system. The identification is usually based on the current username, time zone, keyboard layout, IP address, or some other system artifacts. The check itself can be done in various ways, ranging from simple to very complex methods.
Simple checks include string checks.
Complex checks (e.g. decryption with hash taken from the environment settings) are nearly unbreakable if the expected target environment is not known.
The malware will only proceed to the second stage (that downloads the main payload) if it determines it is in the expected target environment.
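From the analyst’s side, the difference between a string check and a hash-based check can be illustrated as follows. With a string check, the expected value is visible in the sample. With a hash-based check, only a digest is visible, and the analyst has to guess candidate environments until one matches. The Python sketch below shows that guessing step. The digest construction (SHA-256 over three joined attributes) is an assumption chosen for illustration; real samples vary in which attributes and functions they use.

```python
import hashlib


def environment_digest(username: str, timezone: str, keyboard_layout: str) -> str:
    """Hypothetical digest over a few environment attributes."""
    material = "|".join([username.lower(), timezone, keyboard_layout])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def matching_environments(target_digest: str, candidates: list) -> list:
    """Return the candidate (username, timezone, layout) tuples whose
    digest equals the one recovered from the sample."""
    return [c for c in candidates if environment_digest(*c) == target_digest]


if __name__ == "__main__":
    # Digest as it might be recovered from a sample (computed here for the demo).
    target = environment_digest("j.doe", "Europe/Berlin", "de-DE")
    guesses = [
        ("analyst", "UTC", "en-US"),
        ("j.doe", "Europe/Berlin", "de-DE"),
    ]
    print(matching_environments(target, guesses))
```

If the intended target is not among the candidates, the search returns nothing, which is why such checks are so hard to defeat without knowing the expected environment.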
Related to this is the inverse scenario where the malware detects that the environment is most likely an artificial analysis environment. This can be the result of checks such as:
If network usage statistics of the system are too low, then don’t do anything
If ‘recently used documents’ are almost empty, then don’t do anything
If the number of processes is < x, then don’t do anything.
How to Defeat Context-Aware Evasion Techniques
Of the three categories of sandbox evasion techniques we have blogged about, context-aware malware is the least sensitive to the underlying malware sandbox technology. As sandbox technology improves and finds ways to circumvent sandbox detection, environmental triggers will become increasingly important to malware authors.
It is critical for security teams to ensure they are using target analysis environments that accurately replicate in every detail the actual desktop and server environments they are protecting. Furthermore, as we wrote previously, it’s important to have pseudo-random attributes as part of the target analysis environment.
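As a rough sketch of what pseudo-random attributes could look like, the Python example below layers randomized values on top of a fixed base image description. Every attribute name, value list, and range is hypothetical. A seed is used so that a given analysis run can be reproduced later, and `random.Random` is used for variety only, not for anything security-sensitive.

```python
import random

# Hypothetical value pools; in practice these would mirror your own naming schemes.
HOST_PREFIXES = ["WS", "LT", "DESK", "FIN", "HR"]
USERNAMES = ["a.meyer", "j.doe", "s.patel", "m.rossi", "k.tanaka"]


def randomized_profile(seed: int, base: dict) -> dict:
    """Return a copy of `base` with pseudo-random environment attributes added."""
    rng = random.Random(seed)  # seeded so the same run can be recreated
    profile = dict(base)
    profile.update({
        "hostname": f"{rng.choice(HOST_PREFIXES)}-{rng.randrange(1000, 10000)}",
        "username": rng.choice(USERNAMES),
        "uptime_hours": rng.randint(6, 24 * 14),
        "recent_documents": rng.randint(15, 120),
        "mac_suffix": ":".join(f"{rng.randrange(256):02x}" for _ in range(3)),
    })
    return profile


if __name__ == "__main__":
    print(randomized_profile(seed=42, base={"os": "Windows 10", "image": "gold-desktop"}))
```

Values drawn this way only help if the ranges resemble real endpoints, so the pools and bounds are best derived from production inventory data.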
Generic sandboxes running identical standard target environments are no longer sufficient. Further, the analysis environment needs to be able to detect environment queries and identify hidden code branches. VMRay Analyzer has the ability to randomize analysis environments, including when desktop or server gold images are used as the targets. Additionally, VMRay Analyzer will flag when malware is making environment queries. Combined, these ensure that security teams get the full picture and know when they are dealing with context-aware malware.