Test Your DFIR Tools: Sysmon Edition
As a defender I am continuously testing, tuning and re-testing a plethora of detection ideas across many complementary detection frameworks. However, a skilled DFIR practitioner values the confidence gained from cross-validating one tool's results with those produced by similar tools. Over the past nine months I have spent significant time researching new obfuscation and evasion techniques, and a good portion of this time I have spent validating the effects of these techniques on numerous detection artifacts and tool sets. This blog post highlights a bug I found in Sysmon's event logging that contaminates process command line argument logging and adversely affects at least two different tools used for viewing Windows event logs.
First of all, I have been a fan of using Sysmon in my personal testing lab setup since its original release in 2014. Sysmon (System Monitor) is part of Microsoft's Sysinternals Suite and was written by Mark Russinovich (@markrussinovich) -- thanks, Mark! The Sysmon driver installs as a service and logs numerous Windows events to the Microsoft-Windows-Sysmon/Operational event log. Most recently updated on January 5, 2018, v7.01 supports twenty-two different Event IDs ranging from process execution events (EID 1 & 5), network connection events (EID 3), image load events (EID 7), named pipe events (EID 17 & 18), WMI events (EID 19, 20 & 21), all the way to registry events and much more!
Over the years there have been numerous blog posts written on using Sysmon as a data collection source for endpoint visibility and threat hunting. In addition, Sysmon configurations such as @SwiftOnSecurity's sysmon-config project (https://github.com/SwiftOnSecurity/sysmon-config) have popularized the filtering capabilities that Sysmon supports for data collection tuning. Finally, Microsoft recently published the Sysinternals Sysmon Suspicious Activity Guide (https://blogs.technet.microsoft.com/motiba/2017/12/07/sysinternals-sysmon-suspicious-activity-guide/) which serves as an even better overview than what I am attempting to convey here.
At the end of the day, an increasing number of defenders rely on Sysmon for some level of endpoint visibility. It is therefore worth cross-comparing Sysmon as a data source with similar but officially supported (by Microsoft) data sources before one begins or continues investing in detection logic built on data logged and collected by Sysmon. In particular, I encourage testing both the data source and the tooling used to query, aggregate and analyze that data source.
Let's begin!
Take the following example that sets a command in an environment variable envVar and then executes the contents of that environment variable in a child process:
cmd.exe /c "set envVar=echo TESTING&&cmd.exe /c %envVar%"
When viewing this Sysmon EID 1 event in EventVwr.exe you will notice the CommandLine field has replaced the single percent signs with duplicate percent signs:
I initially found this strange but did not think much of it until I started reviewing logs for payloads that used randomly-named environment variables, particularly variable names beginning with an integer like the following (simply replacing envVar with 1337 for the environment variable name):
cmd.exe /c "set 1337=echo TESTING&&cmd.exe /c %1337%"
The result of this command in EventVwr.exe was surprising to say the least:
There seems to be an escaping bug with percent signs in Sysmon EID 1's CommandLine field that is rendering incorrect data when viewed with EventVwr.exe.
The next step I took was to validate this data source with another tool, so I naturally turned to PowerShell and its Get-WinEvent cmdlet and queried this Sysmon event. It too returned the same incorrect value.
I then started testing cmd.exe /c "echo %1", cmd.exe /c "echo %2", cmd.exe /c "echo %3", etc. to see what additional values were returned in the rendered events. Sometimes the values correlated to the n'th value in the remainder of the event log (i.e. %2 --> ProcessGuid, %3 --> ProcessId, etc.), but other times the values were seemingly random system messages.
Recalling cmd.exe's command line argument limit of 8,191 characters I then wondered: Does the CommandLine field in the Sysmon event log enforce this same character limit?
To test this I executed the following command, which uses cmd.exe's || operator. This operator is the opposite of && in that it only runs the remainder of the command if the previous command FAILS, so it is a convenient way to append garbage text to a command that should never actually execute:
cmd.exe /c "echo PUT_EVIL_COMMANDS_HERE||%1"
Below you can see the incorrect CommandLine value rendered by EventVwr.exe and PowerShell's Get-WinEvent cmdlet:
So this user-controlled input of "%1" (2 characters) produces "Incorrect function." (19 characters) as output in the event log. Running fifty instances of "%1" (cmd.exe /c "echo PUT_EVIL_COMMANDS_HERE||%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1%1") produces:
Taking this just a few steps further from 50 to 85 instances of "%1" seems to do the trick, rendering different gibberish upon each viewing of the given Sysmon EID 1 event via EventVwr.exe, sometimes even displaying a "complete" log from another event log source (like a PowerShell EID 800 event):
When querying these events with PowerShell's Get-WinEvent we get the following error for each individual event with this level of padding (i.e. not returning any event information for these affected events but properly returning all "normal" events with their respective event data):
These results hold true on Windows 7 through Windows 10, as well as PowerShell versions 2 through 5.
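For anyone who wants to repeat this padding test against their own tool set, the padded commands above can be generated rather than typed by hand. The following is a minimal Python sketch; the function names are hypothetical, and the length estimate simply assumes that every "%1" renders as "Incorrect function." the way the single-instance test did, which may not hold for every viewer.

```python
# Minimal sketch: build the benign padded test command used in this post.
# Function names are illustrative and not part of any tool.

RENDERED_FOR_PERCENT_1 = "Incorrect function."  # value observed for a single %1


def padded_test_command(count: int, token: str = "%1") -> str:
    """Return the test command with `count` copies of `token` after the || operator."""
    return 'cmd.exe /c "echo PUT_EVIL_COMMANDS_HERE||' + token * count + '"'


def estimated_rendered_padding(count: int, rendered: str = RENDERED_FOR_PERCENT_1) -> int:
    """Estimate how many characters the padding becomes if every token
    is rendered as `rendered` (an assumption, see the note above)."""
    return count * len(rendered)


if __name__ == "__main__":
    for n in (1, 50, 85):
        command = padded_test_command(n)
        print(n, len(command), estimated_rendered_padding(n))
```

Run the generated command in a lab only, then compare how each of your log viewers renders the resulting Sysmon EID 1 event.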
However, continuing this validation testing I found other tools that parse .evt/.evtx files that correctly render these percent + integer padding values. For example, Microsoft's Log Parser and FireEye's Redline forensic analysis tools (both free!) are not duped by Sysmon's lack of proper escaping for the CommandLine field but properly render the correct value of "%1", "%1337", etc.
So what's the point? Well, an attacker can effectively bury/hide malicious process execution events in Sysmon EID 1 by padding the command with enough percent + integer strings, provided the defender is using an affected analysis tool like EventVwr.exe or PowerShell's Get-WinEvent to query the events.
If you as a defender rely on EventVwr.exe or PowerShell's Get-WinEvent cmdlet to query Sysmon EID 1 events as your source of process execution events then you will want to handle these exceptions (a try/catch block will be your friend in the PowerShell scenario) and parse the data with an unaffected tool when these exceptions arise. Alternatively, process command line arguments can be obtained through the officially supported Security event log's EID 4688.
Additionally, after testing your tool set to ensure it is not affected, looking for high counts of these bogus percent + integer values becomes an interesting signal for potential evasion attempts.
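A minimal Python sketch of such a signal follows, assuming you already have the correct, literal CommandLine value from an unaffected tool. The function names and the default threshold of 10 are illustrative assumptions, not tested detection guidance, so tune the threshold against your own baseline.

```python
# Minimal sketch: count percent + integer tokens (e.g. %1, %1337) in a command line.
import re

# A percent sign followed by one or more digits.
PERCENT_INT = re.compile(r"%\d+")


def count_percent_int_tokens(command_line: str) -> int:
    """Return how many percent + integer tokens appear in the command line."""
    return len(PERCENT_INT.findall(command_line))


def is_suspicious(command_line: str, threshold: int = 10) -> bool:
    """Flag command lines at or above `threshold` tokens.

    The default threshold is an arbitrary placeholder for illustration.
    """
    return count_percent_int_tokens(command_line) >= threshold


if __name__ == "__main__":
    benign = 'cmd.exe /c "set 1337=echo TESTING&&cmd.exe /c %1337%"'
    padded = 'cmd.exe /c "echo PUT_EVIL_COMMANDS_HERE||' + "%1" * 85 + '"'
    print(count_percent_int_tokens(benign), is_suspicious(benign))
    print(count_percent_int_tokens(padded), is_suspicious(padded))
```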
In conclusion: Do I still think Sysmon is really awesome? Yes, you bet! Would I make it my sole source of process execution event visibility? No, I would use Security EID 4688 or another officially supported mechanism for capturing command line arguments in real-time. But if I had to use Sysmon EID 1 until I migrated to something else (or until this bug is fixed) then I would test the tools I use to query these logs to ensure they properly parse Sysmon's unescaped percent characters so I do not miss events or write detection rules based off of this improper escaping.
As responsible defenders let's keep testing, tuning, validating, re-validating and breaking assumptions as we continually refine our detection capabilities and gain well-founded confidence in the tools we use to accomplish our goals.
NOTE: I first came across this bug in September 2016 during my PowerShell obfuscation research while testing output from the Invoke-Obfuscation framework's cmd.exe LAUNCHER option. Before releasing Invoke-Obfuscation I modified this LAUNCHER option to not accidentally reproduce this issue so its discovery might be prolonged while the Sysmon bug was serviced. I reported this bug to the appropriate team at Microsoft in September 2016 and once again in June 2017. I received confirmation for both email notifications regarding this escaping bug within Sysmon. A year and a half later the latest version (v7.01) still contains this bug. After encountering it in three separate research phases over this time period, I found it to be a simple but compelling example that should spur us as defenders to test our data collection and analysis tools before we begin relying on them for detection purposes. And we should continue to share relevant findings with the tool author(s)/owner(s) so they can be improved for the DFIR community that relies on these tools.