This is the fifth post from Google Cloud Principal Security Strategist John Stoner as part of his deep-dive "New to Chronicle" series, which helps propel forward security teams either new to SIEM or replacing their SIEM with Chronicle.
I’ve been holding back on writing this one for a little while because before diving into this topic, it is important to understand the foundational elements of rules. So for anyone following along and wanting to go beyond basic string matches using equal and not equal to, thank you for your patience and I hope to reward that!
There is a lot to say about functions in general, and regular expression functions more specifically, so we will not cover all functions in this single post. In fact, we aren’t even going to cover all of the regular expression functions today. Stick with me, more to come, but I hope the next few in this blog series will increase your overall functionality! (Sorry, not sorry.)
Since this is our first step in functions, we will provide a brief overview of them in general. Functions are used today primarily in the events section of YARA-L rules. They can also be used in the outcome section as part of a condition. While there are functions for working with strings, dates, IPv4 and IPv6 addresses and numeric values, today our focus will be on matching using regular expression functions.
Before we dive in, here is my brief disclaimer. For folks who eat regular expressions for breakfast and enjoy debating the differences between re2 (which we use) and other regex flavors out there (and you know who you are), you may have a better method for handling these strings in regular expressions and that’s cool. I don’t do either of these things, but I do like a good bagel with scallion cream cheese, so if my regular expressions aren’t as optimized as they could be, hopefully you will still appreciate what we are trying to do. Onward!
If you recall back in the single-event rule blog post we used one method to apply regular expressions to our search. If you don't remember, check out the post.
Regular expressions can be conveyed in the following manner, as we’ve already discussed.
$event.target.process.command_line = /reg.*query HKLM \/f password \/t REG_SZ \/s/ nocase
The same match can also be expressed with the re.regex function. Either way, our intent is to find matching strings within a UDM event. One important distinction to call out is that if we are performing regular expression matching in a search, we must use the above syntax. Functions are currently used in the rules engine, as mentioned earlier.
Using the re.regex function, the equivalent criteria below will return the same results.
re.regex($event.target.process.command_line, `reg.*query HKLM /f password /t REG_SZ /s`) nocase
The differences in the function approach are the re.regex function name, the parentheses and the comma. Notice that we name the function and enclose the field and regular expression in parentheses while separating them with a comma. The re.regex function does not use a comparison either, so no equal signs will be seen unless they are part of the regex string. For other regular expression functions, that isn’t the case, but you will have to wait until next time to see those in action.
We will enclose the regular expression in back quotes ( ` ) and not forward slashes ( / ) like we did in our initial example. Because forward slashes are not used to indicate a regular expression in the function, we don’t need to use the backslash to escape the forward slashes found in the string. We can still use notation like .* to indicate zero or more characters (including spaces) between reg and query in both examples (and yes I realize .* is a bit greedy).
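If you want to poke at this pattern outside of Chronicle, here is a minimal sketch using Python's re module as a stand-in. That is an assumption on my part for illustration: Chronicle uses re2, not Python's engine, although the two should treat a pattern this simple the same way. The sample command lines are made up, and re.IGNORECASE plays the role of nocase.

```python
import re

# Same pattern as the re.regex example. A Python raw string plays the role
# of the back quotes, so the forward slashes need no escaping.
PATTERN = re.compile(r"reg.*query HKLM /f password /t REG_SZ /s", re.IGNORECASE)

# Hypothetical command lines, invented for this illustration.
samples = [
    "reg query HKLM /f password /t REG_SZ /s",
    "REG.EXE  QUERY HKLM /f password /t REG_SZ /s",
    "regquery HKLM /f password /t REG_SZ /s",
    "reg query HKCU /f password /t REG_SZ /s",
]

# re.search looks for the pattern anywhere in the string.
results = [bool(PATTERN.search(s)) for s in samples]
for sample, hit in zip(samples, results):
    print(hit, sample)
```

The third sample is the interesting one: because .* is happy to match nothing at all, "regquery" with no space still counts as a match.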
Let’s look at another example. Suppose we want to determine when a process is triggered from PowerShell. We could use something like this:
$event.principal.process.file.full_path = /powershell/ nocase
Although this criterion is extremely broad, it might be enough to get us started. However, if we wanted greater specificity around the conditions that must be met before a rule is triggered, we should probably proceed with one of these options. Again, these are equivalent.
$event.principal.process.file.full_path = /System32\\WindowsPowerShell\\v1\.0\\powershell\.exe/ nocase
re.regex($event.principal.process.file.full_path, `System32\\WindowsPowerShell\\v1\.0\\powershell\.exe`) nocase
The first thing that you may have noticed is that we didn’t start our match with a drive letter, even though all values for this field start with C:\Windows\. This is because the regular expression can match anywhere within the field value, so .* or other notation is not needed to accommodate leading or trailing text around the match.
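Here is a small sketch of that "match anywhere" behavior, again leaning on Python's re module as an assumed stand-in for re2. The path value is a hypothetical example, and Python's search and fullmatch are only there to contrast substring matching with whole-string matching; they are not Chronicle functions.

```python
import re

# Hypothetical value for principal.process.file.full_path.
full_path = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"

pattern = re.compile(
    r"System32\\WindowsPowerShell\\v1\.0\\powershell\.exe", re.IGNORECASE
)

# Substring search: succeeds even though the pattern omits the drive letter
# and the Windows folder at the start of the path.
found_anywhere = pattern.search(full_path) is not None

# Whole-string match: fails for the same pattern, because the leading part
# of the path is not covered by the expression.
matches_whole = pattern.fullmatch(full_path) is not None

print(found_anywhere, matches_whole)
```

The rule criteria above behave like the first case, which is why no leading .* is needed.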
Let’s talk about those backslashes.
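The backslash is the escape character in regular expressions, so to match a literal backslash in a file path, we need to escape it with another backslash. Hence the double backslash. The period also has special meaning in regular expressions, so if we want it to be treated as a period, we need to put a backslash in front of it as well.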
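To see why escaping the period matters, here is a quick sketch, once more using Python's re module as an assumed stand-in for re2. Both paths are hypothetical, and the lookalike is deliberately silly: it has other characters sitting where the periods belong.

```python
import re

# Hypothetical paths, invented for this illustration.
real_path = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"
lookalike = r"C:\Windows\System32\WindowsPowerShell\v1x0\powershellXexe"

# Periods escaped: each one must match a literal period.
escaped = re.compile(r"WindowsPowerShell\\v1\.0\\powershell\.exe")

# Periods left bare: each one matches any single character.
unescaped = re.compile(r"WindowsPowerShell\\v1.0\\powershell.exe")

escaped_hits = [bool(escaped.search(p)) for p in (real_path, lookalike)]
unescaped_hits = [bool(unescaped.search(p)) for p in (real_path, lookalike)]

print("escaped:  ", escaped_hits)
print("unescaped:", unescaped_hits)
```

The unescaped version still finds the real path, which is why this kind of mistake is easy to miss. It just also matches things we never intended.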
Alright, we have a reasonably precise line of event criteria to isolate on PowerShell execution. Perhaps some of you are thinking, great, but what about PowerShell running from the integrated scripting environment (ISE)? And what about PowerShell being executed with the 32-bit executable rather than the 64-bit version? How can our event criteria align to this while still maintaining precision?
A quick sidenote: You may be thinking that /powershell/ is enough for our rule in this example and that all of this extra tuning is just overhead. In some cases, that may be true. But keep in mind these concepts can be applied throughout rules, so while /powershell/ may be sufficient here, it may be valuable in other circumstances to more finely tune our event criteria.
The good news is that our regular expression can be extended to take into account these file folders and names. Below are two examples of 32-bit PowerShell executables with their associated file location. 64-bit PowerShell would be seen in System32 which is what our regular expression accounted for. The 32-bit instance is in the SysWOW64 folder and the ISE executable appends _ise to powershell.
We could create an OR statement to account for these permutations, but we can also modify our regular expression to account for those different folder and file names by introducing parentheses and a pipe to provide the option of matching system32 OR syswow64 at the folder level and powershell.exe OR powershell_ise.exe at the file level. The (system32|syswow64) and (|\_ise) groups in the criteria below illustrate these changes.
$event.principal.process.file.full_path = /(system32|syswow64)\\WindowsPowerShell\\v1\.0\\powershell(|\_ise)\.exe/ nocase
re.regex($event.principal.process.file.full_path, `(system32|syswow64)\\WindowsPowerShell\\v1\.0\\powershell(|\_ise)\.exe`) nocase
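A quick way to convince ourselves that the grouped expression covers all four permutations is to run it against some sample values. This sketch uses Python's re module as an assumed stand-in for re2, and the paths are hypothetical examples built from the folder and file names described above, plus one unrelated executable as a sanity check.

```python
import re

# The same expression used in the criteria above; re.IGNORECASE stands in
# for nocase.
pattern = re.compile(
    r"(system32|syswow64)\\WindowsPowerShell\\v1\.0\\powershell(|\_ise)\.exe",
    re.IGNORECASE,
)

# Hypothetical paths, invented for this illustration.
paths = [
    r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
    r"C:\Windows\SysWOW64\WindowsPowerShell\v1.0\powershell.exe",
    r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell_ise.exe",
    r"C:\Windows\SysWOW64\WindowsPowerShell\v1.0\powershell_ise.exe",
    r"C:\Windows\System32\cmd.exe",
]

matched = [bool(pattern.search(p)) for p in paths]
for path, hit in zip(paths, matched):
    print(hit, path)
```

The empty option before the pipe in (|\_ise) is what lets plain powershell.exe keep matching alongside powershell_ise.exe.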
You may be wondering if we can use double quotes ( " ) as well; after all we use double quotes with string matches. While we can, it is important to understand the difference between the behavior of back quotes versus double quotes. Back quotes are going to interpret all characters literally, which for a regular expression, makes sense. The use of double quotes as regular expression constants creates some additional complexity that you may want to avoid. Complexity, you say? Take our PowerShell regular expression as an example and see what is required when using double quotes.
re.regex($event.principal.process.file.full_path, "System32\\\\WindowsPowerShell\\\\v1\\.0\\\\powershell\\.exe") nocase
I don’t often find myself quoting documentation, but the Chronicle YARA-L 2.0 language syntax reference says it best. “Note that for double quote string constants, you must escape backslash characters with backslash characters, which can look awkward.”
Truer words have never been spoken.
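If you want to see the same awkwardness outside of YARA-L, Python has a close analogy, and I stress analogy: this is not how Chronicle implements its string constants. A Python raw string passes backslashes through untouched, much like back quotes, while a regular string needs every backslash doubled, much like double quotes. The path below is a hypothetical example.

```python
import re

# Raw string: backslashes reach the regex engine untouched (like back quotes).
raw_pattern = r"System32\\WindowsPowerShell\\v1\.0\\powershell\.exe"

# Regular string: every backslash must be doubled first (like double quotes).
quoted_pattern = "System32\\\\WindowsPowerShell\\\\v1\\.0\\\\powershell\\.exe"

# Both spellings produce exactly the same regular expression text.
same_pattern = raw_pattern == quoted_pattern

# Hypothetical path, invented for this illustration.
full_path = r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"
quoted_matches = re.search(quoted_pattern, full_path, re.IGNORECASE) is not None

print(same_pattern, quoted_matches)
```

Same expression, twice the backslashes. I know which one I would rather read six months from now.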
I hope this gives you a better understanding of how to use regular expressions when writing rules. Remember that there are two different methods to do this and both will work effectively. Next time we will look at more regular expression functions!