Build an Azure Sentinel lab - part four: free endpoint detection with Sysmon

Learn how to build a free endpoint detection capability using Sysmon and MITRE ATT&CK

November 29, 2024

Endpoint detection tooling is central to establishing a threat detection capability to protect enterprise networks. However, it is expensive, with costs driven by the type of tooling purchased and the complexity of the corporate network.

Fortunately, we can use Sysmon to build a free endpoint detection solution that, if properly tuned, can broadly match the detection capabilities of paid tools.

As mentioned in part one, our lab is configured to deploy a special Sysmon configuration that collects log data based on known adversary behaviours, as documented within the open-source MITRE ATT&CK framework.

The magic happens in two of our Bicep modules. The first is the main.bicep deployment script, where we configure a data collection rule to automatically send Sysmon telemetry from virtual machines to our Azure Sentinel instance.

The second is the vm.bicep script, where we use a virtual machine (VM) extension to run a public, GitHub-hosted PowerShell file that downloads, configures and installs the Sysmon driver on the virtual machine. Then, we use a dataCollectionRuleAssociations resource to connect the VM’s Azure Monitor Agent (which we installed via another VM extension) to the Sysmon data collection rule we defined in main.bicep.

Using this Sysmon and ATT&CK combination in our lab, we can detect around 100 known adversary techniques at no cost. However, Azure Sentinel presents some telemetry parsing challenges that we must correct before making full use of our Sysmon data.

Sysmon telemetry optimisation

In our lab, Sysmon telemetry is collected within Sentinel’s SecurityEvent table. Specifically, Sysmon logs are retrieved by running the below query:

SecurityEvent | where EventSourceName contains "Sysmon"

The challenge is that the data is returned in an unparsed XML format, as seen in the picture below:

Screenshot of default Sysmon data in Sentinel

This makes it hard to interpret the data. Moreover, it forces us to write unnecessarily complicated queries, as seen below:

Screenshot of default Sysmon query in Sentinel

Fortunately, we can use Sentinel Functions to improve the default parsing of Sysmon data. Moreover, by using a parser we can map Sysmon fields to a vendor-neutral schema, such as OSSEM, to ensure our queries remain (broadly) portable across different SIEM platforms.

A good Sysmon parser is available in the Sentinel-ATT&CK GitHub repository. The parser extracts and normalises Sysmon event data for analysis. It handles various event types (e.g. process creation, file creation, network connections) by mapping Sysmon data fields to OSSEM standards. The parser organizes details like process IDs, command lines, file hashes, and network connections while correlating them with MITRE ATT&CK techniques.
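To illustrate the kind of work such a parser does, here is a minimal Python sketch (not the Sentinel-ATT&CK parser itself, which is written in KQL). It flattens a simplified, namespace-free Sysmon EventData fragment into named fields and splits an ATT&CK-tagged RuleName into its parts. The sample event, the FIELD_MAP names and the "technique_id=...,technique_name=..." RuleName layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical EventData for a Sysmon process creation event.
SAMPLE_EVENT_DATA = """
<EventData>
  <Data Name="RuleName">technique_id=T1059.001,technique_name=PowerShell</Data>
  <Data Name="ProcessId">4242</Data>
  <Data Name="Image">C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe</Data>
  <Data Name="CommandLine">powershell.exe -nop -w hidden</Data>
</EventData>
"""

# Illustrative mapping from Sysmon field names to OSSEM-style names (assumed).
FIELD_MAP = {
    "ProcessId": "process_id",
    "Image": "process_path",
    "CommandLine": "process_command_line",
}


def parse_event_data(xml_text: str) -> dict:
    """Flatten <Data Name="..."> elements into a dict with normalised keys."""
    root = ET.fromstring(xml_text.strip())
    raw = {d.get("Name"): (d.text or "") for d in root.iter("Data")}
    parsed = {FIELD_MAP[k]: v for k, v in raw.items() if k in FIELD_MAP}
    # Split the assumed "key=value,key=value" RuleName into separate fields.
    for pair in raw.get("RuleName", "").split(","):
        key, sep, value = pair.partition("=")
        if sep:
            parsed[key.strip()] = value.strip()
    return parsed


row = parse_event_data(SAMPLE_EVENT_DATA)
print(row["technique_id"], row["process_path"])
```

Once each event is a flat row with predictable column names, a detection only needs to compare columns rather than dig through XML inside every query.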

By storing the parser as a Sentinel function, we can view Sysmon data in a more readable format. Moreover, by referencing the parser, we can write simpler KQL queries as seen below:

The parser can be manually stored in Sentinel by copy-pasting the parser code into Sentinel’s log blade and then clicking the Save > Save as function buttons. However, there is an alternative method to automatically deploy the Sysmon parser through Bicep.

Automating Sysmon parser deployment

Our Sysmon parser can be embedded within a Bicep deployment script by using a savedSearches resource. In our lab deployment script, we do so within the sentinel.bicep script. Under the solution resource definition, we add the below code:

// Store Sysmon parser
resource sysmonquery 'Microsoft.OperationalInsights/workspaces/savedSearches@2020-08-01' = {
  parent: workspace
  name: 'Sysmon'
  properties: {
    displayName: 'Sysmon'
    category: labname
    functionAlias: 'Sysmon'
    query: '*** Add parser code in here ***'
    version: 2
  }
}

Note that the parser must be stored within the query property in the right format. Specifically, quotes within the query must be escaped, and all newlines and carriage returns must be converted to the \n and \r escape sequences. The deployment will keep failing if the query is not stored in the right format.
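If you would rather format the parser yourself, the escaping can be scripted. The Python sketch below is a hypothetical helper, not part of the lab's code. It turns a raw KQL query into a single-quoted Bicep string literal, assuming Bicep's documented escape sequences for backslashes, single quotes, newlines, carriage returns, tabs and the `${` interpolation marker.

```python
def to_bicep_string(query: str) -> str:
    """Return the query as a single-quoted, single-line Bicep string literal."""
    escaped = (
        query.replace("\\", "\\\\")   # backslashes first, so later escapes survive
        .replace("'", "\\'")          # single quotes delimit Bicep strings
        .replace("\r", "\\r")         # carriage returns
        .replace("\n", "\\n")         # newlines
        .replace("\t", "\\t")         # tabs
        .replace("${", "\\${")        # avoid accidental string interpolation
    )
    return "'" + escaped + "'"


# Example with a two-line query containing single quotes.
sample = "SecurityEvent\r\n| where EventSourceName contains 'Sysmon'"
print(to_bicep_string(sample))
```

The printed literal can be pasted directly as the value of the query property. Treat the output as a starting point and validate it with a test deployment.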

The fastest way to retrieve a properly formatted Sysmon function is to use Sentinel’s Export template functionality.

First, manually save the function within Sentinel. Then, navigate to the Configuration blade and click Settings. Within the Settings menu, click on Workspace settings, then Automation, then Export template. Azure will then automatically generate a template to export your Sysmon configuration.

You will immediately notice that the generated template is quite large. To retrieve the formatted function, run a search using Sysmon as a search key. The whole process can be seen in the below GIF:

With the parser installed and automatically deployed, you can now develop Sysmon endpoint detection rules more easily.

Writing endpoint detections

The combined use of Sysmon and our ATT&CK parser brings several advantages from the perspective of writing detections. First, with a simpler syntax, it becomes easier for analysts to focus on building detection logic, rather than deal with complex, in-query parsing. Over time, this saves large amounts of engineering calories.

Secondly, by leveraging a Sysmon configuration mapped to ATT&CK, analysts can save further calories by simply referencing ATT&CK techniques directly within their detection queries. This can also aid threat hunting or detection tuning efforts as analysts can simply reference specific ATT&CK techniques to inspect what Sentinel is collecting, as seen below:

Finally, by leveraging a parser mapped to OSSEM, analysts can make use of large, open-source detection repositories available online. These can be used almost out of the box (provided sufficient tuning is done) in order to replicate (as much as is realistically possible) the behaviour of paid endpoint detection tools.

Tuning endpoint detections

The Sysmon configuration that we deployed will inevitably generate false positives. Tuning detection rules is key to reducing false positive rates. Within our lab, we can employ two fine-tuning strategies.

The first is to use the Sysmon configuration file itself. Sysmon configurations are XML-based and structured with RuleGroup elements that use an “or” relation to organize EventType rules. Each EventType can either include or exclude logs based on specific conditions such as is, contains, begin with, or end with. The onmatch="exclude" setting logs everything except explicitly excluded events, while onmatch="include" logs only explicitly included events, with misconfigurations potentially causing missing logs.

EventTypes correspond to specific EventCodes (e.g., ProcessCreate is EventCode 1). Two configuration styles can be employed for rule management. One style involves editing a prebuilt XML (see SwiftOnSecurity’s sysmon-config project for an example) using specific include or exclude rules:

<ProcessCreate onmatch="exclude">

The other style involves using separate XML files for each EventCode (as used within the sysmon-modular project), allowing easier tracking and updates. Rules can be tailored by conditions like matching process images, but care needs to be taken to prevent exploitability:

<Sysmon schemaversion="4.30">
  <EventFiltering>
    <RuleGroup name="" groupRelation="or">
      <PipeEvent onmatch="exclude">
        <Image condition="end with">Program Files\SplunkUniversalForwarder\bin\splunkd.exe</Image>
        <Image condition="end with">Program Files\SplunkUniversalForwarder\bin\splunk.exe</Image>
        <Image condition="end with">Program Files\SplunkUniversalForwarder\bin\splunk-MonitorNoHandle.exe</Image>
      </PipeEvent>
    </RuleGroup>
  </EventFiltering>
</Sysmon>

Directly tuning the Sysmon configuration file produces more efficient results, because Sysmon filters out false positives at the source, before any events are forwarded to Sentinel. The downside is that you may accidentally lose visibility on key data.
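To make the include and exclude semantics (and the exploitability caveat) concrete, the Python sketch below models how a single filter block might be evaluated. It is a deliberately simplified model, not Sysmon's actual implementation. It assumes one filter block per event type, an "or" relation between rules, case-insensitive matching, and only the four conditions mentioned earlier.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the PipeEvent exclusion shown earlier (one Image rule kept).
CONFIG = r"""
<Sysmon schemaversion="4.30">
  <EventFiltering>
    <RuleGroup name="" groupRelation="or">
      <PipeEvent onmatch="exclude">
        <Image condition="end with">Program Files\SplunkUniversalForwarder\bin\splunkd.exe</Image>
      </PipeEvent>
    </RuleGroup>
  </EventFiltering>
</Sysmon>
"""


def condition_matches(condition: str, rule_value: str, event_value: str) -> bool:
    """Evaluate one filter condition, ignoring case (assumed behaviour)."""
    rule_value, event_value = rule_value.lower(), event_value.lower()
    if condition == "is":
        return event_value == rule_value
    if condition == "contains":
        return rule_value in event_value
    if condition == "begin with":
        return event_value.startswith(rule_value)
    if condition == "end with":
        return event_value.endswith(rule_value)
    raise ValueError(f"condition not modelled in this sketch: {condition}")


def is_logged(config_xml: str, event_type: str, event: dict) -> bool:
    """Return True if the event would be logged under this simplified model."""
    root = ET.fromstring(config_xml.strip())
    block = root.find(f".//{event_type}")
    if block is None:
        raise ValueError(f"no {event_type} filter in this configuration")
    matched = any(
        condition_matches(rule.get("condition", "is"), rule.text or "", event.get(rule.tag, ""))
        for rule in block
    )
    # exclude: log everything except matches; include: log only matches.
    return not matched if block.get("onmatch") == "exclude" else matched


genuine = {"Image": r"C:\Program Files\SplunkUniversalForwarder\bin\splunkd.exe"}
spoofed = {"Image": r"C:\Users\Public\Program Files\SplunkUniversalForwarder\bin\splunkd.exe"}
print(is_logged(CONFIG, "PipeEvent", genuine), is_logged(CONFIG, "PipeEvent", spoofed))
```

Under this model, both the genuine path and the spoofed path are dropped, because an "end with" rule only inspects the tail of the path. An attacker who can recreate that folder structure in a writable location would go unlogged, which is why exclusions should be as specific as possible.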

The second, short-term strategy is to exclude known false positives by using Sentinel watchlists. Sentinel watchlists are customizable data collections that enable security teams to reference, correlate or exclude specific information (such as IP addresses, user accounts, or domains) within analytics rules, threat detection, and investigations.

For example, when querying Sysmon DNS events, we may want to exclude DNS queries to known benign domains. To do so, we can create a Sentinel watchlist from a CSV file listing the hosts, process paths and DNS queries that we want to exclude when querying Sysmon DNS event data. In the example CSV file below, we exclude DNS queries to our lab’s domain controller, called alfa-dc-win10.soclab.local.

host,process_path,query_name
alfa-pc13-win10.soclab.local,,alfa-dc-win10.soclab.local

By loading the above data into a Sentinel watchlist called dns_watchlist, we can exclude uninteresting DNS data via queries similar to the below:

let watchlist = (_GetWatchlist('dns_watchlist') | project query_name);
Sysmon
| where EventID == 22
| where dns_query_name !in (watchlist)

As can be seen, the Kusto query retrieves a list of query_name values from the Azure Sentinel watchlist named dns_watchlist and filters Sysmon logs with EventID == 22 to exclude any entries where the dns_query_name matches a value in the watchlist. By using the above technique, we reduce the amount of noise in the returned data while preserving the original telemetry collected by Sysmon. Naturally, you can use the above technique within all of your Azure Sentinel detection rules.
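The same exclusion logic can be prototyped offline before a watchlist is uploaded. The Python sketch below mirrors the query above using the example CSV. The sample events are made up for illustration, and the comparison is case-sensitive, as with KQL's !in operator.

```python
import csv
import io

# The example watchlist CSV from above.
WATCHLIST_CSV = """host,process_path,query_name
alfa-pc13-win10.soclab.local,,alfa-dc-win10.soclab.local
"""


def load_excluded_queries(csv_text: str) -> set:
    """Collect the non-empty query_name values, like '| project query_name'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["query_name"] for row in reader if row["query_name"]}


def filter_dns_events(events: list, excluded: set) -> list:
    """Keep Sysmon DNS events (EventID 22) whose query is not in the watchlist."""
    return [
        e for e in events
        if e["EventID"] == 22 and e["dns_query_name"] not in excluded
    ]


# Hypothetical parsed events, using the parser's dns_query_name field.
events = [
    {"EventID": 22, "dns_query_name": "alfa-dc-win10.soclab.local"},
    {"EventID": 22, "dns_query_name": "suspicious.example.com"},
    {"EventID": 1, "dns_query_name": ""},
]
print(filter_dns_events(events, load_excluded_queries(WATCHLIST_CSV)))
```

Only the query to suspicious.example.com survives the filter. Because DNS names are not case-sensitive, it may be worth normalising case on both sides before comparing.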

Conclusion

Implementing Sysmon as a free endpoint detection solution offers a cost-effective way to enhance threat detection capabilities, especially when paired with the MITRE ATT&CK framework and an OSSEM-mapped parser.

By automating the deployment and tuning of Sysmon configurations and leveraging Azure Sentinel features such as watchlists, analysts can streamline data parsing, simplify detection rule creation, and reduce false positives.

While fine-tuning the Sysmon configuration provides direct efficiency gains, using watchlists offers a flexible short-term solution to exclude benign events. Together, these strategies enable organizations to replicate many functionalities of paid endpoint detection tools while saving substantial costs.