Path: blob/master/tutorials-and-examples/example-notebooks/Example - Guided Investigation - Process-Alerts.ipynb
Title: Alert Investigation (Windows Process Alerts)
Notebook Version: 1.0
Python Version: Python 3.10 (including Python 3.10 - SDK v2 - AzureML)
Required Packages: kqlmagic, msticpy, pandas, numpy, matplotlib, networkx, ipywidgets, ipython, scikit_learn
Platforms Supported:
Azure Notebooks Free Compute
Azure Notebooks DSVM
OS Independent
Data Sources Required:
Log Analytics - SecurityAlert, SecurityEvent (EventIDs 4688 and 4624/25)
(Optional) - VirusTotal (with API key)
Description:
This notebook is intended for triage and investigation of security alerts. It is specifically targeted at alerts triggered by suspicious process activity on Windows hosts. Some of the sections will work on other types of alerts but this is not guaranteed.
Warning: Example Notebook - No longer supported!
This notebook is meant to be illustrative of specific scenarios and is not actively maintained. It is unlikely to be runnable directly in your environment. Instead, please use the notebooks in the root of this repo.
Table of Contents
Setup
Make sure that you have installed packages specified in the setup (uncomment the lines to execute)
There are some manual steps up to selecting the alert ID. After this, most of the notebook can be executed sequentially.
Major sections should be executable independently (e.g. Alert Command line and Host Logons can be run skipping Session Process Tree)
Install Packages
The first time this cell runs for a new Azure Notebooks project or local Python environment it will take several minutes to download and install the packages. In subsequent runs it should run quickly and confirm that package dependencies are already installed. Unless you want to upgrade the packages you can feel free to skip execution of the next cell.
If you see any import failures (ImportError) in the notebook, please re-run this cell and answer 'y', then re-run the cell where the failure occurred.
Note you may see some warnings about package incompatibility with certain packages. This does not affect the functionality of this notebook but you may need to upgrade the packages producing the warnings to a more recent version.
Import Python Packages
Get WorkspaceId
To find your Workspace Id go to Log Analytics. Look at the workspace properties to find the ID.
Read Workspace configuration from local config.json for workspace ASIHuntOMSWorkspaceV4
TENANT_ID: 72f988bf-86f1-41af-91ab-2d7cd011db47
SUBSCRIPTION_ID: 40dcc8bf-0478-4f3b-b275-ed0a94f2c013
RESOURCE_GROUP: ASIHuntOMSWorkspaceRG
WORKSPACE_ID: 52b1ab41-869e-4138-9e40-2a4457f09bf0
WORKSPACE_NAME: ASIHuntOMSWorkspaceV4
Authenticate to Log Analytics
If you are using user/device authentication, run the following cell.
Click the 'Copy code to clipboard and authenticate' button.
This will pop up an Azure Active Directory authentication dialog (in a new tab or browser window). The device code will have been copied to the clipboard.
Select the text box and paste (Ctrl-V/Cmd-V) the copied value.
You should then be redirected to a user authentication page where you should authenticate with a user account that has permission to query your Log Analytics workspace.
Use the following syntax if you are authenticating using an Azure Active Directory AppId and Secret:
instead of
Note: you may occasionally see a JavaScript error displayed at the end of the authentication - you can safely ignore this.
On successful authentication you should see a popup schema button.
Get Alerts List
Specify a time range to search for alerts. Once this is set, run the following cell to retrieve any alerts in that time window. You can change the time range and re-run the queries until you find the alerts that you want.
Choose Alert to Investigate
Either pick an alert from a list of retrieved alerts or paste the SystemAlertId into the text box in the following section.
Select alert from list
As you select an alert, the main properties will be shown below the list.
Use the filter box to narrow down your search to any substring in the AlertName.
Or paste in an alert ID and fetch it
Skip this if you selected from the above list
Extract properties and entities from Alert
This section extracts the alert information and entities into a SecurityAlert object allowing us to query the properties more reliably.
In particular, we use the alert to automatically provide parameters for queries and UI elements. Subsequent queries will use properties like the host name and derived properties such as the OS family (Linux or Windows) to adapt the query. Query time selectors like the one above will also default to an origin time that matches the alert selected.
The alert view below shows all of the main properties of the alert plus the extended property dictionary (if any) and JSON representations of the Entity.
Entity Graph
Depending on the type of alert there may be one or more entities attached as properties. Entities are things like Host, Account, IpAddress, Process, etc. - essentially the 'nouns' of security investigation. Events and alerts are the things that link them in actions, and so can be thought of as the verbs. Entities are often related to other entities - for example a process will usually have a related file entity (the process image) and an Account entity (the context in which the process was running). Endpoint alerts typically have a host entity (which could be a physical or virtual machine).
Plot using Networkx/Matplotlib
Related Alerts
For a subset of entities in the alert we can search for any alerts that have that entity in common. Currently this query looks for alerts that share the same Host, Account or Process and lists them below. Notes:
Some alert types do not include all of these entity types.
The original alert will be included in the "Related Alerts" set if it occurs within the query time boundary set below.
The query time boundaries default to a longer period than when searching for the alert. You can extend the time boundary searched before or after the alert time. If the widget doesn't support the time boundary that you want you can change the max_before and max_after parameters in the call to QueryTime below to extend the possible time boundaries.
Show these related alerts on a graph
This should indicate which entities the other alerts are related to.
This can be unreadable with a lot of alerts. Use the matplotlib interactive zoom control to zoom in to part of the graph.
Browse List of Related Alerts
Select an Alert to view details.
If you want to investigate that alert - copy its SystemAlertId property and open a new instance of this notebook to investigate this alert.
Get Process Tree
If the alert has a process entity this section tries to retrieve the entire process tree to which that process belongs.
Notes:
The alert must have a process entity
Only processes started within the query time boundary will be included
Ancestor and descendant processes are retrieved to two levels (i.e. the parent and grandparent of the alert process plus any child and grandchild processes).
Sibling processes are the processes that share the same parent as the alert process
This can be a long-running query, especially if a wide time window is used! Caveat Emptor!
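The two-level ancestor, descendant and sibling lookup described in the notes above can be sketched in plain Python. This is only an illustration, not the notebook's actual query logic: the record layout (pid and ppid keys) and the related_processes helper are assumptions made for the example, and real event data would also have to account for PID reuse over time.

```python
from collections import defaultdict


def related_processes(procs, source_pid, levels=2):
    """Return (ancestors, descendants, siblings) of source_pid as lists of PIDs.

    procs is a list of dicts with hypothetical 'pid' and 'ppid' keys.
    PIDs are assumed to be unique within the time window (a simplification).
    """
    parent_of = {p["pid"]: p["ppid"] for p in procs}
    children_of = defaultdict(list)
    for p in procs:
        children_of[p["ppid"]].append(p["pid"])

    # Walk up: parent, then grandparent, stopping when a parent is not in the data
    ancestors, current = [], source_pid
    for _ in range(levels):
        current = parent_of.get(current)
        if current not in parent_of:
            break
        ancestors.append(current)

    # Walk down one level at a time: children, then grandchildren
    descendants, frontier = [], [source_pid]
    for _ in range(levels):
        frontier = [child for pid in frontier for child in children_of[pid]]
        descendants.extend(frontier)

    # Siblings share the same parent as the source process
    parent = parent_of.get(source_pid)
    siblings = [pid for pid in children_of[parent] if pid != source_pid]
    return ancestors, descendants, siblings


# Made-up process records: 20 is the alert (source) process
procs = [
    {"pid": 1, "ppid": 0},
    {"pid": 10, "ppid": 1},
    {"pid": 20, "ppid": 10},
    {"pid": 21, "ppid": 10},
    {"pid": 30, "ppid": 20},
    {"pid": 40, "ppid": 30},
    {"pid": 50, "ppid": 40},
]
ancestors, descendants, siblings = related_processes(procs, source_pid=20)
print(ancestors, descendants, siblings)
```

Process 50 is three hops below the source, so it falls outside the two-level limit and is not returned.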
The source (alert) process is shown in red.
What's shown for each process:
Each process line is indented according to its position in the tree hierarchy
Top line fields:
[relationship to source process:lev# - where # is the hops away from the source process]
Process creation date-time (UTC)
Process Image path
PID - Process Id
SubjSess - the session Id of the process spawning the new process
TargSess - the new session Id if the process is launched in another context/session. If 0/0x0 then the process is launched in the same session as its parent
Second line fields:
Process command line
Account - name of the account context in which the process is running
Process TimeLine
This shows each process in the process tree on a timeline view.
Labelling of individual processes is very performance intensive and often results in nothing being displayed at all! Besides, for large numbers of processes it would likely result in an unreadable mess.
Your main tools for negotiating the timeline are the Hover tool (toggled on and off by the speech bubble icon) and the wheel-zoom and pan tools (the former is an icon with an ellipse and a magnifying glass, the latter is the crossed-arrows icon). The wheel zoom is particularly useful.
As you hover over each process it will display the image name, PID and commandline.
Also shown on the graphic is the timestamp line of the source/alert process.
Other Processes on Host - Clustering
Sometimes you don't have a source process to work with. Other times it's just useful to see what else is going on on the host. This section retrieves all processes on the host within the time bounds set in the query times widget.
You can display the raw output of this by looking at the processes_on_host dataframe. Just copy this into a new cell and hit Ctrl-Enter.
Usually though, the results return a lot of very repetitive and uninteresting system processes, so we attempt to cluster these to make the view easier to negotiate. To do this we process the raw event list output to extract a few features that render strings (such as commandline) into numerical values. The default below uses the following features:
commandLineTokensFull - this is a count of common delimiters in the commandline (given by this regex r'[\s-\/.,"'|&:;%$()]'). The aim of this is to capture the commandline structure while ignoring variations on what is essentially the same pattern (e.g. temporary path GUIDs, target IP or host names, etc.)
pathScore - this sums the ordinal (character) value of each character in the path (so /bin/bash and /bin/bosh would have similar scores).
isSystemSession - 1 if this is a root/system session, 0 if anything else.
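As a rough illustration of how these three features turn strings into numbers, here is a minimal sketch using only the standard library. The function names are hypothetical, the backslash escaping in the delimiter pattern is a guess at what the quoted regex intends, and the use of logon ID 0x3e7 (the well-known Windows SYSTEM logon session) to detect a system session is an assumption rather than something the notebook text states.

```python
import re

# Delimiter characters modelled on the regex quoted above. The escaping is an
# assumption about the intended pattern, not a copy of the library source.
DELIMITERS = re.compile(r"""[\s\-\\/.,"'|&:;%$()]""")


def delimiter_count(command_line):
    """Count delimiter characters, capturing structure rather than content."""
    return len(DELIMITERS.findall(command_line))


def path_score(path):
    """Sum the ordinal value of every character in the path."""
    return sum(ord(ch) for ch in path)


def is_system_session(logon_id):
    """Return 1 for the SYSTEM logon session (assumed to be 0x3e7), else 0."""
    return 1 if logon_id.lower() == "0x3e7" else 0


# Command lines that differ only in the host name get the same count
print(delimiter_count("updatepatch host1.mydom.com"))
print(delimiter_count("updatepatch host2.mydom.com"))

# Similar paths get similar (but not identical) scores
print(path_score("/bin/bash"), path_score("/bin/bosh"))
```

Because pathScore is a plain character sum, two different paths can occasionally collide on the same score; it is a cheap proxy rather than a unique identifier.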
Then we run a clustering algorithm (DBScan in this case) on the process list. The result groups similar (noisy) processes together and leaves unique process patterns as single-member clusters.
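To show why DBSCAN groups repetitive rows while leaving unusual ones on their own, here is a small pure-Python version of the algorithm written for illustration only. The notebook relies on library code for this step (scikit_learn is in the required packages), so the dbscan function below, the eps value and the sample points are all assumptions. Setting min_samples to 1 is what lets an isolated point form its own single-member cluster instead of being labelled as noise.

```python
from math import dist


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points (including i itself) within eps of point i
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # not a core point; may be claimed later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point previously marked as noise
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, so keep expanding
    return labels


# Made-up feature vectors: three near-identical rows and one outlier
points = [(3.0, 1.0), (3.0, 1.1), (3.1, 1.0), (9.0, 7.0)]
labels = dbscan(points, eps=0.5, min_samples=1)
print(labels)
```

In practice the features have very different ranges (a path score is in the hundreds or thousands, the session flag is 0 or 1), so some scaling or a carefully chosen eps is needed before distances between rows are meaningful.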
Clustered Processes (i.e. processes that have a cluster size > 1)
Variability in Command Lines and Process Names
The top chart shows the variability of command line content for a given process name. The wider the box, the more instances were found with different command line structure.
Note, the 'structure' in this case is measured by the number of tokens or delimiters in the command line and does not look at content differences. This is done so that commonly varying instances of the same command line are grouped together.
For example updatepatch host1.mydom.com and updatepatch host2.mydom.com will be grouped together.
The second chart shows the variability in executable path. This does compare content so c:\windows\system32\net.exe and e:\windows\system32\net.exe are treated as distinct. You would normally not expect to see any variability in this chart unless you have multiple copies of the same name executable or an executable is trying to masquerade as another well-known binary.
The top graph shows that, for a given process, some have a wide variability in their command line content while the majority have little or none. Looking at a couple of examples - like cmd.exe, powershell.exe, reg.exe, net.exe - we can recognize several common command line tools.
The second graph shows processes by full process path content. We wouldn't normally expect to see variation here - as is the case with most. There is also quite a lot of variance in the score, making it a useful proxy feature for unique path name (this means that proc1.exe and proc2.exe that have the same commandline score won't get collapsed into the same cluster).
Any process with a spread of values here means that the same process name (but not necessarily the same file) is being run from different locations.
You can view the cluster members for individual processes by inserting a new cell and entering:
>>> view_cluster(process_name)
where process_name is the unqualified process binary. E.g.
>>> view_cluster('reg.exe')
Time showing clustered vs. original data
Base64 Decode and Check for IOCs
This section looks for Indicators of Compromise (IoC) within the data sets passed to it.
The first section looks at the commandline for the alert process (if any). It also looks for base64 encoded strings within the data - this is a common way of hiding attacker intent. It attempts to decode any strings that look like base64. Additionally, if the base64 decode operation returns any items that look like a base64 encoded string or file, a gzipped binary sequence, a zipped or tar archive, it will attempt to extract the contents before searching for potentially interesting items.
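A single-level version of the base64 step can be sketched as follows. This is a simplified illustration, not the notebook's decoder: the candidate regex, the 20-character minimum and the function names are assumptions, and it does not handle nested encodings or archives. It tries UTF-8 first and falls back to UTF-16 (little-endian) when the result contains NUL characters, which is typical of PowerShell encoded commands.

```python
import base64
import binascii
import re

# Runs of base64 alphabet characters, optionally padded (assumed minimum length 20)
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{20,}={0,2}")


def decode_text(raw):
    """Return (text, encoding), or (None, 'binary') if no decoding looks sane."""
    for encoding in ("utf-8", "utf-16-le"):
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        if "\x00" not in text:
            return text, encoding
    return None, "binary"


def decode_base64_strings(command_line):
    """Decode every base64-looking substring found in the command line."""
    results = []
    for match in B64_CANDIDATE.finditer(command_line):
        candidate = match.group(0)
        try:
            raw = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue  # looked like base64 but was not validly encoded
        text, encoding = decode_text(raw)
        results.append(
            {
                "original_string": candidate,
                "input_bytes": raw,
                "decoded_string": text,
                "encoding_type": encoding,
            }
        )
    return results


# Build a sample command line the way PowerShell's encoded-command option expects
encoded = base64.b64encode("Write-Host hello world".encode("utf-16-le")).decode("ascii")
for item in decode_base64_strings(f"powershell.exe -enc {encoded}"):
    print(item["encoding_type"], item["decoded_string"])
```

Long alphanumeric tokens that happen to be valid base64 will also be decoded, so expect some meaningless results among the genuine ones.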
If we have a process tree, look for IoCs in the whole data set
You can replace the data=process_tree parameter to ioc_extractor.extract() to pass other data frames. Use the columns parameter to specify which column or columns you want to search.
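Conceptually, this kind of extraction runs a set of regular expressions over the chosen columns of every row. The sketch below shows the idea with plain dictionaries standing in for DataFrame rows. The patterns are deliberately simplified, and the extract_iocs function and its output layout are assumptions for illustration, not the behaviour of ioc_extractor.extract().

```python
import re

# Simplified patterns for a few common IoC types (illustrative only)
IOC_PATTERNS = {
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "url": re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE),
    "md5_hash": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha256_hash": re.compile(r"\b[a-fA-F0-9]{64}\b"),
}


def extract_iocs(rows, columns):
    """Return (row_index, column, ioc_type, observable) for every match."""
    found = []
    for index, row in enumerate(rows):
        for column in columns:
            text = str(row.get(column, ""))
            for ioc_type, pattern in IOC_PATTERNS.items():
                for observable in pattern.findall(text):
                    found.append((index, column, ioc_type, observable))
    return found


# Made-up rows using a documentation-only IP address
rows = [
    {"CommandLine": "certutil -urlcache -f http://198.51.100.7/payload.exe out.exe"},
    {"CommandLine": "echo d41d8cd98f00b204e9800998ecf8427e"},
]
for hit in extract_iocs(rows, columns=["CommandLine"]):
    print(hit)
```

Keeping the source row index alongside each observable is what later allows the results to be joined back to the original data.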
If any Base64 encoded strings, decode and search for IoCs in the results.
For simple strings the Base64 decoded output is straightforward. However for nested encodings this can get a little complex and difficult to represent in a tabular format.
Columns
reference - The index of the row item in dotted notation in depth.seq pairs (e.g. 1.2.2.3 would be the 3rd item at depth 3 that is a child of the 2nd item found at depth 1). This may not always be an accurate notation - it is mainly used to allow you to associate an individual row with the reference value contained in the full_decoded_string column of the topmost item.
original_string - the original string before decoding.
file_name - filename, if any (only if this is an item in zip or tar file).
file_type - a guess at the file type (this is currently elementary and only includes a few file types).
input_bytes - the decoded bytes as a Python bytes string.
decoded_string - the decoded string if it can be decoded as a UTF-8 or UTF-16 string. Note: binary sequences may often successfully decode as UTF-16 strings but, in these cases, the decodings are meaningless.
encoding_type - encoding type (UTF-8 or UTF-16) if a decoding was possible, otherwise 'binary'.
file_hashes - collection of file hashes for any decoded item.
md5 - md5 hash as a separate column.
sha1 - sha1 hash as a separate column.
sha256 - sha256 hash as a separate column.
printable_bytes - printable version of input_bytes as a string of \xNN values
src_index - the index of the row in the input dataframe from which the data came.
full_decoded_string - the full decoded string with any decoded replacements. This is only really useful for top-level items, since nested items will only show the 'full' string representing the child fragment.
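The hash columns and printable_bytes are straightforward to reproduce for any decoded byte sequence. The following is a minimal sketch using the standard library; the helper names are hypothetical and the exact formatting used in the notebook output may differ.

```python
import hashlib


def file_hashes(data):
    """Return the md5, sha1 and sha256 hex digests of a bytes value."""
    return {name: hashlib.new(name, data).hexdigest() for name in ("md5", "sha1", "sha256")}


def printable_bytes(data):
    """Render bytes as a string of \\xNN values."""
    return "".join(f"\\x{byte:02x}" for byte in data)


sample = b"MZ\x90\x00"  # made-up input: the first bytes of a Windows executable header
print(file_hashes(sample))
print(printable_bytes(sample))
```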
Virus Total Lookup
This section uses the popular VirusTotal service to check any recovered IoCs against VT's database.
To use this you need an API key from VirusTotal, which you can obtain here: https://www.virustotal.com/.
Note that VT throttles requests for free API keys to 4/minute. If you are unable to process the entire data set, try splitting it and submitting smaller chunks.
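One simple way to stay within the 4 requests per minute limit is to submit observables in batches of four and pause between batches. The sketch below assumes a hypothetical submit callable that performs a single lookup; it is not part of the VTLookup class, and the sleep function is passed in so that the pacing can be replaced in tests.

```python
import time


def batches(items, size=4):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def submit_throttled(observables, submit, per_minute=4, sleep=time.sleep):
    """Call submit() on each observable, pausing 60 seconds between batches."""
    results = []
    for number, batch in enumerate(batches(observables, per_minute)):
        if number:
            sleep(60)  # wait out the rate-limit window before the next batch
        results.extend(submit(item) for item in batch)
    return results


# Demonstration with a stand-in submit function and no real waiting
observables = [f"host{n}.example.com" for n in range(10)]
pauses = []
results = submit_throttled(observables, submit=str.upper, sleep=pauses.append)
print(len(results), "lookups,", len(pauses), "pauses")
```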
Things to note:
Virus Total lookups include file hashes, domains, IP addresses and URLs.
The returned data is slightly different depending on the input type
The VTLookup class tries to screen input data to prevent pointless lookups. E.g.:
Only public IP Addresses will be submitted (no loopback, private address space, etc.)
URLs with only local (unqualified) host parts will not be submitted.
Domain names that are unqualified will not be submitted.
Hash-like strings (e.g 'AAAAAAAAAAAAAAAAAA') that do not appear to have enough entropy to be a hash will not be submitted.
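The screening rules above can be approximated with the standard library. This is a sketch of the idea rather than the VTLookup implementation: the function names, the accepted hash lengths and the entropy threshold of 3.0 bits per character are all assumptions chosen for illustration.

```python
import ipaddress
import math
import re
from collections import Counter

HEX_STRING = re.compile(r"^[a-fA-F0-9]+$")
HASH_LENGTHS = {32, 40, 64}  # MD5, SHA1 and SHA256 hex digest lengths


def is_public_ip(value):
    """True only for globally routable addresses (no loopback, private ranges, etc.)."""
    try:
        return ipaddress.ip_address(value).is_global
    except ValueError:
        return False


def is_qualified_domain(value):
    """True if the name has at least one dot separating labels."""
    return "." in value.strip(".")


def shannon_entropy(text):
    """Entropy in bits per character."""
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def is_plausible_hash(value, min_entropy=3.0):
    """Reject strings of the right shape that are too repetitive to be a hash."""
    if len(value) not in HASH_LENGTHS or not HEX_STRING.match(value):
        return False
    return shannon_entropy(value) >= min_entropy


print(is_public_ip("8.8.8.8"), is_public_ip("10.1.2.3"))
print(is_qualified_domain("fileserver"), is_qualified_domain("www.example.com"))
print(is_plausible_hash("A" * 32), is_plausible_hash("d41d8cd98f00b204e9800998ecf8427e"))
```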
Output Columns
Observable - The IoC observable submitted
IoCType - the IoC type
Status - the status of the submission request
ResponseCode - the VT response code
RawResponse - the entire raw json response
Resource - VT Resource
SourceIndex - The index of the Observable in the source DataFrame. You can use this to rejoin to your original data.
VerboseMsg - VT Verbose Message
ScanId - VT Scan ID if any
Permalink - VT Permanent URL describing the resource
Positives - If this is not zero, it indicates the number of malicious reports that VT holds for this observable.
MD5 - The MD5 hash, if any
SHA1 - The SHA1 hash, if any
SHA256 - The SHA256 hash, if any
ResolvedDomains - In the case of IP Addresses, this contains a list of all domains that resolve to this IP address
ResolvedIPs - In the case of domains, this contains a list of all IP addresses resolved from the domain.
DetectedUrls - Any malicious URLs associated with the observable.
To view the raw response for a specific row.
Alert command line - Occurrence on other hosts in workspace
To get a sense of whether the alert process is something that is occurring on other hosts, run this section.
This might tell you that the alerted process is actually a commonly-run process and the alert is a false positive. Alternatively, it may tell you that a real infection or attack is happening on other hosts in your environment.
Host Logons
This section retrieves the logon events on the host in the alert.
You may want to use the query times to search over a broader range than the default.
All Host Logons
Since the number of logon events may be large and, in the case of system logons, very repetitive, we use clustering to try to identify logons with unique characteristics.
In this case we use the numeric score of the account name and the logon type (i.e. interactive, service, etc.). The results of the clustered logons are shown below along with a more detailed, readable printout of the logon event information. The data here will vary depending on whether this is a Windows or Linux host.
Comparing All Logons with Clustered results relative to Alert time line
View Process Session and Logon Events in Timelines
This shows the timeline of the clustered logon events with the process tree obtained earlier. This allows you to get a sense of which logon was responsible for the process tree session and whether any additional logons (e.g. creating a process as another user) might be associated with the alert timeline.
Note you should use the pan and zoom tools to align the timelines since the data may be over different time ranges.
Failed Logons
Appendices
Available DataFrames
Saving Data to CSV
To save the contents of a pandas DataFrame to a CSV file, use the following syntax
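A minimal example using the pandas to_csv method is shown here. The DataFrame contents and the output file name are made up for illustration; in the notebook you would call to_csv on one of the DataFrames produced by earlier cells, such as processes_on_host.

```python
import pandas as pd

# Stand-in data; replace with a DataFrame produced earlier in the notebook
processes_on_host = pd.DataFrame(
    {
        "NewProcessName": ["C:\\Windows\\System32\\cmd.exe", "C:\\Windows\\System32\\net.exe"],
        "CommandLine": ["cmd.exe /c dir", "net user"],
    }
)

# index=False leaves out the row index column
processes_on_host.to_csv("processes_on_host.csv", index=False)
```

The file is written to the current working directory unless you give a full path.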
Saving Data to Excel
To save the contents of a pandas DataFrame to an Excel spreadsheet use the following syntax