Notebook Environment Setup
This notebook takes you through detailed setup of your settings for Microsoft Sentinel Notebooks and the MSTICPy library. It covers:
Setting up your Python environment for notebooks
Creating and editing your msticpyconfig.yaml file
Understanding and managing your config.json file.
If you are using notebooks in the Microsoft Sentinel/Azure ML environment you can skip the first section "Configuring your Python Environment" entirely.
Warning. Due to rendering issues in Azure Machine Learning, we strongly recommend running this notebook in Jupyter Lab or VSCode.
- Click on the notebook toolbar menu - the ≣ symbol in top left of the notebook
- Select the Editors option and choose either JupyterLab or VSCode
- When prompted to stay in AML click the Continue button.
- The notebook should open in another browser tab
The main part of this notebook involves setting up your msticpyconfig.yaml. While many of these settings are optional, if you do not configure them correctly you'll experience some loss of functionality. For example, using Threat Intelligence providers usually requires an API key. To save you having to type this in every time you look up an IP Address you should put this in a config file.
This section takes you through creating settings for
Microsoft Sentinel workspaces
Threat Intelligence providers
Geo-location providers
Other data providers (e.g. Azure APIs)
Key Vault
Auto-loading options.
You'll typically need the first three of these to use most of the notebooks fully.
Section 3, "The config.json file" can also be ignored if you are happy using msticpyconfig.yaml. It is included here for background.
Contents
- Configuring your Python Environment
- MSTICPy Configuration File - msticpyconfig.yaml
- Display your existing msticpyconfig.yaml
- Import your Config.json and create a msticpyconfig.yaml [Microsoft Sentinel]
- Setting the path to your msticpyconfig.yaml
- Verify (or add) Microsoft Sentinel Workspace settings
- Adding Threat Intel (TI) Providers
- Adding GeoIP Providers
- Optional Settings 1 - Azure Data and Microsoft Sentinel APIs
- Optional Settings 2 - Autoload QueryProviders
- Optional Settings 3 - Autoloaded Component
- Save your file and add the MSTICPYCONFIG environment variable
- Validating your msticpyconfig.yaml settings
- The config.json file
Configuring your Python Environment
Python 3.6 or Later
If you are running in a JupyterHub environment such as Azure Notebooks, Python is already installed. When using any of the sample notebooks or copies of them you only need to ensure that the Python 3.8 (or later) kernel is selected.
If you are running the notebooks locally you will need to install Python 3.8 or later. The Anaconda distribution is a good starting point since it comes with many required packages already installed.
Creating a virtual environment
If you are running these notebooks locally, it is a good idea to create a clean Python virtual environment before installing any of the packages. This will prevent installed packages conflicting with versions that you may need for other applications.
For standard python use the venv command. For Conda use the conda env command. In both cases be sure to activate the environment before running jupyter using venvpath/Scripts/activate or conda activate {my_env_name}.
Installing in a Conda Environment
Although you can use pip inside a conda environment it is usually better to try to install conda packages whenever possible.
See Managing packages in Anaconda.
For packages that are not available as conda packages, use pip from within a Conda prompt/shell to install the remaining packages.
Installing with --user option
If you are using a shared installation of Python (i.e. one installed by the administrator) you will need to add the --user option to your pip install commands, for example: pip install --user msticpy
This will avoid permission errors by installing into your user folder.
Note: the use of the --user option is usually not required in a Conda environment since the Python site packages are normally already installed in a per-user folder.
Install Packages from this Notebook
The first time this cell runs for a new Azure ML or Azure Notebooks notebook or other Python environment it will do the following things:
Check the kernel version to ensure that a Python 3.6 or later kernel is running
Check the msticpy version - if this is not installed or the version installed is less than the required version (in REQ_MSTICPY_VER) it will attempt to install a new version (you will be prompted whether you want to do this). The install can take several minutes depending on the versions of packages that you already have installed.
Once msticpy is installed and imported, the init_notebook function is run. This:
imports common modules used in the notebook
installs additional packages
sets some global options
Note: In subsequent runs, this cell should run quickly since you will already have the required packages installed.
Warning: you may see some warnings about incompatibility with certain packages. This should not affect the functionality of this notebook but you may need to upgrade the packages producing the warnings to a more recent version.
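The version checks described above can be approximated with the standard library alone. This is only a sketch and not the notebook's own setup cell: the helper names and the value used for REQ_MSTICPY_VER are illustrative assumptions, and the code reports what it finds rather than installing anything.

```python
import re
import sys
from importlib import metadata  # available in Python 3.8 and later

REQ_PYTHON_VER = (3, 6)      # minimum kernel version named in the text
REQ_MSTICPY_VER = (1, 0, 0)  # assumed value, for illustration only


def parse_version(text):
    """Turn a version string such as '1.4.2' into a tuple of ints."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def python_ok(minimum=REQ_PYTHON_VER):
    """True if the running kernel meets the minimum Python version."""
    return sys.version_info[:2] >= minimum


def msticpy_status(minimum=REQ_MSTICPY_VER):
    """Return 'missing', 'outdated' or 'ok' for the installed msticpy package."""
    try:
        installed = metadata.version("msticpy")
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if parse_version(installed) >= minimum else "outdated"


print("Python kernel OK:", python_ok())
print("msticpy:", msticpy_status())
```

A result of "missing" or "outdated" corresponds to the case where the setup cell would prompt you to install a newer version.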
MSTICPy Configuration File - msticpyconfig.yaml
MSTICPy is a Python package used in most of the Jupyter notebooks on Azure-Sentinel-Notebooks. It provides a lot of functionality specific to threat hunting and investigations, including:
Data querying against Microsoft Sentinel tables (also MDE, Splunk and others)
Threat Intelligence lookups using multiple TI providers (VirusTotal, AlienVault OTX and others)
Common enrichment functions (GeoIP, IoC extraction, WhoIs, etc.)
Visualization using event timelines, process trees and Geo-mapping
Advanced analysis such as Time Series decomposition, Anomaly detection and clustering.
Note: the configuration actions in this section are an abbreviated version of the MPSettingsEditor notebook
Use that notebook for a fuller guide on how to configure your settings.
Also, see these sections in the MSTICPy documentation:
MSTICPy Package Configuration
MSTICPy Settings Editor
config.json provides some basic configuration for connecting to your Microsoft Sentinel workspace. However, there are many features that require additional configuration information. Some examples are:
Threat Intelligence Provider connection information
GeoIP connection information
Key Vault configuration for storing secrets remotely
MDE and Azure API connection information.
Connection information for multiple Microsoft Sentinel workspaces.
Settings for these are stored in the msticpyconfig.yaml file. This file is read from the current directory or you can set an environment variable (MSTICPYCONFIG) pointing to its location. For more information about msticpy configuration see msticpy Package Configuration.
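To check which file would be picked up, the lookup described above can be sketched as follows. The function name find_config is hypothetical, and the order of precedence (environment variable first, then the current directory) is an assumption rather than something stated here.

```python
import os
from pathlib import Path

CONFIG_NAME = "msticpyconfig.yaml"


def find_config(environ=None, cwd=None):
    """Return the config path that would be used, or None if no file is found."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)

    # Assumed precedence: the MSTICPYCONFIG environment variable wins.
    env_value = environ.get("MSTICPYCONFIG")
    if env_value:
        candidate = Path(env_value).expanduser()
        if candidate.is_file():
            return candidate

    # Otherwise fall back to a file in the current directory.
    candidate = cwd / CONFIG_NAME
    return candidate if candidate.is_file() else None


print(find_config())
```

If this prints None, neither location holds a configuration file yet.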
The most commonly-used sections are described below.
Threat Intelligence Provider Setup
For more information on the msticpy Threat Intel lookup class see the documentation here.
Primary providers are used by default. Secondary providers are not run by default but can be invoked by using the providers parameter to lookup_ioc() or lookup_iocs(). Set the Primary config setting to True or False for each provider ID according to how you want to use them. The providers parameter should be a list of strings identifying the provider(s) to use.
The provider ID is given by the Provider: setting for each of the TI providers - do not alter this value.
Delete or comment out the section for any TI Providers that you do not wish to use.
For most providers you will usually need to supply an authorization (API) key and in some cases a user ID for each provider.
For the Microsoft Sentinel TI provider, you will need the workspace ID and tenant ID and will need to authenticate in order to access the data (although if you have an existing authenticated connection with the same workspace/tenant, this connection will be re-used).
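The relationship between the Primary setting and the providers parameter can be illustrated with a small sketch. This is not msticpy's implementation: the dictionary layout, the provider entries and the select_providers helper are assumptions made for illustration, and the key values are placeholders.

```python
# Illustrative settings only: layout and values are assumptions, keys are placeholders.
ti_providers = {
    "VirusTotal": {"Args": {"AuthKey": "<your-vt-key>"}, "Primary": True, "Provider": "VT"},
    "OTX": {"Args": {"AuthKey": "<your-otx-key>"}, "Primary": True, "Provider": "OTX"},
    "XForce": {
        "Args": {"ApiID": "<your-id>", "AuthKey": "<your-key>"},
        "Primary": False,
        "Provider": "XForce",
    },
}


def select_providers(settings, requested=None):
    """Return the providers to query: the requested ones, otherwise all primaries."""
    if requested:
        unknown = [name for name in requested if name not in settings]
        if unknown:
            raise KeyError(f"Unknown provider(s): {unknown}")
        return list(requested)
    return [name for name, conf in settings.items() if conf.get("Primary", False)]


print(select_providers(ti_providers))              # primaries only
print(select_providers(ti_providers, ["XForce"]))  # explicitly requested secondary
```

Passing a list of names plays the role of the providers parameter: it is the only way a secondary provider is included.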
GeoIP Providers
Like the TI providers, these services normally need an API key to access. You can read more about configuring the supported providers here: msticpy GeoIP Providers
Browshot Setup
The functionality to screenshot a URL in msticpy.sectools.domain_utils relies on a service called BrowShot (https://browshot.com/). An API key is required to use this service and it needs to be defined in the msticpyconfig file as well. As this is not a threat intelligence provider it does not fall under the TIProviders section of msticpyconfig but instead sits alone. See the cell below for example configuration.
Display your existing msticpyconfig.yaml
We'll be using some of the MSTICPy configuration tools, MpConfigEdit and MpConfigFile, so we'll import these first.
Then run MpConfigFile to view your current settings.
If you see nothing but a pair of curly braces ({}) in the settings view above, it means that you probably need to create a msticpyconfig.yaml.
If you know that you have configured a msticpyconfig file, you can search for this file using MpConfigFile. Click on Load file. Once you've done that, go to the section Setting the path to your msticpyconfig.yaml.
Import your Config.json and create a msticpyconfig.yaml [Microsoft Sentinel]
Follow these steps:
Run MpConfigFile
Locate your config.json:
- click the Load file button
- Browse - use the controls to navigate to find config.json
- Search - set the starting directory to search and open the Search drop-down
- When you see the file, click on it and click the Select File button (below the file browser)
- optionally, click View Settings to confirm that this looks right
Click Convert to convert the settings to msticpyconfig format:
- click View Settings
Save your msticpyconfig.yaml file:
- type a path into the Current file text box
- click on Save file
You can set this file to always load by assigning the path to an environment variable. See Setting the path to your msticpyconfig.yaml
Setting the path to your msticpyconfig.yaml
This is a good point to set up an environment variable so that you can keep a single configuration file in a known location and always load the same settings. (Of course, you're free to use multiple configs if you need to use different settings for each notebook folder)
decide on a location for your msticpyconfig.yaml - this could be in "~/.msticpyconfig.yaml" or "%userprofile%/msticpyconfig.yaml"
copy the msticpyconfig.yaml file that you just created to this location.
set the MSTICPYCONFIG environment variable to point to that location:
Windows

Linux
In your .bashrc (or somewhere else convenient) add:
export MSTICPYCONFIG=~/.msticpyconfig.yaml
Azure ML
In Azure ML, you need to decide whether to store your msticpyconfig.yaml in the AML file store or on the Compute file system. If you have any secret key material in the file, we recommend storing on the Compute instance, since the AML file store is shared storage, whereas the Compute instance is accessible only by the user who created it.
If you are happy to leave the file in the AML file store, you should be set. The init_notebook function run at the start of the notebook will find it there in your root folder and set the MSTICPYCONFIG environment variable to point to it.
Pointing to a path on a compute instance
Open a terminal in AML

Verify your msticpyconfig.yaml is accessible
Your current directory should be your AML file store home directory (this is mounted in the Compute Linux system) and the prompt will look something like the example below.
If you created a msticpyconfig.yaml in the previous step, this should be visible if you type ls.
Move the file to your home folder
Add an environment variable
Because the Jupyter server is started before you connect, its process will not inherit any environment variables from your .bashrc. You can set it in one of two places:
The kernel.json file for your Python kernel (there are kernels for both Python 3.6 and Python 3.8)
Add a Python file nbuser_settings.py to the root of your user folder.
These options are described in the following sections.
kernel.json
Python 3.8 location: /usr/local/share/jupyter/kernels/python38-azureml/kernel.json
Python 3.6 location: /usr/local/share/jupyter/kernels/python3-azureml/kernel.json
Make a copy of the file and open the original in an editor (you may need to use sudo to be able to overwrite this file). The file will look something like this:
Add the following line after the "language" item.
Your file should look like this (remember to add a comma at the end of the "language": "python" line):
If you use both kernels you will need to edit both files.
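The same edit can be scripted. The following is a sketch under stated assumptions: it treats the line to add as an "env" entry (the standard kernel spec field for environment variables passed to the kernel), the helper name add_kernel_env is hypothetical, and the config path is a placeholder. It is demonstrated on a temporary file rather than the real kernel.json.

```python
import json
import shutil
import tempfile
from pathlib import Path


def add_kernel_env(kernel_json, name, value):
    """Add an environment variable to a kernel spec, keeping a .bak copy of the original."""
    kernel_json = Path(kernel_json)
    shutil.copy2(kernel_json, kernel_json.with_name(kernel_json.name + ".bak"))

    spec = json.loads(kernel_json.read_text(encoding="utf-8"))
    spec.setdefault("env", {})[name] = value  # keeps any existing env entries
    kernel_json.write_text(json.dumps(spec, indent=1) + "\n", encoding="utf-8")
    return spec


# Demonstration on a throwaway file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "kernel.json"
    demo.write_text(json.dumps({
        "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": "Python 3.8 - AzureML",
        "language": "python",
    }), encoding="utf-8")
    # Hypothetical path: adjust to wherever you moved msticpyconfig.yaml.
    updated = add_kernel_env(demo, "MSTICPYCONFIG", "/home/azureuser/msticpyconfig.yaml")
    print(updated["env"])
```

Writing the file through the json module avoids the missing-comma mistake that is easy to make when editing by hand. Running it against the real kernel.json may require elevated permissions.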
nbuser_settings.py
Create this file (you can do this from the AML workspace) in the root of your user folder (i.e. inside the folder with your username) and add the following lines
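A minimal sketch of what such a file might contain, assuming that you moved msticpyconfig.yaml to your home folder on the Compute instance (adjust the path if you stored it elsewhere):

```python
import os
from pathlib import Path

# Assumed location: msticpyconfig.yaml in the home folder of the Compute instance user.
os.environ["MSTICPYCONFIG"] = str(Path.home() / "msticpyconfig.yaml")
```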
This file, if it exists, is imported by the nb_check.check_versions function at the start of the notebook. It will set the environment variable at the start of each notebook before any configuration is read. This is simpler and less intrusive than editing the kernel.json. However, it only works if you run check_versions. If you load a notebook without running this function, MSTICPy may not be able to find its configuration file.
Verify (or add) Microsoft Sentinel Workspace settings
If you loaded a config.json file into your msticpyconfig.yaml, you should see your workspace displayed when you run the following cell. If not, you can add one or more workspaces here. The Name, WorkspaceId and TenantId are mandatory. The other fields are helpful but not essential.
Use the Help drop-down panel to find more information about adding workspaces and finding the correct values for your workspace.
If this is the workspace that you use frequently or all of the time, you may want to set this as the default. This creates a duplicate entry named "Default". This is used when you connect to AzureSentinel without needing to supply a workspace name. You can override this by specifying a workspace name at connect time, which you need to do if you are working with multiple workspaces.
When you've finished, type a file name (usually "msticpyconfig.yaml") into the Conf File text box and click Save File.
You can also try the Validate Settings button. This should show that you have a few missing sections (we'll fill these in later) but should show nothing under the "Type Validation Results".
Adding Threat Intel (TI) Providers
You will likely want to do lookups of IP Addresses, URLs and other items to check for any Threat Intelligence reports. To do that you need to add the providers that you want to use. Most TI providers require that you have an account with them and supply an API key or other authentication items when you connect.
Most providers have a free use tier (or, in cases like AlienVault OTX, are entirely free). Free tiers for paid providers usually limit the number of requests that you can make in a given time period.
Each provider handles account creation slightly differently. Use the help links in the editor help to find where to go to set each of these up.
Assuming that you have done this, we can configure a provider. Be sure to store any authentication keys somewhere safe (and memorable).
We are going to use VirusTotal (VT) as an example TI Provider. For this you will need a VirusTotal API key from the VirusTotal website.
We also support a range of other threat intelligence providers - you can read about these here: MSTICPy TIProviders
Taking VirusTotal as our example.
Click on the TI Providers tab
Select "VirusTotal" from the New prov drop-down list
Click Add
This should show you the values that you need to provide:
a single item AuthKey (this is usually referred to as an "API Key")
You can paste the key into the Value field and click the Save button.
You can opt to store the VT AuthKey as an environment variable. This is a bit more secure than having it lying around in configuration files.
Assuming that you have set your VT key as an environment variable, flip the Storage radio button to EnvironmentVar and type the name of the variable (VT_KEY in our example) into the value box.
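The effect of the EnvironmentVar storage option can be sketched as below: the configuration holds only the name of the variable, and the key is read from the environment when it is needed. The resolve_setting helper and the dictionary shape are illustrative assumptions, not msticpy code, and the key shown is a dummy value.

```python
import os


def resolve_setting(value, environ=None):
    """Return a secret stored either as plain text or as an EnvironmentVar reference."""
    environ = os.environ if environ is None else environ
    if isinstance(value, str):
        return value  # stored directly in the config file
    if isinstance(value, dict) and "EnvironmentVar" in value:
        var_name = value["EnvironmentVar"]
        if var_name not in environ:
            raise KeyError(f"Environment variable {var_name} is not set")
        return environ[var_name]
    raise ValueError(f"Unsupported setting format: {value!r}")


auth_key_setting = {"EnvironmentVar": "VT_KEY"}  # what the config would hold
fake_env = {"VT_KEY": "not-a-real-key"}          # stand-in for os.environ
print(resolve_setting(auth_key_setting, environ=fake_env))
```

With this arrangement the configuration file can be shared or checked in without exposing the key itself.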
You can also use Azure Key Vault to store secrets like these but we will need to set up the Key Vault settings before this will work.
Click the Save File button to save your changes.
Adding GeoIP Providers
MSTICPy supports two GeoIP providers - MaxMind GeoIPLite and IPStack. The main difference between the two is that MaxMind downloads and uses a local database, while IPStack is a purely online solution.
Both require an API key: either to download the free database from MaxMind or to access the IPStack online lookup.
We'll use GeoIPLite as our example. You can sign up for a free account and API key at https://www.maxmind.com/en/geolite2/signup. You'll need the API key for the following steps.
Select "GeoIPLite" from the New Prov drop-down list
Click Add
Paste your MaxMind key into the Value field
Set the MaxMind data folder:
This defaults to "~/.msticpy"
On Windows this translates to the folder %USERPROFILE%/.msticpy.
On Linux/Mac this translates to the folder .msticpy in your home folder.
This is where the downloaded GeoIP database will be stored.
Choose another folder name and location if you prefer.
Note: as with the TI providers you can opt to store your key as an environment variable or keep it in Key Vault.
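To see where the default data folder resolves on your own machine, a short standard-library sketch is enough. The geoip_db_folder helper is hypothetical and only mirrors the "~" translation described above.

```python
import os
from pathlib import Path


def geoip_db_folder(setting="~/.msticpy", create=False):
    """Expand a configured data folder setting into an absolute path."""
    folder = Path(os.path.expandvars(setting)).expanduser()
    if create:
        folder.mkdir(parents=True, exist_ok=True)
    return folder


print(geoip_db_folder())
```

Passing create=True makes the folder if it does not exist yet, which can be handy before the first database download.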
Important Security Note
You might not be too comfortable leaving API keys stored in text files. You can opt to have these settings stored either:
as Environment Variables
in Azure Key Vault
To see how to do this see these resources
Optional Settings 1 - Azure Data and Microsoft Sentinel APIs
Azure API and Microsoft Sentinel API
To access Azure APIs (such as the Sentinel APIs or Azure resource APIs) you need to be able to use Azure Authentication. The setting is named "AzureCLI" for historical reasons - don't let that confuse you. We currently support two ways of authenticating:
Chained authentication (recommended)
With a client app ID and secret
The former can try up to four methods of authentication:
Using creds set in environment variables
Using creds available in an AzureCLI logon
Using the Managed Service Identity (MSI) credentials of the machine you are running the notebook kernel on
Interactive browser logon
To use chained authentication methods, select the methods that you want to use and leave the clientId/tenantId/clientSecret fields empty.
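Chained authentication amounts to trying each selected method in turn until one succeeds. The sketch below shows only that control flow with stand-in callables. It performs no real Azure authentication, and the method names and helper functions are assumptions chosen for illustration.

```python
def chained_authenticate(methods, authenticators):
    """Try each named method in order; return (name, credential) for the first success."""
    errors = {}
    for name in methods:
        try:
            return name, authenticators[name]()
        except Exception as err:  # a failed method simply moves on to the next one
            errors[name] = err
    raise RuntimeError(f"All authentication methods failed: {errors}")


def _unavailable():
    raise RuntimeError("credentials not available")


# Stand-ins: real callables would wrap environment variable, Azure CLI,
# MSI and interactive browser logons.
demo_authenticators = {
    "env": _unavailable,
    "cli": lambda: "cli-credential",
    "msi": _unavailable,
    "interactive": lambda: "browser-credential",
}

used, credential = chained_authenticate(
    ["env", "cli", "msi", "interactive"], demo_authenticators
)
print(used, credential)
```

Because the order matters, selecting fewer methods (or only the ones you know are available) shortens the time spent on attempts that are bound to fail.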
Optional Settings 2 - Autoload QueryProviders
This section controls which, if any, query providers you want to load automatically when you run nbinit.init_notebook.
This can save a lot of time if you are frequently authoring new notebooks. It also allows the right providers to be loaded before other components that might use them, such as:
Pivot functions
Notebooklets (more about these in the next section)
There are two types of provider support:
Microsoft Sentinel - here you specify both the provider name and the workspace name that you want to connect to.
Other providers - for other query providers, just specify the name of the provider.
Available Microsoft Sentinel workspaces are taken from the items you configured in the Microsoft Sentinel tab. Other providers are taken from the list of available provider types in MSTICPy.
There are two options for each of these:
connect - if this is True (checked) MSTICPy will try to authenticate to the provider backend immediately after loading. This assumes that you've configured credentials for the provider in your settings. Note: if this is not set it defaults to True.
alias - when MSTICPy loads a provider it assigns it to a Python variable name. By default this is "qry_workspace_name" for Microsoft Sentinel providers and "qry_provider_name" for other providers. If you want to use something a bit shorter and easier to type/remember you can add an alias. The variable name created will be "qry_alias".
Note: if you lose track of which providers have been loaded by this mechanism, they are added to the current_providers attribute of msticpy.
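The naming and connect rules above can be sketched as a small planning function. The settings layout, the workspace and provider names and the plan_autoload helper are assumptions for illustration. The sketch also leaves names unchanged, whereas the real loader may normalize them.

```python
# Illustrative autoload settings: names and layout are assumptions.
autoload_settings = {
    "AzureSentinel": {
        "CyberSoc": {"alias": "soc", "connect": False},
        "Lab": {},
    },
    "Splunk": {"connect": False},
    "LocalData": {"alias": "local"},
}


def plan_autoload(settings):
    """Return (variable_name, connect) for each provider that would be loaded."""
    plan = []
    for provider, conf in settings.items():
        if provider == "AzureSentinel":
            entries = list(conf.items())  # one entry per workspace
        else:
            entries = [(provider, conf)]
        for name, options in entries:
            options = options or {}
            variable = f"qry_{options.get('alias') or name}"
            plan.append((variable, options.get("connect", True)))  # connect defaults to True
    return plan


for variable, connect in plan_autoload(autoload_settings):
    print(variable, "connect" if connect else "load only")
```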
Optional Settings 3 - Autoloaded Component
This section controls which, if any, other components you want to load automatically when you run nbinit.init_notebook().
This includes
TILookup - the Threat Intel provider library
GeoIP - the GeoIP provider that you want to use
AzureData - the module used to query details about Azure resources
AzureSentinelAPI - the module used to query the Microsoft Sentinel API
Notebooklets - loads notebooklets from the msticnb package
Pivot - pivot functions
These are loaded in this order, since the Pivot component needs query and other providers loaded in order to find the pivot functions that it will attach to entities. For more information see pivot functions
Some components do not require any parameters (e.g. TILookup and Pivot). Others do support or require additional settings:
GeoIpLookup
You must type the name of the GeoIP provider that you want to use - either "GeoLiteLookup" or "IPStack"
AzureData and AzureSentinelAPI
auth_methods - override the default settings for AzureCLI and connect using the selected methods
connect - set to false to load but not connect
Notebooklets
This has a single parameter block AzureSentinel. At minimum you should specify the workspace name. This needs to be in the following format:
WORKSPACENAME must be one of the workspaces defined in the Microsoft Sentinel tab.
You can also add additional parameters to send to the notebooklets init function. Specify these as additional key:value pairs, separated by newlines.
See the msticnb init documentation for more details
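The newline-separated key:value layout can be illustrated with a small parser. This is a sketch only: the workspace key, the extra_option entry and the parse_param_block helper are assumptions for illustration and are not taken from the msticnb documentation.

```python
def parse_param_block(text):
    """Parse newline-separated key:value pairs into a dictionary."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"Expected key:value, got {line!r}")
        params[key.strip()] = value.strip()
    return params


block = """
workspace:MyWorkspace
extra_option:value
"""
print(parse_param_block(block))
```

Splitting on the first colon only means that values which themselves contain a colon are kept intact.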
Save your file and add the MSTICPYCONFIG environment variable
Save your file and, if you haven't yet done so, create an environment variable to point to it. See Setting the path to your msticpyconfig.yaml
Validating your msticpyconfig.yaml settings
MpConfigFile includes a validation function that can help you diagnose setup problems.
You can run this interactively or from Python.
The examples below assume that you have set MSTICPYCONFIG to point to your config file. If not, you will need to use the load_from_file() function (or Load File button) to load the file before validating.
To validate interactively:
------
If you need to create or modify your config.json you can run the following cell.
You will need the subscription and workspace IDs for your Microsoft Sentinel Workspace. These can be found here in the Microsoft Sentinel portal as shown below.

Copy the subscription and workspace IDs:
