GitHub Repository: Azure/Azure-Sentinel-Notebooks
Path: blob/master/tutorials-and-examples/feature-tutorials/DataViewer.ipynb
Kernel: Python (condadev)

Data Viewer

This notebook demonstrates the use of the DataViewer control.

It provides some basic features that let you browse pandas DataFrames more easily:

  • Scrollable data viewer that takes up a fixed amount of output cell space

  • Sorting data by column

  • Column selection

  • Data filtering

Read in some data to demonstrate

from msticpy.nbtools.data_viewer import DataViewer
import pandas as pd

data = pd.read_csv(
    "./data/processes_on_host.csv",
    index_col=0,
    parse_dates=["TimeGenerated"],
    infer_datetime_format=True,
)

Use the DataViewer to display a DataFrame

DataViewer(data)
Accordion(children=(VBox(children=(VBox(children=(Text(value='', description='Filter:', style=DescriptionStyle…

Specify an initial set of columns

columns = [
    "Account",
    "EventID",
    "TimeGenerated",
    "Computer",
    "NewProcessName",
    "CommandLine",
    "ParentProcessName",
]
DataViewer(data, selected_cols=columns)
Accordion(children=(VBox(children=(VBox(children=(Text(value='', description='Filter:', style=DescriptionStyle…
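
If the column list is assembled by hand, it can help to check it against the DataFrame before passing it in. The following is a minimal sketch in plain pandas; `pick_columns` is a hypothetical helper (not part of msticpy), and the small DataFrame is a made-up stand-in for the sample data.

```python
from typing import List

import pandas as pd


def pick_columns(df: pd.DataFrame, wanted: List[str]) -> List[str]:
    """Return the entries of `wanted` that exist in `df`, preserving order."""
    # Hypothetical helper for illustration; not part of msticpy.
    return [col for col in wanted if col in df.columns]


# Tiny made-up stand-in for the process data used in this notebook.
sample = pd.DataFrame(
    {
        "Account": ["MSTICAdmin"],
        "EventID": [4688],
        "CommandLine": ["cmd /c echo hello"],
    }
)

columns_to_show = pick_columns(sample, ["Account", "CommandLine", "NoSuchColumn"])
print(columns_to_show)
```

The resulting list could then be passed as `selected_cols`, in the same way as `columns` in the cell above.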

Use "Choose columns" to select which columns to display

The right-side list contains the available columns in the DataFrame; the left-side list contains the columns to display.

Use the Add/Remove buttons to add or remove columns from the selected set. You can select multiple columns using Ctrl+Click or Shift+Click (the former selects or deselects an item for each click, the latter selects a range of items between the last item selected and the currently-clicked item).

Click on Apply columns to update the data view.

viewer = DataViewer(data, selected_cols=columns)

# We're opening the "Choose columns" drop-down programmatically
# Just click on the small arrow to the left of "Choose columns" to open this
viewer.accordion.selected_index = 0
viewer
Accordion(children=(VBox(children=(VBox(children=(Text(value='', description='Filter:', style=DescriptionStyle…

Filtering the data

You can apply multiple filters. Filters are cumulative: each one is logically ANDed with the others, so a row is displayed only if it matches every filter.

The "Filter data" drop-down shows the following controls:

Filter expression editor

  • Column selector drop-down - which column you want the filter to apply to

  • Not checkbox - invert the logic of the filter (for this filter item only)

  • Operator drop-down - the available operators differ between string columns and non-string (numeric and date) columns

  • Expression text box - type in the expression that you want to match

  • Add filter - adds the current filter items as a new filter expression to Current filters

  • Update filter - overwrites the selected filter in Current filters with the current filter expression

Current filters

  • Select the filter expression you want to operate on from the Filters list

  • Delete filter deletes the selected item

  • Clear all filters removes all filter expressions

  • Apply filter - applies the filter items to the data and updates the display

viewer = DataViewer(data, selected_cols=columns)

# manually add a filter
sample_filter = {
    "ParentProcessName contains 'cmd'": ("ParentProcessName", False, "contains", "cmd"),
    "CommandLine contains 'script'": ("CommandLine", False, "contains", "script"),
}
viewer.import_filters(sample_filter)

# We're opening the "Filter data" drop-down programmatically
# Just click on the small arrow to the left of "Filter data" to open this
viewer.accordion.selected_index = 1
viewer
Accordion(children=(VBox(children=(VBox(children=(Text(value='', description='Filter:', style=DescriptionStyle…
viewer.filters
{"ParentProcessName contains 'cmd'": FilterExpr(column='ParentProcessName', inv=False, operator='contains', expr='cmd'),
 "CommandLine contains 'script'": FilterExpr(column='CommandLine', inv=False, operator='contains', expr='script')}
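
To make the AND and Not semantics concrete, here is a rough plain-pandas equivalent of `contains` filters like the two above. This is only a sketch of the logic described in this section, not msticpy's implementation: `apply_contains_filters` and the sample rows are made up for illustration, and matching here is a simple case-sensitive substring test, which may differ from how the viewer matches.

```python
import pandas as pd


def apply_contains_filters(df, filters):
    """Apply (column, inv, operator, expr) tuples, ANDing the results.

    Illustrative only: handles just the "contains" operator.
    """
    mask = pd.Series(True, index=df.index)
    for column, inv, operator, expr in filters.values():
        if operator != "contains":
            raise ValueError(f"unsupported operator in this sketch: {operator}")
        # regex=False: plain substring match; na=False: missing values never match.
        hit = df[column].str.contains(expr, regex=False, na=False)
        # The "Not" flag inverts this one filter only, before it is ANDed in.
        mask &= ~hit if inv else hit
    return df[mask]


# Made-up rows standing in for the process data.
sample = pd.DataFrame(
    {
        "ParentProcessName": [
            r"C:\Windows\System32\cmd.exe",
            r"C:\Windows\explorer.exe",
            r"C:\Windows\System32\cmd.exe",
        ],
        "CommandLine": ["cscript.exe test.vbs", "cscript.exe test.vbs", "whoami"],
    }
)

both = {
    "ParentProcessName contains 'cmd'": ("ParentProcessName", False, "contains", "cmd"),
    "CommandLine contains 'script'": ("CommandLine", False, "contains", "script"),
}
filtered = apply_contains_filters(sample, both)

# Same column test with the Not flag set: keeps rows that do NOT contain "script".
inverted = apply_contains_filters(
    sample, {"CommandLine not contains 'script'": ("CommandLine", True, "contains", "script")}
)
print(filtered)
print(inverted)
```

Only rows that satisfy every filter survive, which is why adding filters can only narrow the result.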

Advanced querying with filter query operator

The query operator lets you type in a pandas query expression.

Note: the selected column is not relevant for this operator, since you specify the column names
within the query expression itself. You can select any column name.

See the pandas documentation for DataFrame.query for the syntax of query expressions.

viewer = DataViewer(data, selected_cols=columns)

sample_q_filter = {
    "EventID query 'ParentProcessName.str.contains('cmd') and (CommandLine.str.contains('cacls') or CommandLine.str.contains('script'))'": (
        "EventID",
        False,
        "query",
        "ParentProcessName.str.contains('cmd') and (CommandLine.str.contains('cacls') or CommandLine.str.contains('script'))",
    )
}
viewer.import_filters(sample_q_filter)

# We're opening the "Filter data" drop-down programmatically
# Just click on the small arrow to the left of "Filter data" to open this
viewer.accordion.selected_index = 1
viewer
Accordion(children=(VBox(children=(VBox(children=(Text(value='', description='Filter:', style=DescriptionStyle…
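
The expression used with the query operator is ordinary pandas query syntax, so it can be tried out directly with `DataFrame.query` before typing it into the viewer. The sketch below uses a small made-up DataFrame; `engine="python"` is passed as a precaution, on the assumption that string methods such as `.str.contains` may not work with the numexpr engine in some pandas versions.

```python
import pandas as pd

# Made-up rows standing in for the process data.
sample = pd.DataFrame(
    {
        "ParentProcessName": [
            r"C:\Windows\System32\cmd.exe",
            r"C:\Windows\System32\cmd.exe",
            r"C:\Windows\explorer.exe",
        ],
        "CommandLine": ["cacls.exe c:\\temp /e", "whoami", "cscript.exe test.vbs"],
    }
)

# Same shape of expression as the query filter in this section.
expr = (
    "ParentProcessName.str.contains('cmd') and "
    "(CommandLine.str.contains('cacls') or CommandLine.str.contains('script'))"
)

matches = sample.query(expr, engine="python")
print(matches)
```

Only the first row matches: the second fails the command-line test, and the third fails the parent-process test.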

Accessing the filtered data

Use the filtered_data property of the DataViewer to retrieve a DataFrame corresponding to the current column and row filtering.

Note: any column sorting applied in the viewer is not reflected in this data.

viewer.filtered_data
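
Because `filtered_data` is a regular DataFrame, any ordering you need can be reapplied with standard pandas calls. A minimal sketch, using a made-up DataFrame as a stand-in for `viewer.filtered_data`:

```python
import pandas as pd

# Stand-in for viewer.filtered_data (made-up rows for illustration).
filtered = pd.DataFrame(
    {
        "TimeGenerated": pd.to_datetime(
            ["2019-01-15 05:15:18", "2019-01-15 05:15:16", "2019-01-15 05:15:17"]
        ),
        "NewProcessName": ["cacls.exe", "rundll32.exe", "cmd.exe"],
    }
)

# Reapply the ordering explicitly, newest event first.
ordered = filtered.sort_values("TimeGenerated", ascending=False)
print(ordered["NewProcessName"].tolist())
```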

DataViewer Help

help(DataViewer)
Help on class DataViewer in module msticpy.nbtools.data_viewer:

class DataViewer(builtins.object)
 |  DataViewer(data: pandas.core.frame.DataFrame, selected_cols: List[str] = None, debug=False)
 |
 |  Data viewer class.
 |
 |  Methods defined here:
 |
 |  __init__(self, data: pandas.core.frame.DataFrame, selected_cols: List[str] = None, debug=False)
 |      Initialize the DataViewer class.
 |
 |      Parameters
 |      ----------
 |      data : pd.DataFrame
 |          The DataFrame to view
 |      selected_cols : List[str], optional
 |          Initial subset of columns to show, by default None (all cols)
 |      debug : bool
 |          Output additional debugging info to std out.
 |
 |  display(self)
 |      Display the widget.
 |
 |  import_filters(self, filters: Dict[str, msticpy.nbtools.data_viewer.FilterExpr])
 |      Import filter set replacing current filters.
 |
 |      Parameters
 |      ----------
 |      filters : Dict[str, FilterExpr]
 |          dict of filter name, FilterExpr
 |          FilterExpr is a tuple of:
 |          column [str], inv [bool], operator [str], expr [str]
 |
 |  show(self)
 |      Display the data table control.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  filtered_data
 |      Return filtered dataframe.
 |
 |  filters
 |      Return current filters as a dict.
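
The help text describes each filter as a tuple of column, inv, operator and expr, keyed by a filter name. The following sketch builds a dictionary of that shape from a list of tuples. Both `make_filter_dict` and the local `FilterTuple` type are hypothetical stand-ins defined here for illustration (the real `FilterExpr` lives in `msticpy.nbtools.data_viewer`), and the key format simply mimics the names used earlier in this notebook.

```python
from typing import Dict, List, NamedTuple


class FilterTuple(NamedTuple):
    """Local stand-in mirroring the four fields listed in the help text."""

    column: str
    inv: bool
    operator: str
    expr: str


def make_filter_dict(items: List[FilterTuple]) -> Dict[str, FilterTuple]:
    """Key each filter by a readable name such as "CommandLine contains 'script'"."""
    named = {}
    for item in items:
        # Mark inverted filters in the name so the two cases stay distinguishable.
        prefix = "not " if item.inv else ""
        named[f"{item.column} {prefix}{item.operator} '{item.expr}'"] = item
    return named


filters = make_filter_dict(
    [
        FilterTuple("ParentProcessName", False, "contains", "cmd"),
        FilterTuple("CommandLine", True, "contains", "script"),
    ]
)
print(list(filters))
```

Since a `NamedTuple` is still a tuple, a dictionary built this way has the same shape as the plain-tuple dictionaries passed to `import_filters` in the earlier cells.
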
import tabulate

print(tabulate.tabulate(viewer.filtered_data, tablefmt="rst", showindex=False, headers="keys"))
==========================  =========  ==========================  ===============  ===================================  =======================================================================================================================================================================================================================================  ===========================
Account                       EventID  TimeGenerated               Computer         NewProcessName                       CommandLine  ParentProcessName
==========================  =========  ==========================  ===============  ===================================  =======================================================================================================================================================================================================================================  ===========================
MSTICAlertsWin1\MSTICAdmin       4688  2019-01-15 05:15:16.663000  MSTICAlertsWin1  C:\Diagnostics\UserTmp\rundll32.exe  .\rundll32.exe /C mshtml,RunHTMLApplication javascript:alert(tada!)  C:\Windows\System32\cmd.exe
MSTICAlertsWin1\MSTICAdmin       4688  2019-01-15 05:15:16.020000  MSTICAlertsWin1  C:\Diagnostics\UserTmp\cmd.exe       cmd /c C:\Windows\System32\mshta.exe vbscript:CreateObject("Wscript.Shell").Run(".\powershell.exe -c ""$x=$((gp HKLM:Software\Microsoft\Windows\CurrentVersion Certificate).Certificate);.\powershell -E $y""",0,True)(window.close)  C:\Windows\System32\cmd.exe
MSTICAlertsWin1\MSTICAdmin       4688  2019-01-15 05:15:18.080000  MSTICAlertsWin1  C:\Diagnostics\UserTmp\wuauclt.exe   .\wuauclt.exe /C "c:\windows\softwaredistribution\cscript.exe"  C:\Windows\System32\cmd.exe
MSTICAlertsWin1\MSTICAdmin       4688  2019-01-15 05:15:18.287000  MSTICAlertsWin1  C:\Diagnostics\UserTmp\lsass.exe     .\lsass.exe /C "c:\windows\softwaredistribution\cscript.exe"  C:\Windows\System32\cmd.exe
MSTICAlertsWin1\MSTICAdmin       4688  2019-01-15 05:15:18.337000  MSTICAlertsWin1  C:\Diagnostics\UserTmp\cmd.exe       cmd /c "powershell wscript.shell used to download a .gif"  C:\Windows\System32\cmd.exe
MSTICAlertsWin1\MSTICAdmin       4688  2019-01-15 05:15:18.403000  MSTICAlertsWin1  C:\Diagnostics\UserTmp\cacls.exe     cacls.exe c:\windows\system32\wscript.exe /e /t /g everyone:f  C:\Windows\System32\cmd.exe
MSTICAlertsWin1\MSTICAdmin       4688  2019-01-15 05:15:18.820000  MSTICAlertsWin1  C:\Diagnostics\UserTmp\cmd.exe       cmd /c echo /e:vbscript.encode /b  C:\Windows\System32\cmd.exe
==========================  =========  ==========================  ===============  ===================================  =======================================================================================================================================================================================================================================  ===========================