Title: msticpy - VirusTotal Lookup
Disclaimer and Acknowledgements:
The code in this module is offered as a convenience wrapper for the VirusTotal API based on the public documentation. The code does not originate from VirusTotal, nor is it endorsed by them. I'd like to thank them for:
Wonderfully clear documentation and examples
Granting me extra querying capacity for my account for testing
You must have msticpy installed to run this notebook:
New Features
This is quite an old notebook, and some later developments largely supersede this component.
VirusTotal queries have been integrated into the core TILookup functionality in MSTICPy. See the TIProviders notebook and the TIProviders documentation for more details.
VirusTotal V3 API - VT has released a new version of its API, which allows graph traversal to get information about how malware and actors are linked. See the VT3Lookup notebook for more details.
Introduction
This class allows you to submit Indicators of Compromise (IoC) to VirusTotal and receive and process the content of the response. You can submit a single item or a set of items in a column of a pandas DataFrame.
VirusTotal supports the following IoC Types:
FileHash
URL
IP Address (v4)
DNS Domain
The first two of these result in full reports of malicious content from scans. The IP Address and DNS items provide secondary lookup - e.g. if IP Address 111.222.3.5 or www.evil.net is linked to a positive (malicious) report for a URL, the latter report will be returned in the results. VT does not report directly on the reputation of IP addresses or DNS domains.
Virus Total Lookup
To use this module you need an API key from VirusTotal, which you can obtain here: https://www.virustotal.com/.
Note that VT throttles requests for free API keys to 4/minute. If you are unable to process the entire data set, try splitting it and submitting smaller chunks.
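The chunking advice above can be sketched as follows. This is a minimal illustration, not part of msticpy: `iter_chunks`, `submit_in_chunks` and the `submit` callback are hypothetical names, and the chunk size of 4 with a 60-second pause simply mirrors the free-key limit quoted above.

```python
import time
from typing import Callable, Iterator, List, Sequence


def iter_chunks(items: Sequence[str], chunk_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most chunk_size items."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def submit_in_chunks(
    items: Sequence[str],
    submit: Callable[[Sequence[str]], List],
    chunk_size: int = 4,        # assumption: matches the 4 requests/minute limit
    pause_secs: float = 60.0,   # assumption: wait a full minute between chunks
    sleep: Callable[[float], None] = time.sleep,
) -> List:
    """Call submit() once per chunk, pausing between chunks."""
    results: List = []
    for index, chunk in enumerate(iter_chunks(items, chunk_size)):
        if index > 0:
            sleep(pause_secs)  # no pause is needed before the first chunk
        results.extend(submit(chunk))
    return results
```

Here `submit` stands in for whatever function performs the actual lookup for one chunk. The `sleep` parameter is only there so that the pause can be replaced in tests.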
Things to note:
Virus Total lookups include file hashes, domains, IP addresses and URLs.
The returned data is slightly different depending on the input type
The VTLookup class tries to screen input data to prevent pointless lookups. E.g.:
Only public IP Addresses will be submitted (no loopback, private address space, etc.)
URLs with only local (unqualified) host parts will not be submitted.
Domain names that are unqualified will not be submitted.
Hash-like strings (e.g. 'AAAAAAAAAAAAAAAAAA') that do not appear to have enough entropy to be a hash will not be submitted.
If submitted in a batch (i.e. using a DataFrame as input), duplicate IoCs are not submitted. Duplicates will be given the results from the original lookup.
You will need a VirusTotal API key
You will get more detailed results if you have a private API key but you can get a lot of good information using the public API and a free API key. You are however limited in the number of requests you can make.
DataFrame output can be cleaner than a dict.
Note that re-using the same class instance for multiple lookups accumulates the results in the class results DataFrame.
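To make the screening rules above more concrete, here is a rough sketch of the kind of checks involved. It is not the VTLookup implementation: the function names are hypothetical, and the entropy threshold of 3.0 bits per character is an assumption chosen only for illustration.

```python
import ipaddress
import math
from collections import Counter


def is_public_ip(value: str) -> bool:
    """True only for syntactically valid, globally routable IP addresses."""
    try:
        return ipaddress.ip_address(value).is_global
    except ValueError:
        return False


def is_qualified_domain(value: str) -> bool:
    """True if the name has at least two dot-separated parts."""
    return "." in value.strip(".")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def has_hash_like_entropy(value: str, min_entropy: float = 3.0) -> bool:
    """Assumed heuristic: reject strings with too little character variety."""
    return shannon_entropy(value) >= min_entropy
```

A string of one repeated character has zero entropy, so it fails the last check, while a real hex digest normally scores close to the 4-bit maximum for 16 possible characters.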
Interpreting the Output
Columns in the output dataframe are as follows:
Observable - The IoC observable submitted
IoCType - the IoC type
Status - the status of the submission request
ResponseCode - the VT response code
RawResponse - the entire raw json response
Resource - VT Resource
SourceIndex - The index of the Observable in the source DataFrame. You can use this to rejoin to your original data.
VerboseMsg - VT Verbose Message
ScanId - VT Scan ID if any
Permalink - VT Permanent URL describing the resource
Positives - If this is not zero, it indicates the number of malicious reports that VT holds for this observable.
MD5 - The MD5 hash, if any
SHA1 - The SHA1 hash, if any
SHA256 - The SHA256 hash, if any
ResolvedDomains - In the case of IP Addresses, this contains a list of all domains that resolve to this IP address
ResolvedIPs - In the case of Domains, this contains a list of all IP addresses resolved from the domain.
DetectedUrls - Any malicious URLs associated with the observable.
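As an illustration of how the SourceIndex and Positives columns might be used together, the sketch below filters a results frame to rows with positive reports and joins them back to the source data. Both DataFrames and all their values are invented for the example; only the column names come from the list above.

```python
import pandas as pd

# Invented source data: the index is what SourceIndex refers back to.
source_df = pd.DataFrame(
    {
        "Host": ["web01", "web02", "db01"],
        "IoC": ["http://good.example.com/", "http://bad.example.net/x", "198.51.100.7"],
    }
)

# Invented results, using a subset of the columns described above.
results_df = pd.DataFrame(
    {
        "Observable": ["http://good.example.com/", "http://bad.example.net/x", "198.51.100.7"],
        "SourceIndex": [0, 1, 2],
        "Positives": [0, 3, 0],
    }
)

# Keep only observables with at least one malicious report.
suspicious = results_df[results_df["Positives"] > 0]

# Rejoin to the original rows via SourceIndex.
joined = suspicious.merge(source_df, left_on="SourceIndex", right_index=True, how="left")
```

The `joined` frame then carries the original context (here the hypothetical `Host` column) alongside the lookup result.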
IoC Types Available
There are 4 basic IoC types used by VirusTotal. Hashes of all types (including SHA256 Authenticode) are covered by the 'file' type.
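The idea that every hash flavour collapses to the single 'file' type can be sketched like this. The helper names are hypothetical and this is not how VTLookup itself is written; the only facts relied on are the hex digest lengths of MD5 (32), SHA1 (40) and SHA256 (64).

```python
import string
from typing import Optional

# Hex digest length for each common hash algorithm.
HASH_LENGTHS = {32: "md5", 40: "sha1", 64: "sha256"}


def hash_algorithm(value: str) -> Optional[str]:
    """Guess the hash algorithm from the length of a hex string."""
    if value and all(ch in string.hexdigits for ch in value):
        return HASH_LENGTHS.get(len(value))
    return None


def vt_type_for_hash(value: str) -> Optional[str]:
    """Every recognised hash maps to the same 'file' type."""
    return "file" if hash_algorithm(value) else None
```

A length check alone says nothing about whether the value is a genuine digest, so in practice it would be combined with other screening.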
Input from a DataFrame
WARNING: The VirusTotal Public API allows a maximum of 4 requests a minute. If you start seeing HTTP Error 403, you've probably hit this limit.
API Signature
Load test data and extract some IoCs from it
Submit these to VirusTotal
Note that most of the IoC observables found by a simple regex extraction were rejected before being submitted to VT. As well as checking for duplicates, this module also filters out things like:
loopback/private IPs
unqualified and unresolvable domain names
strings of hex characters that are probably not hashes
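The duplicate handling mentioned above (look each distinct observable up once, then reuse that result for any repeats) can be pictured with a small cache. This is an assumed illustration rather than the module's actual code; `lookup_unique` and the `lookup` callback are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

Result = TypeVar("Result")


def lookup_unique(
    observables: Iterable[str],
    lookup: Callable[[str], Result],
) -> List[Result]:
    """Call lookup() once per distinct observable, reusing results for repeats."""
    cache: Dict[str, Result] = {}
    results: List[Result] = []
    for observable in observables:
        if observable not in cache:
            cache[observable] = lookup(observable)  # first occurrence only
        results.append(cache[observable])           # duplicates reuse the result
    return results
```

The output has one entry per input item, in the original order, so it can still be aligned with the source data.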