GitHub Repository: Azure/Azure-Sentinel-Notebooks
Path: blob/master/tutorials-and-examples/feature-tutorials/GeoIPLookups.ipynb
Kernel: Python (condadev)

Title: msticpy - GeoIP Lookup

Introduction

This module contains two classes that allow you to look up the Geolocation of IP Addresses.

You must have msticpy installed to run this notebook:

%pip install --upgrade msticpy

MaxMind GeoIPLite

This product includes GeoLite2 data created by MaxMind, available from https://www.maxmind.com.

This uses a local database, which is downloaded the first time the class is instantiated. It gives very fast lookups, but you need to download updates regularly. MaxMind offers a free tier of this database, updated monthly. For greater accuracy and more detailed information they have varying levels of paid service. Please check out their site for more details.

The geoip module uses the official MaxMind PyPI package, geoip2, and also has options to customize the behavior of the local MaxMind database.

  • db_folder : path to the folder containing the local MaxMind city database. If not specified, the database is downloaded to the .msticpy directory under the user's home directory.

  • force_update : set to True to force a database download regardless of the age check.

  • The age of the MaxMind city database is checked using the database info, and a new copy is downloaded if it has not been updated in the last 30 days.

  • auto_update : set to False to skip the automatic update and keep using a database that is more than 30 days old.
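The way these options interact can be sketched as a small decision function. This is an illustration of the behavior described in the list above, not msticpy's actual code: the function name `should_download`, its parameters, and the assumption that `force_update` takes priority over `auto_update=False` are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_DB_AGE = timedelta(days=30)  # age threshold described in the list above


def should_download(
    db_built: Optional[datetime],
    now: datetime,
    auto_update: bool = True,
    force_update: bool = False,
) -> bool:
    """Return True if a fresh database download should be attempted."""
    if db_built is None:
        # No local database yet: a download is always needed.
        return True
    if force_update:
        # Assumed: force_update bypasses the age check entirely.
        return True
    if not auto_update:
        # auto_update=False keeps the existing file, however old it is.
        return False
    return now - db_built > MAX_DB_AGE


now = datetime(2024, 6, 1)
print(should_download(datetime(2024, 5, 20), now))                     # recent database
print(should_download(datetime(2024, 3, 1), now))                      # stale database
print(should_download(datetime(2024, 3, 1), now, auto_update=False))   # stale, updates off
print(should_download(datetime(2024, 5, 20), now, force_update=True))  # forced update
```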

IPStack

This library uses services provided by ipstack. https://ipstack.com

IPStack is an online service and also offers a free tier of their service. Again, the paid tiers offer greater accuracy, more detailed information and higher throughput. Please check out their site for more details.

# Imports
import sys
MIN_REQ_PYTHON = (3,6)
if sys.version_info < MIN_REQ_PYTHON:
    print('Check the Kernel->Change Kernel menu and ensure that Python 3.6')
    print('or later is selected as the active kernel.')
    sys.exit("Python %s.%s or later is required.\n" % MIN_REQ_PYTHON)

from IPython.display import display
import pandas as pd

import msticpy.sectools as sectools
from msticpy.nbtools import *
from msticpy.nbtools.entityschema import IpAddress, GeoLocation
from msticpy.sectools.geoip import GeoLiteLookup, IPStackLookup

Contents

Maxmind GeoIP Lite Lookup Class

Signature:

iplocation.lookup_ip(ip_address: str = None, ip_addr_list: collections.abc.Iterable = None, ip_entity: msticpy.nbtools.entityschema.IpAddress = None)
Docstring:
Lookup IP location from GeoLite2 data created by MaxMind.

Keyword Arguments:
    ip_address {str} -- a single address to look up (default: {None})
    ip_addr_list {Iterable} -- a collection of addresses to lookup (default: {None})
    ip_entity {IpAddress} -- an IpAddress entity

Returns:
    tuple(list{dict}, list{entity}) -- returns raw geolocation results and same results as IP/Geolocation entities

iplocation = GeoLiteLookup()
loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')
print('Raw result')
display(loc_result)
print('IP Address Entity')
display(ip_entity[0])
Raw result
[{'continent': {'code': 'EU', 'geoname_id': 6255148, 'names': {'de': 'Europa', 'en': 'Europe', 'es': 'Europa', 'fr': 'Europe', 'ja': 'ヨーロッパ', 'pt-BR': 'Europa', 'ru': 'Европа', 'zh-CN': '欧洲'}}, 'country': {'geoname_id': 2017370, 'iso_code': 'RU', 'names': {'de': 'Russland', 'en': 'Russia', 'es': 'Rusia', 'fr': 'Russie', 'ja': 'ロシア', 'pt-BR': 'Rússia', 'ru': 'Россия', 'zh-CN': '俄罗斯联邦'}}, 'location': {'accuracy_radius': 1000, 'latitude': 55.7386, 'longitude': 37.6068, 'time_zone': 'Europe/Moscow'}, 'registered_country': {'geoname_id': 2017370, 'iso_code': 'RU', 'names': {'de': 'Russland', 'en': 'Russia', 'es': 'Rusia', 'fr': 'Russie', 'ja': 'ロシア', 'pt-BR': 'Rússia', 'ru': 'Россия', 'zh-CN': '俄罗斯联邦'}}, 'traits': {'ip_address': '90.156.201.97', 'prefix_len': 17}}]
IP Address Entity
import tempfile
from pathlib import Path
tmp_folder = tempfile.gettempdir()
iplocation = GeoLiteLookup(db_folder=str(Path(tmp_folder).joinpath('geolite')))
loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')
print('Raw result')
display(loc_result)
print('IP Address Entity')
display(ip_entity[0])
Raw result
[{'continent': {'code': 'EU', 'geoname_id': 6255148, 'names': {'de': 'Europa', 'en': 'Europe', 'es': 'Europa', 'fr': 'Europe', 'ja': 'ヨーロッパ', 'pt-BR': 'Europa', 'ru': 'Европа', 'zh-CN': '欧洲'}}, 'country': {'geoname_id': 2017370, 'iso_code': 'RU', 'names': {'de': 'Russland', 'en': 'Russia', 'es': 'Rusia', 'fr': 'Russie', 'ja': 'ロシア', 'pt-BR': 'Rússia', 'ru': 'Россия', 'zh-CN': '俄罗斯联邦'}}, 'location': {'accuracy_radius': 1000, 'latitude': 55.7386, 'longitude': 37.6068, 'time_zone': 'Europe/Moscow'}, 'registered_country': {'geoname_id': 2017370, 'iso_code': 'RU', 'names': {'de': 'Russland', 'en': 'Russia', 'es': 'Rusia', 'fr': 'Russie', 'ja': 'ロシア', 'pt-BR': 'Rússia', 'ru': 'Россия', 'zh-CN': '俄罗斯联邦'}}, 'traits': {'ip_address': '90.156.201.97', 'prefix_len': 17}}]
IP Address Entity
iplocation = GeoLiteLookup(force_update=True)
loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')
print('Raw result')
display(loc_result)
print('IP Address Entity')
display(ip_entity[0])
force_update is set to True. Attempting to download new database to C:\Users\Ian\.msticpy\GeoLite2
Downloading and extracting GeoLite DB archive from MaxMind....
Raw result
e:\src\microsoft\msticpy\msticpy\sectools\geoip.py:609: UserWarning: Error writing GeoIP DB file: C:\Users\Ian\.msticpy\GeoLite2\GeoLite2-City.mmdb - [Errno 22] Invalid argument: 'C:\\Users\\Ian\\.msticpy\\GeoLite2\\GeoLite2-City.mmdb'
  warnings.warn(f"Error writing GeoIP DB file: {db_file_path} - {err}")
e:\src\microsoft\msticpy\msticpy\sectools\geoip.py:536: UserWarning: DB download failed
  warnings.warn("DB download failed")
e:\src\microsoft\msticpy\msticpy\sectools\geoip.py:540: UserWarning: Continuing with cached database. Results may inaccurate.
  "Continuing with cached database. Results may inaccurate."
[{'continent': {'code': 'EU', 'geoname_id': 6255148, 'names': {'de': 'Europa', 'en': 'Europe', 'es': 'Europa', 'fr': 'Europe', 'ja': 'ヨーロッパ', 'pt-BR': 'Europa', 'ru': 'Европа', 'zh-CN': '欧洲'}}, 'country': {'geoname_id': 2017370, 'iso_code': 'RU', 'names': {'de': 'Russland', 'en': 'Russia', 'es': 'Rusia', 'fr': 'Russie', 'ja': 'ロシア', 'pt-BR': 'Rússia', 'ru': 'Россия', 'zh-CN': '俄罗斯联邦'}}, 'location': {'accuracy_radius': 1000, 'latitude': 55.7386, 'longitude': 37.6068, 'time_zone': 'Europe/Moscow'}, 'registered_country': {'geoname_id': 2017370, 'iso_code': 'RU', 'names': {'de': 'Russland', 'en': 'Russia', 'es': 'Rusia', 'fr': 'Russie', 'ja': 'ロシア', 'pt-BR': 'Rússia', 'ru': 'Россия', 'zh-CN': '俄罗斯联邦'}}, 'traits': {'ip_address': '90.156.201.97', 'prefix_len': 17}}]
IP Address Entity
iplocation = GeoLiteLookup(auto_update=False)
loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')
print('Raw result')
display(loc_result)
print('IP Address Entity')
display(ip_entity[0])
Raw result
[{'continent': {'code': 'EU', 'geoname_id': 6255148, 'names': {'de': 'Europa', 'en': 'Europe', 'es': 'Europa', 'fr': 'Europe', 'ja': 'ヨーロッパ', 'pt-BR': 'Europa', 'ru': 'Европа', 'zh-CN': '欧洲'}}, 'country': {'geoname_id': 2017370, 'iso_code': 'RU', 'names': {'de': 'Russland', 'en': 'Russia', 'es': 'Rusia', 'fr': 'Russie', 'ja': 'ロシア', 'pt-BR': 'Rússia', 'ru': 'Россия', 'zh-CN': '俄罗斯联邦'}}, 'location': {'accuracy_radius': 1000, 'latitude': 55.7386, 'longitude': 37.6068, 'time_zone': 'Europe/Moscow'}, 'registered_country': {'geoname_id': 2017370, 'iso_code': 'RU', 'names': {'de': 'Russland', 'en': 'Russia', 'es': 'Rusia', 'fr': 'Russie', 'ja': 'ロシア', 'pt-BR': 'Rússia', 'ru': 'Россия', 'zh-CN': '俄罗斯联邦'}}, 'traits': {'ip_address': '90.156.201.97', 'prefix_len': 17}}]
IP Address Entity
import socket
socket_info = socket.getaddrinfo("pypi.org",0,0,0,0)

ips = [res[4][0] for res in socket_info]
print(ips)

_, ip_entities = iplocation.lookup_ip(ip_addr_list=ips)
display(ip_entities)
['151.101.128.223', '151.101.192.223', '151.101.0.223', '151.101.64.223']
[IpAddress(Address=151.101.128.223, Location={ 'AdditionalData': {}, 'CountryCode': 'US',...), IpAddress(Address=151.101.192.223, Location={ 'AdditionalData': {}, 'CountryCode': 'US',...), IpAddress(Address=151.101.0.223, Location={ 'AdditionalData': {}, 'CountryCode': 'US', ...), IpAddress(Address=151.101.64.223, Location={ 'AdditionalData': {}, 'CountryCode': 'US', ...)]
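The raw GeoLite2 results shown above are nested dictionaries, so it can be handy to flatten the fields you care about. The helper below is a minimal sketch: `summarize_geolite` is a hypothetical name rather than part of msticpy, and the sample is a trimmed copy of the raw result printed earlier.

```python
from typing import Any, Dict


def summarize_geolite(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one raw GeoLite2 result into a few commonly used fields."""
    country = raw.get("country", {})
    location = raw.get("location", {})
    return {
        "ip": raw.get("traits", {}).get("ip_address"),
        "country_code": country.get("iso_code"),
        "country_name": country.get("names", {}).get("en"),
        "latitude": location.get("latitude"),
        "longitude": location.get("longitude"),
        "accuracy_radius": location.get("accuracy_radius"),
    }


# Trimmed copy of the raw result shown above
sample = {
    "country": {"iso_code": "RU", "names": {"en": "Russia"}},
    "location": {
        "accuracy_radius": 1000,
        "latitude": 55.7386,
        "longitude": 37.6068,
        "time_zone": "Europe/Moscow",
    },
    "traits": {"ip_address": "90.156.201.97", "prefix_len": 17},
}

summary = summarize_geolite(sample)
print(summary)
```

Using `.get` with empty-dictionary defaults means a result that lacks a section (for example, no `location`) yields `None` values instead of raising `KeyError`.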

IPStack Geo-lookup Class

Class Initialization

Note: this class requires an IPStack API key. The optional parameter bulk_lookup allows multiple IPs in a single request. This is only available with the paid Professional tier and above.

Init signature: IPStackLookup(api_key: str, bulk_lookup: bool = False)
Docstring:
GeoIP Lookup using IPStack web service.

Raises:
    ConnectionError -- Invalid status returned from http request
    PermissionError -- Service refused request (e.g. requesting batch of addresses on free tier API key)

Init docstring:
Create a new instance of IPStackLookup.

Arguments:
    api_key {str} -- API Key from IPStack - see https://ipstack.com
    bulk_lookup {bool} -- For Professional and above tiers allowing you to submit multiple IPs in a single request.
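To illustrate what bulk_lookup changes, the sketch below builds the request URLs for the two modes without sending anything. The endpoint, the `access_key` query parameter, and the comma-separated address format are assumptions based on IPStack's public API documentation, and `build_request_urls` is a hypothetical helper, not the code that IPStackLookup runs.

```python
from typing import Iterable, List
from urllib.parse import quote

API_BASE = "http://api.ipstack.com"  # assumed endpoint; check the IPStack docs


def build_request_urls(
    ips: Iterable[str], api_key: str, bulk_lookup: bool = False
) -> List[str]:
    """Return the URLs that would be requested for the given addresses."""
    ip_list = list(ips)
    if not ip_list:
        return []
    key = quote(api_key)
    if bulk_lookup:
        # One request carrying all addresses as a comma-separated list.
        return [f"{API_BASE}/{','.join(ip_list)}?access_key={key}"]
    # Otherwise one request per address.
    return [f"{API_BASE}/{ip}?access_key={key}" for ip in ip_list]


addresses = ["151.101.128.223", "151.101.192.223"]
single = build_request_urls(addresses, api_key="YOUR_KEY")
bulk = build_request_urls(addresses, api_key="YOUR_KEY", bulk_lookup=True)
print(len(single), len(bulk))
```

With a free-tier key only the one-request-per-address form is expected to succeed, which is why bulk_lookup defaults to False.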

lookup_ip method

Signature: iplocation.lookup_ip(ip_address: str = None, ip_addr_list: collections.abc.Iterable = None, ip_entity: msticpy.nbtools.entityschema.IpAddress = None) -> tuple
Docstring:
Lookup IP location from IPStack web service.

Keyword Arguments:
    ip_address {str} -- a single address to look up (default: {None})
    ip_addr_list {Iterable} -- a collection of addresses to lookup (default: {None})
    ip_entity {IpAddress} -- an IpAddress entity

Raises:
    ConnectionError -- Invalid status returned from http request
    PermissionError -- Service refused request (e.g. requesting batch of addresses on free tier API key)

Returns:
    tuple(list{dict}, list{entity}) -- returns raw geolocation results and same results as IP/Geolocation entities

You will need an IPStack API key

You will get more detailed results and a higher throughput allowance if you have a paid tier. See the IPStack website for more details.

iplocation = IPStackLookup()

# Enter your IPStack Key here (if not set in msticpyconfig.yaml)
ips_key = nbwidgets.GetEnvironmentKey(env_var='IPSTACK_AUTH',
                                      help_str='To obtain an API key sign up here https://www.ipstack.com/',
                                      prompt='IPStack API key:')
if not iplocation.settings.args.get("AuthKey"):
    ips_key.display()
HTML(value='To obtain an API key sign up here https://www.ipstack.com/')
if not iplocation.settings.args.get("AuthKey") and not ips_key.value:
    raise ValueError("No Authentication key in config/environment or supplied by user.")
if ips_key.value:
    iplocation = IPStackLookup(api_key=ips_key.value)
loc_result, ip_entity = iplocation.lookup_ip(ip_address='90.156.201.97')
print('Raw result')
display(loc_result)
print('IP Address Entity')
display(ip_entity[0])
Raw result
[({'ip': '90.156.201.97', 'type': 'ipv4', 'continent_code': 'AS', 'continent_name': 'Asia', 'country_code': 'RU', 'country_name': 'Russia', 'region_code': 'MOW', 'region_name': 'Moscow', 'city': 'Moscow', 'zip': '115088', 'latitude': 55.712608337402344, 'longitude': 37.68056869506836, 'location': {'geoname_id': 524901, 'capital': 'Moscow', 'languages': [{'code': 'ru', 'name': 'Russian', 'native': 'Русский'}], 'country_flag': 'http://assets.ipstack.com/flags/ru.svg', 'country_flag_emoji': '🇷🇺', 'country_flag_emoji_unicode': 'U+1F1F7 U+1F1FA', 'calling_code': '7', 'is_eu': False}}, 200)]
IP Address Entity
loc_result, ip_entities = iplocation.lookup_ip(ip_addr_list=ips)
print('Raw results')
display(loc_result)
print('IP Address Entities')
display(ip_entities)
Raw results
[({'ip': '151.101.128.223', 'type': 'ipv4', 'continent_code': 'NA', 'continent_name': 'North America', 'country_code': 'US', 'country_name': 'United States', 'region_code': 'CA', 'region_name': 'California', 'city': 'San Francisco', 'zip': '94107', 'latitude': 37.76784896850586, 'longitude': -122.39286041259766, 'location': {'geoname_id': 5391959, 'capital': 'Washington D.C.', 'languages': [{'code': 'en', 'name': 'English', 'native': 'English'}], 'country_flag': 'http://assets.ipstack.com/flags/us.svg', 'country_flag_emoji': '🇺🇸', 'country_flag_emoji_unicode': 'U+1F1FA U+1F1F8', 'calling_code': '1', 'is_eu': False}}, 200), ({'ip': '151.101.192.223', 'type': 'ipv4', 'continent_code': 'NA', 'continent_name': 'North America', 'country_code': 'US', 'country_name': 'United States', 'region_code': 'CA', 'region_name': 'California', 'city': 'San Francisco', 'zip': '94107', 'latitude': 37.76784896850586, 'longitude': -122.39286041259766, 'location': {'geoname_id': 5391959, 'capital': 'Washington D.C.', 'languages': [{'code': 'en', 'name': 'English', 'native': 'English'}], 'country_flag': 'http://assets.ipstack.com/flags/us.svg', 'country_flag_emoji': '🇺🇸', 'country_flag_emoji_unicode': 'U+1F1FA U+1F1F8', 'calling_code': '1', 'is_eu': False}}, 200), ({'ip': '151.101.0.223', 'type': 'ipv4', 'continent_code': 'NA', 'continent_name': 'North America', 'country_code': 'US', 'country_name': 'United States', 'region_code': 'CA', 'region_name': 'California', 'city': 'San Francisco', 'zip': '94107', 'latitude': 37.76784896850586, 'longitude': -122.39286041259766, 'location': {'geoname_id': 5391959, 'capital': 'Washington D.C.', 'languages': [{'code': 'en', 'name': 'English', 'native': 'English'}], 'country_flag': 'http://assets.ipstack.com/flags/us.svg', 'country_flag_emoji': '🇺🇸', 'country_flag_emoji_unicode': 'U+1F1FA U+1F1F8', 'calling_code': '1', 'is_eu': False}}, 200), ({'ip': '151.101.64.223', 'type': 'ipv4', 'continent_code': 'NA', 'continent_name': 'North America', 
'country_code': 'US', 'country_name': 'United States', 'region_code': 'CA', 'region_name': 'California', 'city': 'San Francisco', 'zip': '94107', 'latitude': 37.76784896850586, 'longitude': -122.39286041259766, 'location': {'geoname_id': 5391959, 'capital': 'Washington D.C.', 'languages': [{'code': 'en', 'name': 'English', 'native': 'English'}], 'country_flag': 'http://assets.ipstack.com/flags/us.svg', 'country_flag_emoji': '🇺🇸', 'country_flag_emoji_unicode': 'U+1F1FA U+1F1F8', 'calling_code': '1', 'is_eu': False}}, 200)]
IP Address Entities
[IpAddress(Address=151.101.128.223, Location={ 'AdditionalData': {}, 'City': 'San Francis...), IpAddress(Address=151.101.192.223, Location={ 'AdditionalData': {}, 'City': 'San Francis...), IpAddress(Address=151.101.0.223, Location={ 'AdditionalData': {}, 'City': 'San Francisco...), IpAddress(Address=151.101.64.223, Location={ 'AdditionalData': {}, 'City': 'San Francisc...)]
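In the raw IPStack output above, each item is a pair of the response payload and the HTTP status code. A minimal sketch of filtering on that status and keeping a few fields follows. `successful_locations` is a hypothetical helper, the first sample row is trimmed from the output above, and the second row is a made-up failure included only to show the filtering.

```python
from typing import Any, Dict, List, Tuple

RawResult = Tuple[Dict[str, Any], int]


def successful_locations(results: List[RawResult]) -> List[Dict[str, Any]]:
    """Keep payloads whose HTTP status is 200 and pick out a few fields."""
    rows = []
    for payload, status in results:
        if status != 200:
            continue  # skip failed requests
        rows.append(
            {
                "ip": payload.get("ip"),
                "country_code": payload.get("country_code"),
                "city": payload.get("city"),
                "latitude": payload.get("latitude"),
                "longitude": payload.get("longitude"),
            }
        )
    return rows


raw_results = [
    (
        {
            "ip": "90.156.201.97",
            "country_code": "RU",
            "city": "Moscow",
            "latitude": 55.712608337402344,
            "longitude": 37.68056869506836,
        },
        200,
    ),
    ({}, 404),  # made-up failed response, for illustration only
]

rows = successful_locations(raw_results)
print(rows)
```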

Taking input from a pandas DataFrame

The base class for both implementations has a method that reads the IP addresses from a DataFrame column and returns a new DataFrame with the location information merged with the input frame.

Signature: iplocation.df_lookup_ip(data: pandas.core.frame.DataFrame, column: str)
Docstring:
Lookup Geolocation data from a pandas Dataframe.

Keyword Arguments:
    data {pd.DataFrame} -- pandas dataframe containing IpAddress column
    column {str} -- the name of the dataframe column to use as a source

import pandas as pd
netflow_df = pd.read_csv("data/az_net_flows.csv")
netflow_df = netflow_df[["AllExtIPs"]].drop_duplicates()
iplocation = GeoLiteLookup()
iplocation.df_lookup_ip(netflow_df, column="AllExtIPs")
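The idea of looking up a column of addresses and merging the results back onto the input rows can be sketched without pandas. This is a plain-Python illustration of the concept, not the library's DataFrame-based implementation. `lookup_rows`, the `KNOWN` table, and the `FlowCount` column are hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def lookup_rows(
    rows: List[Row], column: str, lookup: Callable[[str], Optional[Row]]
) -> List[Row]:
    """Look up each distinct address once and merge the result into every row."""
    cache: Dict[str, Row] = {}
    merged: List[Row] = []
    for row in rows:
        ip = row[column]
        if ip not in cache:
            # Unknown addresses get an empty result, so the input row is kept as-is.
            cache[ip] = lookup(ip) or {}
        merged.append({**row, **cache[ip]})
    return merged


# Hypothetical stand-in for a real GeoIP lookup
KNOWN = {
    "90.156.201.97": {"CountryCode": "RU", "Latitude": 55.7386, "Longitude": 37.6068}
}

flows = [
    {"AllExtIPs": "90.156.201.97", "FlowCount": 3},
    {"AllExtIPs": "10.0.0.1", "FlowCount": 1},
]

merged = lookup_rows(flows, "AllExtIPs", KNOWN.get)
print(merged)
```

Caching by address serves the same purpose as the `drop_duplicates()` call in the cell above: each distinct address is looked up only once.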

Creating a Custom GeoIP Lookup Class

You can derive a class that implements the same operations to use with a different GeoIP service.

The class signature is as follows:

class GeoIpLookup(ABC):
    """Abstract base class for GeoIP Lookup classes."""

    @abstractmethod
    def lookup_ip(self, ip_address: str = None, ip_addr_list: Iterable = None,
                  ip_entity: IpAddress = None):
        """
        Lookup IP location.

        Keyword Arguments:
            ip_address {str} -- a single address to look up (default: {None})
            ip_addr_list {Iterable} -- a collection of addresses to lookup (default: {None})
            ip_entity {IpAddress} -- an IpAddress entity

        Returns:
            tuple(list{dict}, list{entity}) -- returns raw geolocation results and same results as IP/Geolocation entities
        """

Override the lookup_ip method with your own GeoIP lookup implementation.
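A derived class might look like the following minimal sketch. To keep it runnable on its own, it defines a local stand-in base class (`GeoIpLookupBase`) that mirrors the signature shown above. In a real notebook you would derive from msticpy's GeoIpLookup instead. `StaticTableLookup` is hypothetical, and it returns plain dictionaries where the library classes return IpAddress entities.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, List, Optional, Tuple


class GeoIpLookupBase(ABC):
    """Local stand-in for the abstract base class shown above."""

    @abstractmethod
    def lookup_ip(self, ip_address=None, ip_addr_list=None, ip_entity=None):
        """Look up IP location."""


class StaticTableLookup(GeoIpLookupBase):
    """Hypothetical lookup backed by an in-memory table instead of a service."""

    def __init__(self, table: Dict[str, Dict[str, Any]]):
        self._table = table

    def lookup_ip(
        self,
        ip_address: Optional[str] = None,
        ip_addr_list: Optional[Iterable[str]] = None,
        ip_entity: Any = None,
    ) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
        # Collect addresses from whichever arguments were supplied.
        addresses: List[str] = []
        if ip_address:
            addresses.append(ip_address)
        if ip_addr_list:
            addresses.extend(ip_addr_list)
        if ip_entity is not None:
            addresses.append(ip_entity.Address)

        known = [ip for ip in addresses if ip in self._table]
        raw = [self._table[ip] for ip in known]
        # Plain dicts stand in for IpAddress entities in this sketch.
        entities = [{"Address": ip, "Location": self._table[ip]} for ip in known]
        return raw, entities


table = {"90.156.201.97": {"CountryCode": "RU", "Latitude": 55.7386, "Longitude": 37.6068}}
lookup = StaticTableLookup(table)
raw, entities = lookup.lookup_ip(ip_addr_list=["90.156.201.97", "192.0.2.1"])
print(raw)
print(entities)
```

Because `lookup_ip` is declared abstract, instantiating the base class directly raises `TypeError`; only subclasses that override the method can be created.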

Calculating Geographical Distances

Use the geo_distance function from msticpy.sectools.geoip to calculate the distance between two locations. I am indebted to Martin Thoma, who posted this solution (which I've modified slightly) on Stack Overflow.

Signature: geo_distance(origin: Tuple[float, float], destination: Tuple[float, float]) -> float
Docstring:
Calculate the Haversine distance.

Author: Martin Thoma - stackoverflow

Parameters
----------
origin : tuple of float
    (lat, long)
destination : tuple of float
    (lat, long)

Returns
-------
distance_in_km : float

Or where you have source and destination IpAddress entities, you can use the wrapper entity_distance.

Signature: entity_distance(ip_src: msticpy.nbtools.entityschema.IpAddress, ip_dest: msticpy.nbtools.entityschema.IpAddress) -> float
Docstring:
Return distance between two IP Entities.

Arguments:
    ip_src {IpAddress} -- Source IpAddress Entity
    ip_dest {IpAddress} -- Destination IpAddress Entity

Raises:
    AttributeError -- if either entity has no location information

Returns:
    float -- Distance in kilometers.

from msticpy.sectools.geoip import geo_distance
_, ip_entity1 = iplocation.lookup_ip(ip_address='90.156.201.97')
_, ip_entity2 = iplocation.lookup_ip(ip_address='151.101.64.223')

print(ip_entity1[0])
print(ip_entity2[0])
dist = geo_distance(origin=(ip_entity1[0].Location.Latitude, ip_entity1[0].Location.Longitude),
                    destination=(ip_entity2[0].Location.Latitude, ip_entity2[0].Location.Longitude))
print(f'\nDistance between IP Locations = {round(dist, 1)}km')
{ 'AdditionalData': {},
  'Address': '90.156.201.97',
  'Location': { 'AdditionalData': {},
                'CountryCode': 'RU',
                'CountryName': 'Russia',
                'Latitude': 55.7386,
                'Longitude': 37.6068,
                'Type': 'geolocation',
                'edges': set()},
  'ThreatIntelligence': [],
  'Type': 'ipaddress',
  'edges': set()}
{ 'AdditionalData': {},
  'Address': '151.101.64.223',
  'Location': { 'AdditionalData': {},
                'CountryCode': 'US',
                'CountryName': 'United States',
                'Latitude': 37.751,
                'Longitude': -97.822,
                'Type': 'geolocation',
                'edges': set()},
  'ThreatIntelligence': [],
  'Type': 'ipaddress',
  'edges': set()}

Distance between IP Locations = 8796.8km
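For reference, the Haversine calculation named in the geo_distance docstring can be sketched with the standard library alone. This is an independent sketch, not the msticpy source. `haversine_km` is a hypothetical name and the Earth radius of 6371 km is an assumed mean value, so results may differ slightly from the library's.

```python
import math
from typing import Tuple

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(origin: Tuple[float, float], destination: Tuple[float, float]) -> float:
    """Great-circle distance in km between two (lat, long) points in degrees."""
    lat1, lon1 = origin
    lat2, lon2 = destination
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Coordinates taken from the two entities printed above
moscow = (55.7386, 37.6068)
kansas = (37.751, -97.822)
dist = haversine_km(moscow, kansas)
print(f"Distance = {round(dist, 1)}km")
```

The printed value can be compared with the figure reported by geo_distance above. Any small difference would come from the choice of Earth radius.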