Abstract API - Get IP Geolocation
Tags: #api #abstract-api #ip #geolocation #stream #multithread #queues #operations #dataprocessing #automation
Author: Maxime Jublou
Last update: 2023-04-12 (Created: 2023-02-24)
Description: This notebook provides a way to get the geolocation of an IP address using the AbstractAPI service.
Input
Import libraries
Setup Variables
Create list of IP addresses
You can replace this code to load your own dataframe. The only requirements are:
The dataframe variable must be named df.
The dataframe must have an ip column containing IP addresses.
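As an illustration of those two requirements, here is a minimal sketch that builds a small dataframe of sample addresses. The starting address and the count are arbitrary values chosen for the example, not values from the notebook.

```python
import ipaddress

import pandas as pd

# Hypothetical sample data: 10 consecutive addresses starting at 8.8.8.0.
start = ipaddress.IPv4Address("8.8.8.0")

# The variable must be named df and expose an "ip" column of strings.
df = pd.DataFrame({"ip": [str(start + i) for i in range(10)]})
```

Any other source (a CSV file, a database query) works the same way as long as the result is assigned to df and has an ip column.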
Model
Create queues
The goal is to stream events (here IP addresses), instead of batching all of them in one process.
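One way to set this up, sketched here with the standard library, is a pair of thread-safe queues: one holding pending IP addresses and one holding finished results. The names job_queue and result_queue are assumptions made for this example.

```python
import queue

# Pending work: one IP address per item.
job_queue = queue.Queue()

# Finished work: one result per item, consumed by the result worker.
result_queue = queue.Queue()

# Hypothetical input; in the notebook this would come from df["ip"].
for ip in ["8.8.8.8", "1.1.1.1"]:
    job_queue.put(ip)
```

Because queue.Queue handles its own locking, several worker threads can call get() and put() on these objects concurrently.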
Set timer
This is the cadence at which we will call the Abstract API.
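A possible way to enforce such a cadence across several threads is a small shared pacer that hands out evenly spaced time slots. This is a sketch, not necessarily the notebook's approach: the Pacer class and the value of CALL_PER_SECOND are assumptions for illustration.

```python
import threading
import time

CALL_PER_SECOND = 50  # assumed value; adjust to your Abstract API plan
INTERVAL = 1.0 / CALL_PER_SECOND  # seconds between two consecutive calls


class Pacer:
    """Hand out call slots spaced at least `interval` seconds apart."""

    def __init__(self, interval):
        self.interval = interval
        self._lock = threading.Lock()
        self._next_slot = time.monotonic()

    def wait(self):
        # Reserve the next free slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self._lock:
            now = time.monotonic()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self.interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)
```

Each worker would call wait() on one shared Pacer(INTERVAL) instance right before issuing a request, so the combined call rate stays at or below the target regardless of how many workers are running.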
Create job worker function
This function will be the worker taking jobs and calling the api.
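A minimal sketch of such a worker follows. The endpoint URL and the api_key and ip_address parameter names reflect Abstract API's IP geolocation service as commonly documented, but treat them as assumptions to verify against the official documentation. API_KEY is a placeholder, and the STOP sentinel and the injectable fetch argument are conventions chosen for this example.

```python
import json
import queue
import threading
import urllib.parse
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder, replace with your own key
BASE_URL = "https://ipgeolocation.abstractapi.com/v1/"  # assumed endpoint
STOP = None  # sentinel telling a worker to exit


def fetch_geolocation(ip):
    """Call the API for one IP address and return the decoded JSON."""
    query = urllib.parse.urlencode({"api_key": API_KEY, "ip_address": ip})
    with urllib.request.urlopen(f"{BASE_URL}?{query}", timeout=10) as resp:
        return json.load(resp)


def job_worker(job_queue, result_queue, fetch=fetch_geolocation):
    """Take IPs from job_queue until STOP and push results to result_queue."""
    while True:
        ip = job_queue.get()
        if ip is STOP:
            job_queue.task_done()
            break
        try:
            result_queue.put({"ip": ip, "data": fetch(ip)})
        except Exception as exc:
            # Keep failures in the result stream so no IP is silently lost.
            result_queue.put({"ip": ip, "error": str(exc)})
        finally:
            job_queue.task_done()
```

Passing fetch as a parameter makes it possible to exercise the worker with a stub function, without network access or an API key.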
Create result worker function
This function is run by the result worker. Its role is to take the results from all job workers and store them.
Note: If we want to get a lot of jobs done then we will need to offload results to a file instead of keeping everything in memory. We could stream json representations to a file then read it later.
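The streaming approach described in the note could look like the sketch below, which appends one JSON object per line to a file. RESULT_TMP_FILE is the name used later in this notebook, but its value here is an assumption, as is the STOP sentinel.

```python
import json
import queue
import threading

RESULT_TMP_FILE = "results.jsonl"  # assumed path
STOP = None  # sentinel telling the result worker to exit


def result_worker(result_queue, path=RESULT_TMP_FILE):
    """Append each result to `path` as one JSON document per line."""
    with open(path, "a", encoding="utf-8") as fh:
        while True:
            item = result_queue.get()
            if item is STOP:
                break
            fh.write(json.dumps(item) + "\n")
            # Flush so that a crash loses at most the current item.
            fh.flush()
```

A single result worker owns the file, so the job workers never write to it concurrently and memory use stays flat no matter how many results arrive.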
Create queue monitor function
This function provides live progress during execution. It can also start new workers if we are not reaching the CALL_PER_SECOND target.
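A simplified sketch of such a monitor is shown below. It estimates throughput from how fast the job queue drains and calls a start_worker callback when the rate falls below the target. The function signature, the max_workers cap, and the reporting format are all assumptions for this example.

```python
import queue
import time

CALL_PER_SECOND = 10  # assumed target rate


def queue_monitor(job_queue, total, start_worker, target=CALL_PER_SECOND,
                  period=1.0, max_workers=20, report=print):
    """Report progress every `period` seconds and scale workers up if slow."""
    start_worker()  # always begin with one worker
    workers = 1
    taken_before = 0
    while job_queue.qsize() > 0:
        time.sleep(period)
        # qsize() is approximate under concurrency, which is fine for
        # progress reporting and a coarse scaling decision.
        taken = total - job_queue.qsize()
        rate = (taken - taken_before) / period
        taken_before = taken
        report(f"{taken}/{total} jobs taken, {rate:.1f}/s, {workers} workers")
        if rate < target and workers < max_workers and job_queue.qsize() > 0:
            start_worker()
            workers += 1
    return workers
```

In practice start_worker would spawn a new thread running the job worker function. The monitor only ever adds workers, so the cap prevents runaway thread creation when the API itself is the bottleneck.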
Run API calls
Output
Read all results from RESULT_TMP_FILE and load them into a pandas DataFrame.
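Assuming the results were written as one JSON object per line, loading them back can be sketched as follows. The file path and the load_results helper are hypothetical names for this example.

```python
import json

import pandas as pd

RESULT_TMP_FILE = "results.jsonl"  # assumed path


def load_results(path=RESULT_TMP_FILE):
    """Read a JSON Lines file and return a flattened DataFrame."""
    with open(path, encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    # json_normalize expands nested objects into dotted column names.
    return pd.json_normalize(records)
```

Calling load_results() after the run returns one row per processed IP address, with nested response fields expanded into their own columns.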