GitHub Repository: jupyter-naas/awesome-notebooks
Path: blob/master/Faker/Faker_Anonymize_Address_from_dataframe.ipynb
Kernel: Python 3


Faker - Anonymize Address from dataframe


Tags: #faker #operations #snippet #database #dataframe

Last update: 2023-04-12 (Created: 2022-09-09)

Description: This notebook provides a way to anonymize address data from a dataframe using the Faker library.

Input

Import libraries

try:
    from faker import Faker
except ImportError:
    # Install Faker on first use (notebook shell command), then retry the import
    !pip install faker
    from faker import Faker
import pandas as pd

Setup Data

data = [
    {"Name": "Mike", "Address": "x", "Score": 12},
    {"Name": "Peter", "Address": "z", "Score": 10},
    {"Name": "Lisa", "Address": "z", "Score": 11},
]
df = pd.DataFrame(data)
df

Setup Faker

Use Faker() to create and initialize a Faker generator. It produces fake data through methods named after the type of data you want, such as address().

faker = Faker()

# Column to be anonymized
col_name = "Address"

Model

Fake address

Through use of the .unique property on the generator, you can guarantee that any generated values are unique for this specific instance.

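def fake_address(df, col_name):
    # Map each distinct original value to one unique fake address
    dict_names = {name: faker.unique.address() for name in df[col_name].unique()}
    # Identical original addresses receive the same fake address
    df["New Address"] = df[col_name].map(dict_names)
    return df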

Output

Display result

df_fake_address = fake_address(df, col_name)
df_fake_address