GitHub Repository: jupyter-naas/awesome-notebooks
Path: blob/master/Faker/Faker_Anonymize_Personal_Names_from_dataframe.ipynb
Kernel: Python 3

Faker - Anonymize Personal Names from dataframe

Tags: #faker #operations #snippet #database #dataframe

Last update: 2023-04-12 (Created: 2022-09-09)

Description: This notebook provides a way to anonymize personal names from a dataframe using the Faker library.

Input

Import libraries

try:
    from faker import Faker
except ImportError:
    # Install Faker on first use, then retry the import
    !pip install faker
    from faker import Faker
import pandas as pd

Setup Data

data = [
    {"Name": "Marie", "Score": 12},
    {"Name": "Marie", "Score": 10},
    {"Name": "Mariana", "Score": 11},
]
df = pd.DataFrame(data)
df

Setup Faker

Use Faker() to create and initialize a generator. It produces fake data through methods named after the type of data you want, such as name().

faker = Faker()

# Column to be anonymized
col_name = "Name"

Model

Fake names

Calling methods through the generator's .unique property guarantees that the generated values do not repeat for this specific instance. Mapping each distinct original name to one fake name keeps repeated names consistent across rows.

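def fake_names(df, col_name):
    # Map each distinct original name to one unique fake name
    dict_names = {name: faker.unique.name() for name in df[col_name].unique()}
    # Add the fake names as a new column (modifies df in place)
    df["New Name"] = df[col_name].map(dict_names)
    return df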

Output

Display result

df_fake_names = fake_names(df, col_name)
df_fake_names