jupyter-naas
GitHub Repository: jupyter-naas/awesome-notebooks
Path: blob/master/Gmail/Gmail_Get_most_common_senders.ipynb
Kernel: Python 3


Gmail - Get most common senders


Tags: #gmail #productivity #naas_drivers #operations #automation #analytics #plotly

Last update: 2023-07-19 (Created: 2023-07-19)

Description: This notebook analyzes a user's inbox over a set period of time, counts how many emails each sender sent, and outputs a list of the most common senders. It aims to help identify unwanted subscriptions or emails that Gmail did not filter as "Spam."

Input

Import libraries

import datetime
import os
from imapclient import IMAPClient
import naas
from collections import Counter
import quopri
import email.header

Setup Variables

Create an application password following this procedure

  • username: This variable stores the username or email address associated with the email account

  • password: This variable stores the password or authentication token required to access the email account

  • date_start: Number of days to look back in your inbox; it must be a negative value (for example, -30 covers the last 30 days)

  • most_common_senders: Number of most common senders you want to list as output

username = "xxxxx@xxxx"
password = naas.secret.get("GMAIL_APP_PASSWORD")
date_start = -30
most_common_senders = 10

Model

Connect to email box

server = IMAPClient('imap.gmail.com')
server.login(username, password)
server.select_folder('INBOX')
print("✅ Successfully connected to INBOX")

Get all emails for the set period of time with their flags (seen or unseen), date, and sender

today = datetime.date.today()
start = today + datetime.timedelta(days=date_start)
all_messages = server.search(['SINCE', start.strftime('%d-%b-%Y')])
all_metadata = server.fetch(all_messages, ['RFC822.SIZE', 'FLAGS', 'INTERNALDATE', 'ENVELOPE'])
print("✅ All emails fetched:", len(all_metadata))
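The date arithmetic in the cell above can be checked without connecting to Gmail. The following is a minimal sketch, assuming a hypothetical helper named imap_since_criteria (it is not part of the notebook) that turns a negative date_start into the ['SINCE', 'DD-Mon-YYYY'] criteria passed to server.search. The optional today parameter is an assumption added only to make the result reproducible.

```python
import datetime


def imap_since_criteria(date_start, today=None):
    """Build IMAP search criteria for messages received in the last N days.

    Hypothetical helper for illustration; date_start must be negative,
    as described in the "Setup Variables" section.
    """
    if date_start >= 0:
        raise ValueError("date_start must be negative, e.g. -30")
    today = today or datetime.date.today()
    # Adding a negative number of days moves the start date into the past.
    start = today + datetime.timedelta(days=date_start)
    # IMAP expects dates such as 19-Jun-2023 (assumes the default English locale for %b).
    return ['SINCE', start.strftime('%d-%b-%Y')]


# Fixed reference date so the example is reproducible.
criteria = imap_since_criteria(-30, today=datetime.date(2023, 7, 19))
print(criteria)  # ['SINCE', '19-Jun-2023']
```

Rejecting a non-negative date_start early is a design choice of this sketch: a positive value would produce a start date in the future and the search would silently return no messages.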

Get the most common senders using the method most_common

The most_common method of Counter returns the senders with the highest number of occurrences, sorted in descending order of count.

senders = []
for msg_id, data in all_metadata.items():
    envelope = data[b'ENVELOPE']
    if envelope.from_:
        first = envelope.from_[0]
        # mailbox or host can be missing (None) for some addresses; skip those
        if first.mailbox and first.host:
            sender_email = first.mailbox.decode() + "@" + first.host.decode()
            senders.append(sender_email)
# Count how many emails each sender address accounts for
sender_counts = Counter(senders)
top_senders = sender_counts.most_common(most_common_senders)
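To see what most_common returns without fetching any email, here is a small sketch that runs the same counting step on a hand-written list. The addresses are made-up sample data, not output from a real inbox.

```python
from collections import Counter

# Made-up sender addresses standing in for the list built from the envelopes.
senders = [
    "news@example.com",
    "alice@example.org",
    "news@example.com",
    "bob@example.net",
    "news@example.com",
    "alice@example.org",
]

sender_counts = Counter(senders)
# most_common(n) returns the n most frequent items as (item, count) tuples,
# ordered from the highest count to the lowest.
top_senders = sender_counts.most_common(2)
print(top_senders)  # [('news@example.com', 3), ('alice@example.org', 2)]
```

Each entry is a (sender, count) pair, which is why the output cell can unpack it directly in a for loop.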

Output

print(f"The {most_common_senders} most common senders:")
for sender, count in top_senders:
    print(f"{sender}: {count} emails")