AWS - Send dataframe to S3
Tags: #aws #cloud #storage #S3bucket #operations #snippet #dataframe
Author: Maxime Jublou
Last update: 2023-11-20 (Created: 2022-04-28)
Description: This notebook demonstrates how to send a pandas dataframe to an AWS S3 bucket using AWS Data Wrangler (awswrangler).
Input
Import libraries
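A minimal sketch of the imports used by the steps below, assuming awswrangler and pandas are installed (e.g. `pip install awswrangler pandas`):

```python
import os

import awswrangler as wr
import pandas as pd
```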
Setup variables
Mandatory
aws_access_key_id: This variable is used to store the AWS access key ID.
aws_secret_access_key: This variable is used to store the AWS secret access key.
bucket_path: The S3 bucket path where the dataframe will be stored.
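A minimal sketch of this setup with placeholder values; the bucket path shown is hypothetical, and in practice credentials should come from a secret store rather than being hard-coded:

```python
aws_access_key_id = "YOUR_AWS_ACCESS_KEY_ID"          # placeholder
aws_secret_access_key = "YOUR_AWS_SECRET_ACCESS_KEY"  # placeholder
bucket_path = "s3://my-bucket/my-dataset/"            # hypothetical S3 path
```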
Model
Set environ
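One way to do this step is to export the credentials as environment variables, which boto3 (used by awswrangler under the hood) picks up automatically:

```python
os.environ["AWS_ACCESS_KEY_ID"] = aws_access_key_id
os.environ["AWS_SECRET_ACCESS_KEY"] = aws_secret_access_key
```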
Get dataframe
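A small hypothetical dataframe to stand in for your own data:

```python
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "value": ["a", "b", "c"],
    }
)
```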
Output
Send dataset to S3
Wrangler has three different write modes to store Parquet datasets on Amazon S3; the sketch after this list shows how to pass one of them.
append (default): Only adds new files, without deleting anything.
overwrite: Deletes everything in the target directory and then adds the new files.
overwrite_partitions (partition upsert): Only deletes the paths of the partitions that should be updated and then writes the new partition files. It works like a "partition upsert".
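A minimal sketch of the upload using awswrangler's `wr.s3.to_parquet`, with `dataset=True` so the `mode` parameter takes effect; `df` and `bucket_path` are the variables defined above:

```python
wr.s3.to_parquet(
    df=df,
    path=bucket_path,
    dataset=True,      # write as a dataset so "mode" applies
    mode="overwrite",  # or "append" / "overwrite_partitions"
)
```

Note that `overwrite_partitions` only makes sense on a partitioned dataset, i.e. when you also pass `partition_cols=[...]` so Wrangler knows which partition paths to replace.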