Path: blob/master/april_18/lessons/lesson-17/code/solution-code/solution-code-17.ipynb
1905 views
Pandas can be used to connect to most relational databases. In this demonstration, we will create and connect to a SQLite database. SQLite creates portable SQL databases saved in a single file. These databases are stored in a very efficient manner and allow fast querying, making them ideal for small databases or databases that need to be moved across machines.
If you are looking to start using a database without the setup of mysql or postgres, SQLite is a good start.
Let's return to the Rossmann sales data and load that into the database.
Data is moved to the database through the to_sql command, similar to the to_csv command.
to_sql takes as arugments: - name, the table name to create - con, a connection to a database - index, whether to input the index column - schema, if we want to write a custom schema for the new table - if_exists, what to do if the table already exists. We can overwrite it, add to it, or fail
CHECK: Load the Rossmann Store metadata in
rossmann-stores.csvand create a table into the database from it
CHECK: (SELECT) Have the students write a query that returns the
Store,DateandCustomers
CHECK: (WHERE) Have the students write a query that returns the
Store,DateandCustomersfor when the stores were open and running a promotion
CHECK: (GROUP BY) Have the students write a query that returns the total sales on promotion days.
Exercises
Load the Walmart sales and store features data
Create a table for each of those datasets
Select the store, date and fuel price on days it was over 90 degrees
Select the store, date and weekly sales and temperature
What were average sales on holiday vs. non-holiday sales
What were average sales on holiday vs. non-holiday sales when the temperature was below 32 degrees