CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutSign UpSign In
DanielBarnes18

Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place. Commercial Alternative to JupyterHub.

GitHub Repository: DanielBarnes18/IBM-Data-Science-Professional-Certificate
Path: blob/main/10. Applied Data Science Capstone/03. Exploratory Data Analysis/03. Exploratory Data Analysis - SQL.ipynb
Views: 4598
Kernel: Python 3.8
cognitiveclass.ai logo

Assignment: SQL Notebook for Peer Assignment

Estimated time needed: 60 minutes.

Introduction

Using this Python notebook you will:

  1. Understand the Spacex DataSet

  2. Load the dataset into the corresponding table in a Db2 database

  3. Execute SQL queries to answer assignment questions

Overview of the DataSet

SpaceX has gained worldwide attention for a series of historic milestones.

It is the only private company ever to return a spacecraft from low-earth orbit, which it first accomplished in December 2010. SpaceX advertises Falcon 9 rocket launches on its website with a cost of 62 million dollars wheras other providers cost upward of 165 million dollars each, much of the savings is because Space X can reuse the first stage.

Therefore if we can determine if the first stage will land, we can determine the cost of a launch.

This information can be used if an alternate company wants to bid against SpaceX for a rocket launch.

This dataset includes a record for each payload carried during a SpaceX mission into outer space.

Download the datasets

This assignment requires you to load the spacex dataset.

In many cases the dataset to be analyzed is available as a .CSV (comma separated values) file, perhaps on the internet. Click on the link below to download and save the dataset (.CSV file):

Spacex DataSet

Store the dataset in database table

it is highly recommended to manually load the table using the database console LOAD tool in DB2.

Now open the Db2 console, open the LOAD tool, Select / Drag the .CSV file for the dataset, Next create a New Table, and then follow the steps on-screen instructions to load the data. Name the new table as follows:

SPACEXDATASET

Follow these steps while using old DB2 UI which is having Open Console Screen

Note:While loading Spacex dataset, ensure that detect datatypes is disabled. Later click on the pencil icon(edit option).

  1. Change the Date Format by manually typing DD-MM-YYYY and timestamp format as DD-MM-YYYY HH:MM:SS.

    Here you should place the cursor at Date field and manually type as DD-MM-YYYY.

  2. Change the PAYLOAD_MASS__KG_ datatype to INTEGER.

Changes to be considered when having DB2 instance with the new UI having Go to UI screen

  • Refer to this insruction in this link for viewing the new Go to UI screen.

  • Later click on Data link(below SQL) in the Go to UI screen and click on Load Data tab.

  • Later browse for the downloaded spacex file.

  • Once done select the schema andload the file.

!pip install sqlalchemy==1.3.9 !pip install ibm_db_sa !pip install ipython-sql
Collecting sqlalchemy==1.3.9 Downloading SQLAlchemy-1.3.9.tar.gz (6.0 MB) ---------------------------------------- 6.0/6.0 MB 5.4 MB/s eta 0:00:00 Preparing metadata (setup.py): started Preparing metadata (setup.py): finished with status 'done' Building wheels for collected packages: sqlalchemy Building wheel for sqlalchemy (setup.py): started Building wheel for sqlalchemy (setup.py): finished with status 'done' Created wheel for sqlalchemy: filename=SQLAlchemy-1.3.9-cp37-cp37m-win_amd64.whl size=1163225 sha256=a641583177db7a3e49b1b93e5f0030cf702a80465be94086dfefb7a273c4b277 Stored in directory: c:\users\dan\appdata\local\pip\cache\wheels\03\71\13\010faf12246f72dc76b4150e6e599d13a85b4435e06fb9e51f Successfully built sqlalchemy Installing collected packages: sqlalchemy Attempting uninstall: sqlalchemy Found existing installation: SQLAlchemy 1.3.13 Uninstalling SQLAlchemy-1.3.13: Successfully uninstalled SQLAlchemy-1.3.13 Successfully installed sqlalchemy-1.3.9
WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: You are using pip version 22.0.3; however, version 22.0.4 is available. You should consider upgrading via the 'C:\Users\Dan\anaconda3\python.exe -m pip install --upgrade pip' command.
Collecting ibm_db_sa Downloading ibm_db_sa-0.3.7.tar.gz (30 kB) Preparing metadata (setup.py): started Preparing metadata (setup.py): finished with status 'done' Requirement already satisfied: sqlalchemy>=0.7.3 in c:\users\dan\anaconda3\lib\site-packages (from ibm_db_sa) (1.3.9) Collecting ibm_db>=2.0.0 Downloading ibm_db-3.1.1.tar.gz (1.1 MB) ---------------------------------------- 1.1/1.1 MB 3.7 MB/s eta 0:00:00 Installing build dependencies: started Installing build dependencies: finished with status 'done' Getting requirements to build wheel: started Getting requirements to build wheel: still running... Getting requirements to build wheel: finished with status 'done' Installing backend dependencies: started Installing backend dependencies: finished with status 'done' Preparing metadata (pyproject.toml): started Preparing metadata (pyproject.toml): finished with status 'done' Building wheels for collected packages: ibm_db_sa, ibm_db Building wheel for ibm_db_sa (setup.py): started Building wheel for ibm_db_sa (setup.py): finished with status 'done' Created wheel for ibm_db_sa: filename=ibm_db_sa-0.3.7-py3-none-any.whl size=29309 sha256=f11e3ffd9eb03e4fe8c28017a8b4f88436b038b8e6c06f52b5321c4b46b8111c Stored in directory: c:\users\dan\appdata\local\pip\cache\wheels\4a\e9\e7\0ee334a6cb2f09ec45978e05837b66f59494b482ef38c7ae33 Building wheel for ibm_db (pyproject.toml): started Building wheel for ibm_db (pyproject.toml): finished with status 'done' Created wheel for ibm_db: filename=ibm_db-3.1.1-py3-none-any.whl size=29942326 sha256=ae3c4fadfcfb5e29c3f64e66a42bc636b77f2a89e2e1113ab2683d42ad334fec Stored in directory: c:\users\dan\appdata\local\pip\cache\wheels\03\a2\bf\27e4a4c9ca57eeea66526c74386df7dfd118d689ab794f1ab6 Successfully built ibm_db_sa ibm_db Installing collected packages: ibm_db, ibm_db_sa Successfully installed ibm_db-3.1.1 ibm_db_sa-0.3.7
WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: You are using pip version 22.0.3; however, version 22.0.4 is available. You should consider upgrading via the 'C:\Users\Dan\anaconda3\python.exe -m pip install --upgrade pip' command.
Collecting ipython-sql Downloading ipython_sql-0.4.0-py3-none-any.whl (19 kB) Collecting prettytable<1 Downloading prettytable-0.7.2.zip (28 kB) Preparing metadata (setup.py): started Preparing metadata (setup.py): finished with status 'done' Requirement already satisfied: ipython-genutils>=0.1.0 in c:\users\dan\anaconda3\lib\site-packages (from ipython-sql) (0.2.0) Collecting sqlparse Downloading sqlparse-0.4.2-py3-none-any.whl (42 kB) ---------------------------------------- 42.3/42.3 KB 2.0 MB/s eta 0:00:00 Requirement already satisfied: six in c:\users\dan\anaconda3\lib\site-packages (from ipython-sql) (1.14.0) Requirement already satisfied: sqlalchemy>=0.6.7 in c:\users\dan\anaconda3\lib\site-packages (from ipython-sql) (1.3.9) Requirement already satisfied: ipython>=1.0 in c:\users\dan\anaconda3\lib\site-packages (from ipython-sql) (7.12.0) Requirement already satisfied: setuptools>=18.5 in c:\users\dan\anaconda3\lib\site-packages (from ipython>=1.0->ipython-sql) (45.2.0.post20200210) Requirement already satisfied: decorator in c:\users\dan\anaconda3\lib\site-packages (from ipython>=1.0->ipython-sql) (4.4.1) Requirement already satisfied: colorama in c:\users\dan\anaconda3\lib\site-packages (from ipython>=1.0->ipython-sql) (0.4.3) Requirement already satisfied: traitlets>=4.2 in c:\users\dan\anaconda3\lib\site-packages (from ipython>=1.0->ipython-sql) (4.3.3) Requirement already satisfied: pickleshare in c:\users\dan\anaconda3\lib\site-packages (from ipython>=1.0->ipython-sql) (0.7.5) Requirement already satisfied: pygments in c:\users\dan\anaconda3\lib\site-packages (from ipython>=1.0->ipython-sql) (2.5.2) Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in c:\users\dan\anaconda3\lib\site-packages (from ipython>=1.0->ipython-sql) (3.0.3) Requirement already satisfied: backcall in c:\users\dan\anaconda3\lib\site-packages (from ipython>=1.0->ipython-sql) (0.1.0) Requirement already satisfied: jedi>=0.10 in c:\users\dan\anaconda3\lib\site-packages (from ipython>=1.0->ipython-sql) (0.14.1) Requirement already satisfied: parso>=0.5.0 in c:\users\dan\anaconda3\lib\site-packages (from jedi>=0.10->ipython>=1.0->ipython-sql) (0.5.2) Requirement already satisfied: wcwidth in c:\users\dan\anaconda3\lib\site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython>=1.0->ipython-sql) (0.1.8) Building wheels for collected packages: prettytable Building wheel for prettytable (setup.py): started Building wheel for prettytable (setup.py): finished with status 'done' Created wheel for prettytable: filename=prettytable-0.7.2-py3-none-any.whl size=13704 sha256=9cf149258413d7cda1ee63c72f5177ebba81ca83978c7f1d9b79d97fca284d74 Stored in directory: c:\users\dan\appdata\local\pip\cache\wheels\b2\7f\f6\f180315b584f00445045ff1699b550fa895d09471337ce21c6 Successfully built prettytable Installing collected packages: prettytable, sqlparse, ipython-sql Successfully installed ipython-sql-0.4.0 prettytable-0.7.2 sqlparse-0.4.2
WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: Ignoring invalid distribution -umpy (c:\users\dan\anaconda3\lib\site-packages) WARNING: You are using pip version 22.0.3; however, version 22.0.4 is available. You should consider upgrading via the 'C:\Users\Dan\anaconda3\python.exe -m pip install --upgrade pip' command.

Connect to the database

Let us first load the SQL extension and establish a connection with the database

%load_ext sql

DB2 magic in case of new UI service credentials.

  • Use the following format.

  • Add security=SSL at the end

%sql ibm_db_sa://my-username:my-password@my-hostname:my-port/my-db-name?security=SSL

Using Service Credentials-2

%sql ibm_db_sa://kfm42587:nem9nALHt4oPFozR@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb?security=SSL

Tasks

Now write and execute SQL queries to solve the assignment tasks.

Task 1

Display the names of the unique launch sites in the space mission
%sql SELECT UNIQUE(LAUNCH_SITE) FROM SPACEXTBL;
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Task 2

Display 5 records where launch sites begin with the string 'CCA'
%sql SELECT LAUNCH_SITE FROM SPACEXTBL WHERE LAUNCH_SITE LIKE 'CCA%' LIMIT 5;
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Task 3

Display the total payload mass carried by boosters launched by NASA (CRS)
%sql SELECT SUM(PAYLOAD_MASS__KG_) AS TOTAL_PAYLOAD_MASS FROM SPACEXTBL \ WHERE CUSTOMER = 'NASA (CRS)';
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Task 4

Display average payload mass carried by booster version F9 v1.1
%sql SELECT AVG(PAYLOAD_MASS__KG_) AS AVERAGE_PAYLOAD_MASS FROM SPACEXTBL \ WHERE BOOSTER_VERSION = 'F9 v1.1';
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Task 5

List the date when the first successful landing outcome in ground pad was acheived.

Hint:Use min function

%sql SELECT MIN(DATE) AS FIRST_SUCCESSFUL_GROUND_LANDING FROM SPACEXTBL \ WHERE LANDING__OUTCOME = 'Success (ground pad)';
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Task 6

List the names of the boosters which have success in drone ship and have payload mass greater than 4000 but less than 6000
%sql SELECT BOOSTER_VERSION FROM SPACEXTBL \ WHERE (LANDING__OUTCOME = 'Success (drone ship)') AND (PAYLOAD_MASS__KG_ BETWEEN 4000 AND 6000);
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Task 7

List the total number of successful and failure mission outcomes
%sql SELECT MISSION_OUTCOME, COUNT(MISSION_OUTCOME) AS TOTAL_NUMBER FROM SPACEXTBL GROUP BY MISSION_OUTCOME;
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Task 8

List the names of the booster_versions which have carried the maximum payload mass. Use a subquery
%sql SELECT DISTINCT(BOOSTER_VERSION) FROM SPACEXTBL \ WHERE PAYLOAD_MASS__KG_ = (SELECT MAX(PAYLOAD_MASS__KG_) FROM SPACEXTBL);
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Task 9

List the failed landing_outcomes in drone ship, their booster versions, and launch site names for in year 2015
%sql SELECT BOOSTER_VERSION, LAUNCH_SITE FROM SPACEXTBL \ WHERE (LANDING__OUTCOME = 'Failure (drone ship)') AND (EXTRACT(YEAR FROM DATE) = '2015');
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Task 10

Rank the count of landing outcomes (such as Failure (drone ship) or Success (ground pad)) between the date 2010-06-04 and 2017-03-20, in descending order
%sql SELECT LANDING__OUTCOME, COUNT(LANDING__OUTCOME) AS TOTAL_NUMBER FROM SPACEXTBL \ WHERE DATE BETWEEN '2010-06-04' AND '2017-03-20' \ GROUP BY LANDING__OUTCOME \ ORDER BY TOTAL_NUMBER DESC;
* ibm_db_sa://kfm42587:***@3883e7e4-18f5-4afe-be8c-fa31c41761d2.bs2io90l08kqb1od8lcg.databases.appdomain.cloud:31498/bludb Done.

Author(s)

Lakshmi Holla

Other Contributors

Rav Ahuja

Change log

DateVersionChanged byChange Description
2021-10-120.4Lakshmi HollaChanged markdown
2021-08-240.3Lakshmi HollaAdded library update
2021-07-090.2Lakshmi HollaChanges made in magic sql
2021-05-200.1Lakshmi HollaCreated Initial Version

© IBM Corporation 2021. All rights reserved.