Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place. Commercial Alternative to JupyterHub.
Real-time collaboration for Jupyter Notebooks, Linux Terminals, LaTeX, VS Code, R IDE, and more,
all in one place. Commercial Alternative to JupyterHub.
Path: blob/main/10. Applied Data Science Capstone/03. Exploratory Data Analysis/03. Exploratory Data Analysis - SQL.ipynb
Views: 4598
Assignment: SQL Notebook for Peer Assignment
Estimated time needed: 60 minutes.
Introduction
Using this Python notebook you will:
Understand the Spacex DataSet
Load the dataset into the corresponding table in a Db2 database
Execute SQL queries to answer assignment questions
Overview of the DataSet
SpaceX has gained worldwide attention for a series of historic milestones.
It is the only private company ever to return a spacecraft from low-earth orbit, which it first accomplished in December 2010. SpaceX advertises Falcon 9 rocket launches on its website with a cost of 62 million dollars wheras other providers cost upward of 165 million dollars each, much of the savings is because Space X can reuse the first stage.
Therefore if we can determine if the first stage will land, we can determine the cost of a launch.
This information can be used if an alternate company wants to bid against SpaceX for a rocket launch.
This dataset includes a record for each payload carried during a SpaceX mission into outer space.
Download the datasets
This assignment requires you to load the spacex dataset.
In many cases the dataset to be analyzed is available as a .CSV (comma separated values) file, perhaps on the internet. Click on the link below to download and save the dataset (.CSV file):
Store the dataset in database table
it is highly recommended to manually load the table using the database console LOAD tool in DB2.
Now open the Db2 console, open the LOAD tool, Select / Drag the .CSV file for the dataset, Next create a New Table, and then follow the steps on-screen instructions to load the data. Name the new table as follows:
SPACEXDATASET
Follow these steps while using old DB2 UI which is having Open Console Screen
Note:While loading Spacex dataset, ensure that detect datatypes is disabled. Later click on the pencil icon(edit option).
Change the Date Format by manually typing DD-MM-YYYY and timestamp format as DD-MM-YYYY HH:MM:SS.
Here you should place the cursor at Date field and manually type as DD-MM-YYYY.
Change the PAYLOAD_MASS__KG_ datatype to INTEGER.
Changes to be considered when having DB2 instance with the new UI having Go to UI screen
Refer to this insruction in this link for viewing the new Go to UI screen.
Later click on Data link(below SQL) in the Go to UI screen and click on Load Data tab.
Later browse for the downloaded spacex file.
Once done select the schema andload the file.
Connect to the database
Let us first load the SQL extension and establish a connection with the database
DB2 magic in case of new UI service credentials.
Use the following format.
Add security=SSL at the end
%sql ibm_db_sa://my-username:my-password@my-hostname:my-port/my-db-name?security=SSL
Using Service Credentials-2
Tasks
Now write and execute SQL queries to solve the assignment tasks.
Task 1
Display the names of the unique launch sites in the space mission
Task 2
Display 5 records where launch sites begin with the string 'CCA'
Task 3
Display the total payload mass carried by boosters launched by NASA (CRS)
Task 4
Display average payload mass carried by booster version F9 v1.1
Task 5
List the date when the first successful landing outcome in ground pad was acheived.
Hint:Use min function
Task 6
List the names of the boosters which have success in drone ship and have payload mass greater than 4000 but less than 6000
Task 7
List the total number of successful and failure mission outcomes
Task 8
List the names of the booster_versions which have carried the maximum payload mass. Use a subquery
Task 9
List the failed landing_outcomes in drone ship, their booster versions, and launch site names for in year 2015
Task 10
Rank the count of landing outcomes (such as Failure (drone ship) or Success (ground pad)) between the date 2010-06-04 and 2017-03-20, in descending order
Author(s)
Lakshmi Holla
Other Contributors
Rav Ahuja
Change log
Date | Version | Changed by | Change Description |
---|---|---|---|
2021-10-12 | 0.4 | Lakshmi Holla | Changed markdown |
2021-08-24 | 0.3 | Lakshmi Holla | Added library update |
2021-07-09 | 0.2 | Lakshmi Holla | Changes made in magic sql |
2021-05-20 | 0.1 | Lakshmi Holla | Created Initial Version |