Execute the cell below to import the Pandas and NumPy modules using their familiar aliases.
Write a statement in the cell below that creates and displays a DataFrame named stats with the structure shown below.
| | points | assists |
|---|---|---|
| Player | | |
| Schoonveld | 5 | 2 |
| Voskuil | 11 | 3 |
| Muller | 18 | 5 |
Player is not a data row, but is the index's name.
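One possible way to build such a frame is sketched below. It assumes the player names belong in the index and that the index name is set through pd.Index; other constructions (for example, set_index or rename_axis) would work as well.

```python
import pandas as pd

# Columns come from a dict; the index holds the player names and is
# itself named "Player", which is why "Player" is not a data row.
stats = pd.DataFrame(
    {"points": [5, 11, 18], "assists": [2, 3, 5]},
    index=pd.Index(["Schoonveld", "Voskuil", "Muller"], name="Player"),
)
print(stats)
```

In a notebook, ending the cell with the bare name stats instead of print(stats) gives the rendered table.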
Execute the cell below to load several data sets from the seaborn package.
Write a statement in the cell below that lists the names and types of the columns in the DataFrame named taxis.
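As a rough illustration of inspecting column names and types, the sketch below uses a small made-up frame named demo rather than the real taxis data; the values are hypothetical. The dtypes attribute is one option, and the info() method prints similar information along with non-null counts.

```python
import pandas as pd

# Hypothetical stand-in for taxis; the values are made up.
demo = pd.DataFrame({
    "passengers": [1, 2, 1],
    "fare": [7.0, 12.5, 30.0],
    "payment": ["credit card", "cash", "credit card"],
})

# dtypes is a Series: column names as the index, data types as the values.
print(demo.dtypes)
```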
Write a statement in the cell below that displays the rows in taxis with index values 1000 through 1005.
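Selecting by index value is label-based, so loc is the natural tool. The sketch below uses a made-up frame named demo with a default integer index; note that, unlike ordinary Python slices, a loc slice includes both endpoints.

```python
import pandas as pd

# Hypothetical frame with a default integer index 0..1999.
demo = pd.DataFrame({"fare": [float(i) for i in range(2000)]})

# Label-based slice: both 1000 and 1005 are included.
rows = demo.loc[1000:1005]
print(rows)
```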
Write a statement in the cell below that displays the pickup_zone, dropoff_zone, and fare for the 10th through 20th rows in taxis.
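Selecting rows by position is a job for iloc, which counts from zero and excludes the end of the slice. Under that reading, the 10th through 20th rows are positions 9 through 19. The sketch below assumes a made-up frame named demo; only the column names are taken from the question.

```python
import pandas as pd

# Hypothetical 30-row frame; the values are made up.
demo = pd.DataFrame({
    "pickup_zone": [f"zone_{i}" for i in range(30)],
    "dropoff_zone": [f"zone_{i + 1}" for i in range(30)],
    "fare": [5.0 + i for i in range(30)],
    "tip": [1.0] * 30,
})

# Positions 9..19 are the 10th through 20th rows (11 rows in total).
subset = demo.iloc[9:20][["pickup_zone", "dropoff_zone", "fare"]]
print(subset)
```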
Write a statement that displays the possible values in the payment column in taxis.
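The distinct values in a column can be listed with the unique() method of a Series. The sketch below runs it on a made-up frame named demo; the missing entry is included on purpose, since unique() reports a missing value as one of the results.

```python
import pandas as pd

# Hypothetical payment column, including one missing entry.
demo = pd.DataFrame({
    "payment": ["credit card", "cash", "credit card", None, "cash"],
})

# unique() returns each distinct value once, in order of first appearance.
values = demo["payment"].unique()
print(values)
```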
What percentage of trips were taken by one passenger? Write one or more statements in the cell below that display the answer to this question in the format below
Write a single statement to display the 2 most frequently occurring values in the pickup_borough column along with the number of trips originating in those boroughs.
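Because value_counts() sorts its result from most to least frequent by default, taking the first two entries gives the two most common values with their counts. The sketch below uses a made-up frame named demo.

```python
import pandas as pd

# Hypothetical pickup boroughs; the values are made up.
demo = pd.DataFrame({
    "pickup_borough": [
        "Manhattan", "Queens", "Manhattan", "Brooklyn",
        "Manhattan", "Queens", "Manhattan",
    ],
})

# Counts are sorted in descending order, so head(2) keeps the top two.
top_two = demo["pickup_borough"].value_counts().head(2)
print(top_two)
```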
Determine whether more trips had a total of $20 or less, or a total of $50 or more.
Print either the string "$20 or less" or the string "$50 or more".
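A comparison on a column produces a boolean Series, and summing it counts the True values. The sketch below applies that idea to a made-up frame named demo. The question does not say what to print on a tie, so treating a tie as "$50 or more" here is an arbitrary assumption.

```python
import pandas as pd

# Hypothetical trip totals; the values are made up.
demo = pd.DataFrame({"total": [12.0, 18.5, 20.0, 35.0, 50.0, 72.3, 8.0]})

# Summing a boolean mask counts the rows where the condition holds.
low = (demo["total"] <= 20).sum()
high = (demo["total"] >= 50).sum()

# Tie handling is an assumption; the question does not specify it.
answer = "$20 or less" if low > high else "$50 or more"
print(answer)
```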
Write a statement that loads the contents of a file named mpg.txt into a DataFrame named mpg. The | character is used to separate columns within the file. The columns should be named:
mpg
cylinders
displacement
horsepower
weight
acceleration
model_year
origin
name
The displacement and acceleration columns should not be imported into mpg. Display the first 10 rows in mpg to verify the import worked correctly.
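The sketch below shows how read_csv can handle a custom separator, supply column names, and skip unwanted columns in one call. It assumes the file has no header row, which the question does not state, and it reads from an in-memory string with two made-up lines instead of the real mpg.txt.

```python
import io
import pandas as pd

# Two made-up lines standing in for the contents of mpg.txt.
sample = io.StringIO(
    "18.0|8|307.0|130.0|3504|12.0|70|usa|car a\n"
    "15.0|8|350.0|165.0|3693|11.5|70|usa|car b\n"
)

columns = [
    "mpg", "cylinders", "displacement", "horsepower", "weight",
    "acceleration", "model_year", "origin", "name",
]
# Keep every column except the two that should not be imported.
wanted = [c for c in columns if c not in ("displacement", "acceleration")]

# Assumption: no header row, so names= labels the columns directly.
mpg = pd.read_csv(sample, sep="|", names=columns, usecols=wanted)
print(mpg.head(10))
```

With the real file, the first argument would be the path "mpg.txt". If the file does have a header row, header=0 would also be needed so that row is replaced and not read as data.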
Write code to add a new column named weight_tons to the DataFrame named vehicles. The values in the weight_tons column should be the values in the weight column divided by 2000.
Display the first 5 values in the weight_tons column after adding the column.
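Assigning to a new column label creates the column, and arithmetic on a Series is applied element by element. The sketch below uses a small made-up vehicles frame; only the column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for vehicles; the weights are made up.
vehicles = pd.DataFrame({"weight": [3504, 3693, 2000]})

# Element-wise division; the result becomes a new column.
vehicles["weight_tons"] = vehicles["weight"] / 2000
print(vehicles["weight_tons"].head(5))
```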
This question uses the DataFrame named taxis. Write a statement in the cell below that creates and displays a DataFrame named mean_by_borough containing the average values for the fare and distance traveled for each value of the pickup_borough column.
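Per-group averages follow the usual split, select, aggregate pattern: group by the key column, pick the columns of interest, then call mean(). The sketch below demonstrates this on a made-up frame named demo.

```python
import pandas as pd

# Hypothetical trips; the values are made up.
demo = pd.DataFrame({
    "pickup_borough": ["Manhattan", "Manhattan", "Queens", "Queens"],
    "fare": [10.0, 20.0, 30.0, 50.0],
    "distance": [1.0, 3.0, 8.0, 12.0],
})

# One row per borough; the double brackets keep the result a DataFrame.
mean_by_borough = demo.groupby("pickup_borough")[["fare", "distance"]].mean()
print(mean_by_borough)
```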
Determine the most common combination of pickup_borough and dropoff_borough for which pickup_borough and dropoff_borough are not the same.
For that combination only, determine the number of fares and the amount of revenue generated. Revenue is defined as the sum of the values in the total column.
The output should be: