GitHub Repository: YStrano/DataScience_GA
Path: blob/master/lessons/lesson_12/python-notebooks-data-wrangling/Wrangling -- Intro to pandas.DataFrames.ipynb
Kernel: Python 3
# same ol boilerplate, except no csv.DictReader
from os import makedirs
from os.path import join
%matplotlib inline
import matplotlib.pyplot as plt
# but now we got some pandas
import pandas as pd
DATA_FILENAME = join('data', 'climate', 'extracted', 'nasa-ghgases-co2-mean.csv')
df = pd.read_csv(DATA_FILENAME)
df.head(10)

Subsetting rows with the iloc attribute

iloc slices a range of rows of a dataframe by integer position; the end of the range is exclusive. Note that the original index labels are preserved:

df.iloc[5:10]
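
To see what "preserved" means without the CSV at hand, here is a minimal sketch on a toy dataframe. The name toy and its values are made up for illustration; only the column names come from the data above. The reset_index call at the end is one option if you would rather renumber the slice from 0.

```python
import pandas as pd

# Toy stand-in for the CO2 table; the values are made up for illustration.
toy = pd.DataFrame({
    'year': list(range(2000, 2012)),
    'co2_ppm_mean': [370.0 + i for i in range(12)],
})

# Positions 5 through 9; the end position (10) is not included.
subset = toy.iloc[5:10]
print(subset.index.tolist())      # index labels carried over from toy

# Optional: throw away the old labels and count from 0 again.
renumbered = subset.reset_index(drop=True)
print(renumbered.index.tolist())
```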

Select columns

Just pass in a list of column names; the result keeps the order you list them in

df[['co2_ppm_mean', 'year']].head()
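
The double brackets matter here: the inner pair is a plain Python list. A small sketch of the difference, again on a made-up toy dataframe (hypothetical values, same column names):

```python
import pandas as pd

# Toy stand-in for the CO2 table; the values are made up for illustration.
toy = pd.DataFrame({
    'year': [2000, 2001, 2002],
    'co2_ppm_mean': [370.0, 371.5, 373.5],
})

# A list of names gives back a DataFrame, with columns in the listed order.
both = toy[['co2_ppm_mean', 'year']]
print(type(both).__name__, list(both.columns))

# A single name (no inner list) gives back a Series instead.
single = toy['co2_ppm_mean']
print(type(single).__name__)
```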

Adding a derived column

Add a decade column

# round 'year' to the nearest multiple of 10 (decimals=-1) and keep just that column
df['decade'] = df.round({'year': -1})['year']
df.head()
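
One thing to watch: rounding to the nearest ten sends a year like 1996 to 2000. If you want the decade a year actually falls in (1996 in the 1990s), floor division is a possible alternative. This is a sketch on made-up years, not what the cell above does:

```python
import pandas as pd

# Hypothetical years chosen to show the difference.
toy = pd.DataFrame({'year': [1984, 1996, 2003, 2010]})

# Integer-divide by 10 to drop the last digit, then scale back up.
toy['decade'] = (toy['year'] // 10) * 10
print(toy)

# For comparison, plain Python rounding to the nearest ten:
print(round(1996, -1))
```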

Add the year-over-year change in co2_ppm_mean
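
One way to get that column is Series.diff(), which subtracts the previous row's value from each row. A minimal sketch on made-up numbers; the column name co2_ppm_change is my own choice, and the sort is there because diff() only makes sense if the rows are in year order:

```python
import pandas as pd

# Toy stand-in for the CO2 table; the values are made up for illustration.
toy = pd.DataFrame({
    'year': [2000, 2001, 2002, 2003],
    'co2_ppm_mean': [370.0, 371.5, 373.5, 376.0],
})

# Make sure rows are in chronological order before differencing.
toy = toy.sort_values('year')

# Each row minus the row before it; the first row has nothing to compare to, so it is NaN.
toy['co2_ppm_change'] = toy['co2_ppm_mean'].diff()
print(toy)
```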

Grouping