GEP475GROUPINEEDANAP

Kernel: Python 3 (Anaconda)

Creating New "CO2 only" csv files from the original Netatmo csvs

  1. Making sure both files have the same index and the same column name for CO2

import numpy as np
import pandas as pd
year1 = pd.read_csv('../DataFiles/Raw/NetAtmo_2016.csv', parse_dates=True, index_col=1)
year1.head()
year1.describe()
year2 = pd.read_csv('NetAtmo_2017.csv', parse_dates=True, index_col=1)
year2.head()

As you can see above, the index of the 2016 csv (year1) does not have the same name as the index of the 2017 file (year2).

How to change:
year1.index.names = ['Time']
year1.head()
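The rename step can be tried on made-up data first. This is only a sketch: the column names `Timestamp` and `id` and all of the values below are assumptions, not the contents of the real Netatmo files.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two raw files; real column names may differ.
csv_2016 = io.StringIO(
    "id,Timestamp,CO2\n"
    "1,2016-02-19 13:27:00,718\n"
    "2,2016-02-19 13:31:00,337\n"
)
csv_2017 = io.StringIO(
    "id,Time,CO2\n"
    "1,2017-01-01 00:00:00,410\n"
    "2,2017-01-01 00:05:00,415\n"
)

a = pd.read_csv(csv_2016, parse_dates=True, index_col=1)
b = pd.read_csv(csv_2017, parse_dates=True, index_col=1)

# rename_axis returns a renamed copy; assigning to .index.names renames in place.
a = a.rename_axis('Time')

# Check that both frames now agree on the index name.
same_index_name = a.index.name == b.index.name
print(same_index_name)
```

Either form of the rename should give the same result; `rename_axis` is just convenient when chaining calls.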

Now that we have the same index, we'll go ahead and create the new files.

  • isolating the CO2 data

firstyear = year1['CO2'] #firstyear.head()
secondyear = year2['CO2'] #secondyear.head()

using to_csv

firstyear.to_csv('Netatmo2016CO2ONLY.csv')
column_names = ['Time','ppm']
year1CO2 = pd.read_csv('Netatmo2016CO2ONLY.csv', parse_dates=True, index_col=0, names=column_names)
year1CO2.head()
#year1CO2.describe()
secondyear.to_csv('Netatmo2017CO2ONLY.csv')
column_names = ['Time','ppm']
year2CO2 = pd.read_csv('Netatmo2017CO2ONLY.csv', parse_dates=True, index_col=0, names=column_names)
year2CO2.head()

Next, I want to create another csv file that already has the column names inside. (I don't know a better way.)
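One possible shortcut, sketched here on made-up values, is to name the series and its index before writing, so the header goes into the file on the first `to_csv` call. The data below is hypothetical, and an in-memory buffer stands in for a file path.

```python
import io
import pandas as pd

# Hypothetical CO2 series standing in for year1['CO2'].
idx = pd.to_datetime(['2016-02-19 13:27:00', '2016-02-19 13:31:00'])
co2 = pd.Series([718.0, 337.0], index=idx, name='CO2')

buf = io.StringIO()  # stands in for a path such as '2016CO2ONLY.csv'

# Name the values 'ppm' and the index 'Time', then write the header explicitly.
co2.rename('ppm').rename_axis('Time').to_csv(buf, header=True)

# Read it back the same way the notebook does.
buf.seek(0)
back = pd.read_csv(buf, parse_dates=True, index_col=0)
print(back.columns.tolist(), back.index.name)
```

If this works on the real data, the intermediate `Netatmo2016CO2ONLY.csv` and `Netatmo2017CO2ONLY.csv` files would not be needed.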

year1CO2.to_csv('2016CO2ONLY.csv')
year1 = pd.read_csv('2016CO2ONLY.csv', parse_dates=True, index_col=0)
#year1.head()
year2CO2.to_csv('2017CO2ONLY.csv')
year2 = pd.read_csv('2017CO2ONLY.csv', parse_dates=True, index_col=0)
year2.head()

Now we are left with the last two files, 2016CO2ONLY.csv and 2017CO2ONLY.csv, and need to combine them.

  • Using Concatenate

df1 = pd.read_csv('2016CO2ONLY.csv', parse_dates=True)
#df1.head()
df2 = pd.read_csv('2017CO2ONLY.csv', parse_dates=True)
#df2.head()
combined = pd.concat([df1,df2])

These cells are just to convince myself the data was combined correctly
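The same check can also be written as explicit comparisons instead of reading `head()` and `tail()` by eye. A minimal sketch with hypothetical frames standing in for `df1` and `df2`:

```python
import pandas as pd

# Hypothetical stand-ins for the two yearly files.
df1 = pd.DataFrame({'Time': ['2016-02-19 13:27:00', '2016-02-19 13:31:00'],
                    'ppm': [718.0, 337.0]})
df2 = pd.DataFrame({'Time': ['2017-01-01 00:00:00'],
                    'ppm': [410.0]})

combined = pd.concat([df1, df2])

# The combined frame should have exactly as many rows as the two inputs together.
rows_match = len(combined) == len(df1) + len(df2)

# The values should be df1's rows followed by df2's rows, in that order.
values_match = combined['ppm'].tolist() == df1['ppm'].tolist() + df2['ppm'].tolist()

print(rows_match, values_match)
```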

combined.describe()
combined.head()
#df1.describe()
#df2.describe()
#df1.head()
#combined.head()
#df2.tail()
#combined.tail()

Now, finally, we make the last csv file. It will include all the CO2 data from years 2016 and 2017

combined.to_csv('Combinedwithcolumnnanes.csv')
netatmodata = pd.read_csv('Combinedwithcolumnnanes.csv', parse_dates=True, index_col=1)
netatmodata.head()

Here, I'd like to make another csv without the extra column. (Help, Soto!)
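A likely cause of the extra column: df1 and df2 were read without `index_col`, so they carry a default integer index, and `to_csv` writes that index as an unnamed first column. Assuming that is what happened, passing `index=False` when writing should avoid it. A sketch on hypothetical data, with in-memory buffers standing in for files:

```python
import io
import pandas as pd

# Hypothetical frames read "without index_col": Time is an ordinary column.
df1 = pd.DataFrame({'Time': ['2016-02-19 13:27:00', '2016-02-19 13:31:00'],
                    'ppm': [718.0, 337.0]})
df2 = pd.DataFrame({'Time': ['2017-01-01 00:00:00'],
                    'ppm': [410.0]})
combined = pd.concat([df1, df2])

# Default behaviour: the integer index is written as an unnamed first column.
with_index = io.StringIO()
combined.to_csv(with_index)
header_with = with_index.getvalue().splitlines()[0]

# index=False leaves the integer index out of the file.
without_index = io.StringIO()
combined.to_csv(without_index, index=False)
header_without = without_index.getvalue().splitlines()[0]

print(repr(header_with))
print(repr(header_without))

# Reading back with index_col=0 now picks up Time directly.
without_index.seek(0)
clean = pd.read_csv(without_index, parse_dates=True, index_col=0)
```

Another option would be to read the two yearly files with `index_col=0` before concatenating, so that Time is already the index and no integer index exists to be written.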

both = netatmodata['ppm']
both.head()
Time
2016-02-19 13:26:00      NaN
2016-02-19 13:27:00    718.0
2016-02-19 13:27:00      NaN
2016-02-19 13:31:00    337.0
2016-02-19 13:36:00    332.0
Name: ppm, dtype: float64
both.to_csv('Netatmo2016_2017CO2ppm.csv')
netatmodata = pd.read_csv('Netatmo2016_2017CO2ppm.csv', parse_dates=True, index_col=0)
netatmodata.head()