YStrano
GitHub Repository: YStrano/DataScience_GA
Path: blob/master/lessons/lesson_17/Oil_residuals.ipynb
Kernel: Python 3

Analyzing Oil Markets using Residuals

Today we will analyze oil markets using the US dollar, industrial production (as a proxy for demand), and the VIX.

%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

ip = pd.read_csv('data/oil/ip_oecd.csv')  # industrial production is the predictor of oil demand
vix = pd.read_csv('data/oil/vix.csv')  # financial indicator for uncertainty or fear on the market
oil_usd = pd.read_csv('data/oil/oil_usd.csv')

Notice that these datasets come at different frequencies.

## quarterly
ip.head()

## monthly
oil_usd.head()

## Daily
vix.head()

Define a little function to convert the "DATE" column to datetime, resample to quarterly, and make it the index

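def make_date_index(df):
    # Parse the "DATE" strings into timestamps
    df['DATE'] = pd.to_datetime(df['DATE'])
    # Use the dates as the index (this modifies the frame passed in)
    df.set_index('DATE', inplace=True)
    # Average within each quarter; newer pandas versions spell this alias 'QE'
    df = df.resample('Q').mean()
    return df

oil_usd = make_date_index(oil_usd)
vix = make_date_index(vix)
ip = make_date_index(ip)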
ip.tail()
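To see what the quarterly conversion does, here is a minimal sketch on made-up monthly data. The column name `oil` and the values are hypothetical, not taken from the CSV files. The sketch groups by `to_period("Q")` instead of calling `resample('Q')`, because the `'Q'` resample alias is deprecated in recent pandas releases; the quarterly averages are the same, but the result is indexed by quarter periods rather than quarter-end dates. It also copies the input instead of modifying it in place.

```python
import pandas as pd

# Hypothetical monthly data standing in for a file like oil_usd.csv
monthly = pd.DataFrame({
    "DATE": ["2000-01-01", "2000-02-01", "2000-03-01",
             "2000-04-01", "2000-05-01", "2000-06-01"],
    "oil": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})

def to_quarterly(df):
    """Return a copy indexed by quarter, averaging the rows in each quarter."""
    out = df.copy()  # leave the caller's frame untouched
    out["DATE"] = pd.to_datetime(out["DATE"])
    out = out.set_index("DATE")
    # Group rows by the calendar quarter they fall in, then average
    return out.groupby(out.index.to_period("Q")).mean()

quarterly = to_quarterly(monthly)
print(quarterly)
```

Six monthly rows collapse into two quarterly rows, each holding the mean of its three months.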
## Merge into our final data frame
df = oil_usd.join(ip)
df = df.join(vix)
df.head()
df.tail()
df.plot()
[Figure: line plot of all columns in the merged quarterly data frame]

There is an issue with scale: the series sit at very different levels, which makes the data hard to read on a single graph.

## Let's drop nulls
df.dropna(inplace=True)

## Let's take logs
df = np.log(df)  # element-wise log, same result as applying np.log row by row
df.plot()
[Figure: line plot of the log-transformed series]
## Let's take differences (now we are in log differences!)
df = df.diff()  # same result as applying x.diff() to each column
df.plot()
[Figure: line plot of the log-differenced series]
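As a small illustration of what the log-difference step produces, the sketch below applies it to a made-up three-row frame (the values are hypothetical). Each entry is the log of the ratio between a value and the one before it, and the first row has no predecessor, so it comes out as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical price levels, not the notebook's data
toy = pd.DataFrame({"oil": [100.0, 110.0, 99.0]})

log_diff = np.log(toy).diff()  # log(x_t) - log(x_{t-1}) for every column
print(log_diff)

# The first row is NaN because there is no earlier observation to subtract
print(log_diff["oil"].isna().tolist())
```

The missing first row is worth keeping in mind before passing the differenced data to a model.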
df.corr()
import statsmodels.formula.api as smf
lm = smf.ols('oil ~ usd + ip_oecd + vix', data=df).fit()
lm.summary()

Let's look at the residuals

residuals = lm.resid
residuals.plot()
[Figure: line plot of the regression residuals over time]
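A residual is simply the actual value minus the value the regression predicts. The sketch below shows this with plain NumPy least squares on synthetic data; it is not the notebook's model, and the coefficient 1.5 and the noise level are arbitrary assumptions. With an intercept in the model, the residuals average to zero and are uncorrelated with the regressor by construction, so any structure left in them reflects what the regressors fail to explain.

```python
import numpy as np

rng = np.random.default_rng(0)  # synthetic data, not the oil dataset
n = 200
x = rng.normal(size=n)
y = 1.5 * x + rng.normal(scale=0.5, size=n)

# Design matrix: a column of ones (intercept) plus the regressor
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ beta
residuals = y - fitted  # actual minus predicted

print(abs(residuals.mean()) < 1e-10)  # residuals average to zero
print(abs(residuals @ x) < 1e-8)      # and are orthogonal to the regressor
```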
from statsmodels.graphics.tsaplots import plot_acf
plot_acf(residuals)
[Figure: autocorrelation plot of the regression residuals]
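The autocorrelation plot shows, for each lag, how strongly the series correlates with a lagged copy of itself. As a rough sketch of the quantity behind each bar, the helper below (a hypothetical name, `autocorr`) computes the standard sample autocorrelation, which I believe is the estimator statsmodels uses by default. It is run on synthetic series rather than the regression residuals.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag (lag >= 0)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()  # deviations from the overall mean
    if lag == 0:
        return 1.0
    # Products of deviations `lag` steps apart, scaled by total variation
    return float(np.sum(d[lag:] * d[:-lag]) / np.sum(d * d))

rng = np.random.default_rng(1)
noise = rng.normal(size=500)  # independent draws: little autocorrelation
walk = np.cumsum(noise)       # random walk: strong autocorrelation

print(autocorr(noise, 1))
print(autocorr(walk, 1))
```

Residuals that behave like the first series suggest the model has captured the time structure; residuals that behave like the second suggest something persistent has been left out.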

Key events in the oil market

  • March 1999: OPEC Production Cut

  • 2010-2011: Arab Spring / Libyan civil war

  • 2015 - Present: reduction in oil investment due to shale boom

residuals.rolling(4).mean().plot()
[Figure: four-quarter rolling mean of the regression residuals]
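With quarterly data, a window of 4 averages each residual with the three before it, i.e. over one year. A minimal sketch on a made-up series shows the mechanics, including the leading NaN values where fewer than four observations are available:

```python
import pandas as pd

# Hypothetical values standing in for quarterly residuals
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

smoothed = s.rolling(4).mean()
print(smoothed)

# The first three entries are NaN: a full window needs four observations
print(smoothed.isna().sum())
```

The smoothing removes quarter-to-quarter noise, so sustained stretches above or below zero stand out more clearly.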

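This shows that a log difference is very close to a percentage change: moving from 110 to 100 is a change of about -9.1%, and the corresponding log difference is about -0.095. Reversing the direction of the move flips the sign but keeps the same magnitude, which is why the equality check below returns False.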

np.log(100)-np.log(110)
-0.09531017980432477
(np.log(100)-np.log(110)) == (np.log(110)-np.log(100))
False
print((np.log(100)-np.log(110)), (np.log(110)-np.log(100)))
-0.09531017980432477 0.09531017980432477
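To make the comparison concrete, the sketch below puts the two measures side by side for a few made-up price moves. The helper names `pct_change` and `log_diff` are hypothetical. It also checks the symmetry seen above: a move up followed by the reverse move down gives log differences that cancel exactly, while the two percentage changes do not.

```python
import numpy as np

def pct_change(old, new):
    """Simple percentage change, as a fraction of the old value."""
    return (new - old) / old

def log_diff(old, new):
    """Difference of natural logs, i.e. log(new / old)."""
    return np.log(new) - np.log(old)

# The two measures agree closely for small moves and drift apart for large ones
for old, new in [(100, 101), (100, 110), (100, 200)]:
    print(old, new, pct_change(old, new), log_diff(old, new))

# Round trip 100 -> 110 -> 100
round_trip_log = log_diff(100, 110) + log_diff(110, 100)
round_trip_pct = pct_change(100, 110) + pct_change(110, 100)
print(round_trip_log, round_trip_pct)
```

This additivity is one reason log differences are convenient for time series: changes over consecutive periods can simply be summed.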