Path: blob/master/8_sgd_vs_gd/gd_and_sgd.ipynb
Kernel: Python 3
Implementation of stochastic and batch gradient descent in Python
We will use a very simple home prices dataset to implement batch and stochastic gradient descent in Python. Batch gradient descent uses all training samples in the forward pass to calculate the cumulative error, and then adjusts the weights using derivatives. In stochastic GD, we randomly pick one training sample, perform a forward pass, compute the error, and immediately adjust the weights. So the key difference is that batch GD uses all training samples to adjust the weights, whereas stochastic GD uses one randomly picked training sample.
In [430]:
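The contents of this cell are a sketch: a minimal set of imports the rest of the notebook relies on (the 0-1 scaled outputs below suggest sklearn's MinMaxScaler):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
%matplotlib inline  # render plots inline in the notebook
```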
Load the dataset into a pandas DataFrame
In [431]:
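A sketch of the loading step; the file name and column names (area, bedrooms, price) are assumptions, though the two scaled feature columns and the price target shown below are consistent with them:

```python
# File name is an assumption; the dataset has two feature columns
# (area, bedrooms) and a target column (price)
df = pd.read_csv("homeprices_banglore.csv")
df.head()
```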
Out[431]:
Preprocessing/Scaling: Since our columns are on different scales, it is important to scale them
In [432]:
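A sketch of the feature scaling, assuming MinMaxScaler (the output below, with every column squashed into the 0-1 range, is consistent with min-max scaling):

```python
sx = MinMaxScaler()
# Scale both feature columns into the 0-1 range
scaled_X = sx.fit_transform(df.drop('price', axis='columns'))
scaled_X
```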
Out[432]:
array([[0.08827586, 0.25 ],
[0.62068966, 0.75 ],
[0.22068966, 0.5 ],
[0.24862069, 0.5 ],
[0.13793103, 0.25 ],
[0.12758621, 0.25 ],
[0.6662069 , 0.75 ],
[0.86206897, 0.75 ],
[0.17586207, 0.5 ],
[1. , 1. ],
[0.34482759, 0.5 ],
[0.68448276, 0.75 ],
[0.06896552, 0.25 ],
[0.10344828, 0.25 ],
[0.5 , 0.5 ],
[0.12931034, 0.25 ],
[0.13103448, 0.5 ],
[0.25517241, 0.5 ],
[0.67931034, 0.5 ],
[0. , 0. ]])
In [433]:
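The target gets its own scaler so the transform can be inverted later when predicting; a sketch:

```python
sy = MinMaxScaler()
# MinMaxScaler expects a 2D array, so reshape the 1D price column first
scaled_y = sy.fit_transform(df['price'].values.reshape(-1, 1))
scaled_y
```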
Out[433]:
array([[0.05237037],
[0.65185185],
[0.22222222],
[0.31851852],
[0.14074074],
[0.04444444],
[0.76296296],
[0.91111111],
[0.13333333],
[1. ],
[0.37037037],
[0.8 ],
[0.04444444],
[0.05925926],
[0.51111111],
[0.07407407],
[0.11851852],
[0.20740741],
[0.51851852],
[0. ]])
We should convert the target column (i.e. price) into a one dimensional array. It became 2D due to the scaling we did above, so now we change it back to 1D
In [434]:
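A sketch of the flattening step:

```python
# Flatten the (20, 1) array produced by the scaler back to shape (20,)
scaled_y = scaled_y.reshape(scaled_y.shape[0],)
scaled_y
```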
Out[434]:
array([0.05237037, 0.65185185, 0.22222222, 0.31851852, 0.14074074,
0.04444444, 0.76296296, 0.91111111, 0.13333333, 1. ,
0.37037037, 0.8 , 0.04444444, 0.05925926, 0.51111111,
0.07407407, 0.11851852, 0.20740741, 0.51851852, 0. ])
Gradient descent allows you to find the weights (w1, w2) and bias in the following linear equation for housing price prediction:

price = w1 * area + w2 * bedrooms + bias
Now is the time to implement batch gradient descent.
In [443]:
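A sketch of a batch gradient descent implementation consistent with the output below (weights, bias, final cost); the epoch count and learning rate are assumptions. Each epoch does a forward pass over all samples, computes the MSE cost, and updates the weights and bias from the full-batch gradients:

```python
def batch_gradient_descent(X, y_true, epochs, learning_rate=0.01):
    number_of_features = X.shape[1]
    w = np.ones(shape=(number_of_features))  # start with all weights = 1
    b = 0
    total_samples = X.shape[0]
    cost_list = []
    epoch_list = []

    for i in range(epochs):
        # Forward pass on ALL training samples at once
        y_predicted = np.dot(w, X.T) + b

        # Gradients of the MSE cost w.r.t. weights and bias
        w_grad = -(2 / total_samples) * (X.T.dot(y_true - y_predicted))
        b_grad = -(2 / total_samples) * np.sum(y_true - y_predicted)

        w = w - learning_rate * w_grad
        b = b - learning_rate * b_grad

        cost = np.mean(np.square(y_true - y_predicted))  # MSE

        if i % 10 == 0:  # sample the cost curve for plotting later
            cost_list.append(cost)
            epoch_list.append(i)

    return w, b, cost, cost_list, epoch_list

w, b, cost, cost_list, epoch_list = batch_gradient_descent(scaled_X, scaled_y, 500)
w, b, cost
```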
Out[443]:
(array([0.70712464, 0.67456527]), -0.23034857438407427, 0.0068641890429808105)
Check the price equation above. In that equation we were trying to find the values of w1, w2 and bias. Here is what we got for each of them:

w1 = 0.70712464, w2 = 0.67456527, bias = -0.23034857438407427
Now plot the epoch vs. cost graph to see how the cost reduces as the number of epochs increases
In [436]:
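A sketch of the plotting cell, using the cost/epoch lists collected during training:

```python
plt.xlabel("epoch")
plt.ylabel("cost")
plt.plot(epoch_list, cost_list)
```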
Out[436]:
[plot: cost vs. epoch for batch gradient descent]
Let's do some predictions now.
In [437]:
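A sketch of a prediction helper: scale the raw inputs with sx, apply the learned weights and bias in scaled space, then invert the target scaling with sy to get a price in original units. The example inputs here (and in the next two cells, which repeat the call) are hypothetical:

```python
def predict(area, bedrooms, w, b):
    scaled_X = sx.transform([[area, bedrooms]])[0]
    # price equation in scaled space: w1 * area + w2 * bedrooms + bias
    scaled_price = w[0] * scaled_X[0] + w[1] * scaled_X[1] + b
    # Invert the target scaling to get the price back in original units
    return sy.inverse_transform([[scaled_price]])[0][0]

predict(2600, 4, w, b)  # hypothetical input: 2600 sqft, 4 bedrooms
```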
Out[437]:
124.97561189905038
In [438]:
Out[438]:
34.60197457980031
In [439]:
Out[439]:
70.50604143757819
(2) Stochastic Gradient Descent Implementation
Stochastic GD uses a single randomly picked training sample to calculate the error, and using this error we backpropagate to adjust the weights
In [440]:
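Stochastic GD needs a random sample index on every iteration; this cell presumably demonstrates picking one, a sketch:

```python
import random
random.randint(0, len(scaled_X) - 1)  # random index into the 20 training samples
```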
Out[440]:
6
In [453]:
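A sketch of a stochastic gradient descent implementation consistent with the output below; the epoch count and learning rate are assumptions. Each iteration picks one random sample, computes its error, and updates the weights immediately:

```python
def stochastic_gradient_descent(X, y_true, epochs, learning_rate=0.01):
    number_of_features = X.shape[1]
    w = np.ones(shape=(number_of_features))
    b = 0
    total_samples = X.shape[0]
    cost_list = []
    epoch_list = []

    for i in range(epochs):
        # Pick ONE random training sample per iteration
        random_index = random.randint(0, total_samples - 1)
        sample_x = X[random_index]
        sample_y = y_true[random_index]

        # Forward pass on just that one sample
        y_predicted = np.dot(w, sample_x.T) + b

        # Gradients from the single sample (the 1/total_samples factor
        # effectively just scales the step size down)
        w_grad = -(2 / total_samples) * sample_x * (sample_y - y_predicted)
        b_grad = -(2 / total_samples) * (sample_y - y_predicted)

        w = w - learning_rate * w_grad
        b = b - learning_rate * b_grad

        cost = np.square(sample_y - y_predicted)  # squared error of this sample

        if i % 100 == 0:  # sample the cost curve for plotting later
            cost_list.append(cost)
            epoch_list.append(i)

    return w, b, cost, cost_list, epoch_list

w_sgd, b_sgd, cost_sgd, cost_list_sgd, epoch_list_sgd = stochastic_gradient_descent(
    scaled_X, scaled_y, 10000)
w_sgd, b_sgd, cost_sgd
```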
Out[453]:
(array([0.70486189, 0.67274269]), -0.2290363484141679, 0.00023657698876079387)
Compare this with the weights and bias that we got using batch gradient descent. They are both quite similar.
In [454]:
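This cell presumably just redisplays the batch GD results for comparison:

```python
w, b  # weights and bias from batch gradient descent above
```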
Out[454]:
(array([0.70712464, 0.67456527]), -0.23034857438407427)
In [455]:
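A sketch of the SGD cost curve plot:

```python
plt.xlabel("epoch")
plt.ylabel("cost")
plt.plot(epoch_list_sgd, cost_list_sgd)
```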
Out[455]:
[plot: cost vs. epoch for stochastic gradient descent]
In [456]:
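The same predict helper, now with the SGD weights; the inputs are again hypothetical (the next two cells repeat the call with different inputs):

```python
predict(2600, 4, w_sgd, b_sgd)  # hypothetical input: 2600 sqft, 4 bedrooms
```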
Out[456]:
128.25785506303845
In [457]:
Out[457]:
30.347665843402435
In [459]:
Out[459]:
69.45899958796899