Automatic differentiation using PyTorch
We show how to do automatic differentiation using PyTorch.
Example: binary logistic regression
Objective = NLL for binary logistic regression
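Concretely, writing $\mu_n = \sigma(\mathbf{w}^\top \mathbf{x}_n)$ for the predicted probability of example $n$, the objective is

$$\mathrm{NLL}(\mathbf{w}) = -\sum_{n=1}^{N} \left[ y_n \log \mu_n + (1 - y_n) \log (1 - \mu_n) \right].$$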
Computing gradients by hand
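Using the identity $\sigma'(a) = \sigma(a)(1 - \sigma(a))$, the gradient simplifies to

$$\nabla_{\mathbf{w}} \mathrm{NLL}(\mathbf{w}) = \sum_{n=1}^{N} (\mu_n - y_n)\,\mathbf{x}_n = \mathbf{X}^\top (\boldsymbol{\mu} - \mathbf{y}).$$

A minimal sketch of this manual computation, assuming numpy arrays `X` (N x D), `y` (N,) and `w` (D,):

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

def nll_grad(w, X, y):
    """Gradient of the binary logistic regression NLL, computed by hand."""
    mu = sigmoid(X @ w)       # predicted probabilities, shape (N,)
    return X.T @ (mu - y)     # shape (D,)
```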
PyTorch code
To compute the gradient using torch, we proceed as follows.
declare all the variables that you want to take derivatives with respect to, using the requires_grad=True argument
define the (scalar-output) objective function you want to differentiate in terms of these variables, and evaluate it at a point. This builds a computation graph and stores all the intermediate tensors.
call objective.backward() to trigger backpropagation (the chain rule) on this graph.
extract the gradients from the .grad field of each variable. (These will be torch tensors.)
See the example below.
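Here is a minimal sketch of these four steps applied to the logistic regression objective; the data `X`, `y` and the weights `w` are made up purely for illustration.

```python
import torch

torch.manual_seed(0)
N, D = 5, 3
X = torch.randn(N, D)                  # fake features
y = torch.randint(0, 2, (N,)).float()  # fake binary labels

# Step 1: declare the parameters we want gradients for.
w = torch.randn(D, requires_grad=True)

# Step 2: define and evaluate the scalar objective (the NLL).
logits = X @ w
nll = torch.nn.functional.binary_cross_entropy_with_logits(logits, y, reduction='sum')

# Step 3: run backpropagation on the computation graph.
nll.backward()

# Step 4: read off the gradient, and compare to the manual formula X^T (mu - y).
print(w.grad)
print(X.T @ (torch.sigmoid(X @ w.detach()) - y))
```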
Autograd on a DNN
Below we show how to define more complex deep neural networks and how to access their parameters. We can then call backward() on a scalar loss function and extract the gradient of each parameter. We base our presentation on http://d2l.ai/chapter_deep-learning-computation/parameters.html.
Sequential models
First we create a shallow MLP.
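A minimal sketch of such a model (the layer sizes are chosen arbitrarily for illustration):

```python
import torch
from torch import nn

# A shallow MLP: one hidden layer with a ReLU nonlinearity.
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

X = torch.rand(2, 4)   # a batch of 2 random inputs
print(net(X).shape)    # torch.Size([2, 1])
```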
Let's visualize the model and all the parameters in each layer.
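For example, continuing with the `net` defined above:

```python
# Print the module structure, then the name and shape of every parameter.
print(net)
for name, param in net.named_parameters():
    print(name, param.shape)
```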
Access a specific parameter.
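For instance, we can index into the Sequential to get a layer and then look at its parameters, or go through the state dict:

```python
print(net[2].bias)                          # the bias of the final linear layer (a Parameter)
print(net[2].bias.data)                     # the underlying tensor of values
print(net.state_dict()['2.weight'].shape)   # the same layer's weights via the state dict
```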
The gradient is not defined until we call backward.
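For instance:

```python
print(net[2].weight.grad is None)   # True: no backward pass has been run yet
```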
Nested models
Let us access element 0 of the top-level sequence, which contains blocks 0-3. Then we access element 1 of this, which is block 1. Finally we access element 0 of that block, which is its first linear layer.
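A sketch of such a nested model, following the d2l.ai example (the block sizes are illustrative):

```python
import torch
from torch import nn

def block1():
    # A small MLP block.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 4), nn.ReLU())

def block2():
    # Stack four copies of block1 inside one Sequential, named block 0-3.
    net = nn.Sequential()
    for i in range(4):
        net.add_module(f'block {i}', block1())
    return net

rgnet = nn.Sequential(block2(), nn.Linear(4, 1))
print(rgnet)

# Drill down: top-level element 0 (blocks 0-3), then block 1, then its first linear layer.
print(rgnet[0][1][0].bias.data)
```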
Backprop
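Continuing with `rgnet` from above, we can compute a scalar loss and backpropagate; the loss here is just an arbitrary sum of squares for illustration.

```python
X = torch.rand(2, 4)
loss = (rgnet(X) ** 2).sum()   # an arbitrary scalar loss
loss.backward()

# The gradients of all nested parameters are now populated.
print(rgnet[0][1][0].weight.grad.shape)
```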
Tied parameters
Sometimes parameters are reused in multiple layers, as we show below. In this case, the gradient contributions from each use are summed.
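A sketch, following the d2l.ai example: the same nn.Linear object is inserted twice, so both layers share one set of parameters, and the gradients from both uses accumulate in the same .grad tensor.

```python
import torch
from torch import nn

# A shared layer: the same module object is used in two places.
shared = nn.Linear(8, 8)
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(),
                    shared, nn.ReLU(),
                    shared, nn.ReLU(),
                    nn.Linear(8, 1))

X = torch.rand(2, 4)
net(X).sum().backward()

# The two layers are literally the same Parameter object ...
print(net[2].weight is net[4].weight)             # True
# ... so they also share the (accumulated) gradient.
print(net[2].weight.grad is net[4].weight.grad)   # True
```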
Other material
To compute the gradient of a function that does not return a scalar (e.g., the gradient of each output with respect to each input), you can do the following.
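One option is torch.autograd.functional.jacobian, which returns the full Jacobian of a vector-valued function; the function `f` below is just an illustration.

```python
import torch
from torch.autograd.functional import jacobian

def f(x):
    # A simple vector-valued function of a vector input.
    return x ** 2

x = torch.tensor([1.0, 2.0, 3.0])
J = jacobian(f, x)
print(J)   # diag(2 * x): gradient of each output with respect to each input
```

Alternatively, calling `y.backward(v)` with an explicit vector `v` of the same shape as `y` computes the vector-Jacobian product $v^\top J$ rather than the full Jacobian.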