Automatic Differentiation

Automatic differentiation is the foundation of how deep learning frameworks, and gradient-based machine learning algorithms in general, work. PyTorch implements automatic differentiation through its autograd module, which records the operations applied to tensors and computes the gradient of an output with respect to every tensor that requires one. Let's demonstrate this with an example.

Suppose we have a function:

$$ f(x) = 2x^2 + 3x $$

We know that the first-order derivative is:

$$ f'(x) = 4x + 3 $$

If we set $x=2$, then the derivative at $x=2$ is

$$ 4(2) + 3 = 11 $$

Now let's see this in PyTorch.

import torch

# setting the MPS (Apple Silicon GPU) device
device = torch.device('mps')

def quadratic(x):
    """f(x) = 2x^2 + 3x"""
    return 2 * x * x + 3 * x

x = torch.tensor([2.0], requires_grad=True, device=device)

# applying the function to x
y = quadratic(x)

# computing the gradient
y.backward()

# accessing the derivative of x
x.grad
tensor([11.], device='mps:0')

We can see that x.grad now holds the value 11, which is the derivative of $f$ at $x=2$.
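
As a quick sanity check, we can compare autograd's result against the hand-derived formula $f'(x) = 4x + 3$. The sketch below is a self-contained version of the example above; the helper name quadratic_grad is just for illustration, and the snippet falls back to the CPU when the MPS device is not available.

import torch

# fall back to the CPU when the MPS device is not available
device = torch.device('mps' if torch.backends.mps.is_available() else 'cpu')

def quadratic(x):
    """f(x) = 2x^2 + 3x"""
    return 2 * x * x + 3 * x

def quadratic_grad(x):
    """Analytic derivative f'(x) = 4x + 3"""
    return 4 * x + 3

x = torch.tensor([2.0], requires_grad=True, device=device)
quadratic(x).backward()

# autograd's gradient should match the analytic derivative
print(torch.allclose(x.grad, quadratic_grad(x.detach())))  # True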

Computing the Derivative for Multiple Values of x

The example above only computes the derivative at $x=2$. However, we may wish to plug in several values at once and obtain the derivative at each of them. If we simply pass a multi-element tensor through the code above, backward() raises an error, because it can only be called without arguments on a scalar (single-element) tensor. A simple workaround is to sum the elements of $y$ into a scalar and call backward() on that; since each element of $y$ depends only on the corresponding element of $x$, the gradient of the sum gives the element-wise derivatives.

See the demonstration below:

x = torch.tensor([2.0, 3.0, 2.1], requires_grad=True, device=device)

# applying the function to x
y = quadratic(x)

# computing the gradient
y.sum().backward()

# accessing the derivatives of x
x.grad
tensor([11.0000, 15.0000, 11.4000], device='mps:0')
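
Summing is not the only way to handle a multi-element output. An equivalent option, sketched below assuming the quadratic function and device defined earlier, is to pass a tensor of ones as the gradient argument to backward(); for an element-wise function like ours this produces the same per-element derivatives as y.sum().backward().

x = torch.tensor([2.0, 3.0, 2.1], requires_grad=True, device=device)
y = quadratic(x)

# a tensor of ones as the gradient argument is equivalent to y.sum().backward()
y.backward(torch.ones_like(y))

# the derivatives 4x + 3 at each value of x: 11.0, 15.0, 11.4
print(x.grad)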