Gradient

In vector calculus, the gradient of a scalar-valued differentiable function $f$ of several variables is the vector field $\nabla f$ (we say "del f") whose value at a point $p$ is the vector whose components are the partial derivatives of $f$ at $p$.

$$ \nabla f(p) = \begin{bmatrix} \frac {\partial f} {\partial x_1}(p) \\ \vdots \\ \frac {\partial f} {\partial x_n}(p) \end{bmatrix} $$

The gradient is therefore a vector-valued function: it takes a point (a vector of inputs) and returns a vector. For example, the gradient of a two-variable function $f(x, y)$ can be derived as follows:

$$f(x, y) = x^2\sin y$$

$$\frac {\partial f} {\partial x} = 2x \sin y$$

$$\frac {\partial f} {\partial y} = x^2 \cos y$$

$$ \nabla f(x, y) = \begin{bmatrix} 2x \sin y \\ x^2 \cos y \end{bmatrix} $$
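As a quick sanity check, the partial derivatives above can be compared against a finite-difference estimate. This is only a sketch: the helper name `numerical_grad`, the step size `h`, and the sample point `(1.5, 0.7)` are arbitrary choices for illustration, not part of the derivation.

```python
import math

def f(x, y):
    return x**2 * math.sin(y)

def grad_f(x, y):
    # Analytic gradient: the two partial derivatives stacked into a vector.
    return (2 * x * math.sin(y), x**2 * math.cos(y))

def numerical_grad(func, x, y, h=1e-6):
    # Central differences: perturb one variable at a time, hold the other fixed.
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (df_dx, df_dy)

analytic = grad_f(1.5, 0.7)
numeric = numerical_grad(f, 1.5, 0.7)
print(analytic)
print(numeric)
```

The two printed vectors should agree to several decimal places; a large mismatch would point to an error in one of the partial derivatives.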

Geometrically, the gradient points in the direction of steepest ascent. The length of the gradient vector tells us the rate of change in the direction of steepest ascent.
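The steepest-ascent claim can be checked numerically. The sketch below reuses $f(x, y) = x^2 \sin y$, samples unit directions around an arbitrarily chosen point, and measures the rate of change of $f$ along each one with a finite difference. The sampling resolution, the step size `h`, and the helper name `rate_along` are assumptions made for illustration.

```python
import math

def f(x, y):
    return x**2 * math.sin(y)

def grad_f(x, y):
    return (2 * x * math.sin(y), x**2 * math.cos(y))

def rate_along(x, y, angle, h=1e-6):
    # Central-difference rate of change of f along the unit vector
    # (cos angle, sin angle).
    ux, uy = math.cos(angle), math.sin(angle)
    return (f(x + h * ux, y + h * uy) - f(x - h * ux, y - h * uy)) / (2 * h)

x0, y0 = 1.5, 0.7
n = 3600  # one sample every 0.1 degree
angles = [-math.pi + 2 * math.pi * k / n for k in range(n)]

# Direction with the largest measured rate of change, and that rate.
best_angle = max(angles, key=lambda t: rate_along(x0, y0, t))
best_rate = rate_along(x0, y0, best_angle)

# Direction and length of the analytic gradient at the same point.
gx, gy = grad_f(x0, y0)
grad_angle = math.atan2(gy, gx)
grad_norm = math.hypot(gx, gy)

print(best_angle, grad_angle)
print(best_rate, grad_norm)
```

Up to the sampling resolution, the best direction should coincide with the direction of the gradient, and the best rate with the gradient's length.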

[Figure: Gradient 3D]

In terms of vectors, let's say we have a vector $\vec w$:

$$ \vec w = \begin{bmatrix} a\\ b \end{bmatrix} $$

and a function $f$:

$$f(x, y) = x^2 y$$

We want to find how a small step in the direction of $\vec w$ changes the output of the function $f$. In other words, we want the directional derivative of $f$ along the vector $\vec w$:

$$ \nabla_{\vec w} f = a \frac {\partial f} {\partial x} + b \frac {\partial f} {\partial y} = \begin{bmatrix} a\\ b \end{bmatrix} \cdot \begin{bmatrix} {\partial f} / {\partial x}\\ {\partial f} / {\partial y} \end{bmatrix} = \vec w \cdot \nabla f $$
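For $f(x, y) = x^2 y$ the partial derivatives are $2xy$ and $x^2$, so the formula gives $\nabla_{\vec w} f = 2axy + bx^2$. The sketch below evaluates that dot product and compares it with a finite-difference estimate. The point $(2, 3)$, the vector $\vec w = (1, -2)$, and the helper names are arbitrary choices for illustration, and $\vec w$ is deliberately left unnormalized, as in the formula above.

```python
def f(x, y):
    return x**2 * y

def grad_f(x, y):
    # Partial derivatives of x^2 * y with respect to x and y.
    return (2 * x * y, x**2)

def directional_derivative(x, y, w):
    # Dot product of w with the gradient, without normalizing w.
    a, b = w
    gx, gy = grad_f(x, y)
    return a * gx + b * gy

def finite_difference(x, y, w, h=1e-6):
    # Central difference of f along the straight line through (x, y) in direction w.
    a, b = w
    return (f(x + h * a, y + h * b) - f(x - h * a, y - h * b)) / (2 * h)

point = (2.0, 3.0)
w = (1.0, -2.0)

exact = directional_derivative(*point, w)  # 1 * 12 + (-2) * 4 = 4.0
approx = finite_difference(*point, w)
print(exact, approx)
```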

The formal definition of the partial derivative of a two-variable function $f$ with respect to $x$ is:

$$ \frac {\partial f} {\partial x} (a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} $$

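In vector notation, with the point written as $\vec a$ and $\vec i$ the unit vector along the $x$-axis, this reads:

$$ \frac {\partial f} {\partial x} (\vec a) = \lim_{h \to 0} \frac{f(\vec a + h \vec i) - f(\vec a)}{h} $$

Replacing $\vec i$ with an arbitrary direction $\vec v$ gives the directional derivative:

$$ \frac {\partial f} {\partial \vec v} (\vec a) = \nabla_{\vec v} f(\vec a) = \lim_{h \to 0} \frac{f(\vec a + h \vec v) - f(\vec a)}{h}= \nabla f(\vec a) \cdot \vec v $$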

where $\| \vec v \| = 1$. If $\vec v$ is not a unit vector, we divide by its length so that the result is still the rate of change per unit distance:

$$ \nabla_{\vec v} f(\vec a) = \lim_{h \to 0} \frac{f(\vec a + h \vec v) - f(\vec a)}{h \cdot \| \vec v \|}= \frac {\nabla f(\vec a) \cdot \vec v} {\| \vec v \|} $$
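To see the effect of the normalization, the sketch below evaluates the normalized formula for two vectors that point the same way but differ in length, and also evaluates the difference quotient from the limit for shrinking $h$. It reuses $f(x, y) = x^2 y$; the point, the vectors, the values of `h`, and the helper names are assumptions chosen for illustration.

```python
import math

def f(x, y):
    return x**2 * y

def grad_f(x, y):
    return (2 * x * y, x**2)

def rate_per_unit_length(x, y, v):
    # (grad f . v) / ||v||: independent of the length of v.
    gx, gy = grad_f(x, y)
    return (gx * v[0] + gy * v[1]) / math.hypot(v[0], v[1])

def limit_quotient(x, y, v, h):
    # Difference quotient from the limit definition, divided by h * ||v||.
    step = f(x + h * v[0], y + h * v[1]) - f(x, y)
    return step / (h * math.hypot(v[0], v[1]))

v = (3.0, 4.0)         # length 5
v_scaled = (6.0, 8.0)  # same direction, twice the length

rate = rate_per_unit_length(2.0, 3.0, v)
rate_scaled = rate_per_unit_length(2.0, 3.0, v_scaled)
quotients = [limit_quotient(2.0, 3.0, v, h) for h in (1e-2, 1e-4, 1e-6)]

print(rate, rate_scaled)
print(quotients)
```

Both vectors should give the same rate, and the quotients should approach that value as `h` shrinks. Without the division by $\| \vec v \|$, doubling the vector would double the result.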