Locally weighted regression

$(x^{(i)}, y^{(i)})$ is the $i$-th training example, with $x^{(i)} \in \mathbb{R}^{n+1}$ and $y^{(i)} \in \mathbb{R}$

$m$ is the number of training examples, $n$ is the number of features

Hypothesis function

$h_\theta(x) = \sum_{j=0}^{n} \theta_j x_j = \theta^T x$

Loss function

$J(\theta) = \frac{1}{2} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
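The hypothesis and cost function above can be sketched in a few lines of numpy. The toy data here is made up for illustration; each row of `X` is one example $x^{(i)}$ with the intercept term $x_0 = 1$ included, so it lies in $\mathbb{R}^{n+1}$.

```python
import numpy as np

# Hypothetical toy data: m = 4 examples, n = 1 feature, plus the
# intercept term x_0 = 1, so each row is a vector in R^(n+1).
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])          # shape (m, n+1)
y = np.array([1.0, 3.0, 5.0, 7.0])  # shape (m,)

theta = np.array([1.0, 2.0])        # parameter vector in R^(n+1)

def h(theta, X):
    """Hypothesis h_theta(x) = theta^T x, applied to every row of X."""
    return X @ theta

def J(theta, X, y):
    """Squared-error cost J(theta) = 1/2 * sum_i (h_theta(x^(i)) - y^(i))^2."""
    residuals = h(theta, X) - y
    return 0.5 * np.sum(residuals ** 2)

print(J(theta, X, y))  # this theta fits the toy data exactly, so J = 0
```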

A parametric learning algorithm fits a fixed set of parameters $\theta$. A non-parametric learning algorithm requires keeping the training data set around, which can be cumbersome for large data sets. Locally weighted linear regression is one example of a non-parametric learning algorithm.

For linear regression, to evaluate $h$ at a query point $x$, we fit $\theta$ to minimize $J(\theta)$ and then return $\theta^T x$.
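This fit-then-predict step can be sketched with the normal equations, $\theta = (X^T X)^{-1} X^T y$, which minimize $J(\theta)$ in closed form (the derivation is not shown in these notes; the toy data below is made up).

```python
import numpy as np

# Made-up noisy data roughly following y = 1 + 2x.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])           # intercept column plus one feature
y = np.array([1.1, 2.9, 5.2, 6.8])

# Normal equations: solve (X^T X) theta = X^T y, minimizing J(theta).
theta = np.linalg.solve(X.T @ X, X.T @ y)

# To evaluate h at a query point x, return theta^T x.
x_query = np.array([1.0, 1.5])       # query point, with intercept term
prediction = theta @ x_query         # ≈ 4.0 for this data
```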

For locally weighted regression, we instead look at a local neighborhood of $x$: we put more weight on training examples in a narrow range around $x$, fit a straight line to those examples, and use that line to make a prediction at $x$.

Fit $\theta$ to minimize

$J(\theta) = \sum_{i=1}^{m} w^{(i)} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

where the weight function is defined as

$w^{(i)} = \exp\left( -\frac{(x^{(i)} - x)^2}{2} \right)$

Note that the weights depend on the query point $x$: if $|x^{(i)} - x|$ is small, then $w^{(i)} \approx 1$; if it is large, then $w^{(i)} \approx 0$, so distant examples barely influence the fit.
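Putting the pieces together, here is a minimal sketch of locally weighted regression for one feature, solving the weighted normal equations $(X^T W X)\theta = X^T W y$ with the weights above. The data and the helper name `lwr_predict` are made up for illustration; in practice the exponent's denominator is often generalized to $2\tau^2$, where the bandwidth $\tau$ controls how quickly weights fall off with distance.

```python
import numpy as np

def lwr_predict(x_query, X, y):
    """Locally weighted linear regression prediction at scalar x_query.

    X is (m, 2): an intercept column of ones plus the single feature.
    Weights w^(i) = exp(-(x^(i) - x)^2 / 2) emphasize training examples
    near the query point; theta solves the weighted normal equations.
    """
    w = np.exp(-((X[:, 1] - x_query) ** 2) / 2.0)  # one weight per example
    W = np.diag(w)
    # Weighted normal equations: (X^T W X) theta = X^T W y
    theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return theta @ np.array([1.0, x_query])        # theta^T x at the query

# Made-up nonlinear data (y = x^2): a single global line fits it poorly,
# but a line fitted locally around each query point does much better.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])
pred = lwr_predict(1.5, X, y)
```

Because a fresh $\theta$ is solved for every query point, the full training set must be retained at prediction time, which is exactly the non-parametric trade-off described above.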