Maximum likelihood

Maximum likelihood estimation is a general method for estimating the parameters of econometric models from observed data. The following conditions should be met for the maximum likelihood principle to work:

1. The form of the joint pdf of \(y_t\) is known.
2. The specifications of the moments of the joint pdf are known.
3. The joint pdf can be evaluated for all values of the parameters \(\theta\).

If the distribution of \(y_t\) is misspecified, i.e. conditions (1) and (2) are violated, estimation is by quasi-maximum likelihood. If condition (1) is violated, a generalized method of moments estimator is required. If condition (2) is not satisfied, estimation relies on nonparametric methods. If condition (3) is not met, simulation-based estimation methods are used.

A time series represents the observed realization of draws from a joint pdf. The maximum likelihood principle makes use of this result by providing a general framework for estimating the unknown parameters \(\theta\) from the observed time series data \(\{y_1, y_2, \dots, y_T\}\).

The standard interpretation of the joint pdf is that \(f\) is a function of \(y_t\) for given parameters \(\theta\). When defining the maximum likelihood estimator, this interpretation is reversed, so that \(f\) is taken as a function of \(\theta\) for given \(y_t\), because we regard \(\{y_1, y_2, \dots, y_T\}\) as a realized data set which is no longer random. The maximum likelihood estimator is then obtained by finding the value of \(\theta\) which is "most likely" to have generated the observed data.

The likelihood function is simply a redefinition of the joint pdf. For many problems it is easier to work with the logarithm of this joint pdf. The log-likelihood function is

$$\begin{align} \ln L_T(\theta) &= \frac1T \ln f(y_1|x_1;\theta)\\[2ex] &+ \frac 1T \sum_{t=2}^T \ln f(y_t|y_{t-1}, \dots, y_1, x_t, x_{t-1}, \dots, x_1; \theta) \end{align} \tag{1.8}$$

where \(\theta\) collects the unknown parameters in a single argument and the subscript \(T\) indicates that the log-likelihood is an average over the sample of the logarithm of the density evaluated at \(y_t\). Because of this scaling by \(1/T\), the log-likelihood in (1.8) is also known as the average log-likelihood.
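As a concrete illustration of the decomposition in (1.8), the following sketch evaluates the average log-likelihood of a stationary AR(1) model with normal disturbances; the model, function name, and parameter values are assumptions chosen purely for illustration, not taken from the text.

```python
import math

def avg_loglik_ar1(y, rho, sigma2):
    """Average log-likelihood in the spirit of (1.8) for a stationary
    AR(1) model y_t = rho*y_{t-1} + e_t with e_t ~ N(0, sigma2).
    (Illustrative assumption: this particular model and density.)"""
    T = len(y)
    # Marginal density of the first observation: N(0, sigma2/(1 - rho^2))
    var1 = sigma2 / (1.0 - rho**2)
    loglik = -0.5 * (math.log(2 * math.pi * var1) + y[0]**2 / var1)
    # Conditional densities f(y_t | y_{t-1}, ...; theta) for t = 2, ..., T
    for t in range(1, T):
        resid = y[t] - rho * y[t - 1]
        loglik += -0.5 * (math.log(2 * math.pi * sigma2) + resid**2 / sigma2)
    return loglik / T  # dividing by T gives the average log-likelihood
```

The first observation enters through its marginal density, and every later observation through its density conditional on the past, mirroring the two terms of (1.8).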

When \(y_t\) is iid, the log-likelihood function is based on the joint pdf in (1.4):

$$\begin{align} \ln L_T(\theta) &= \frac 1T \sum_{t=1}^T \ln f(y_t; \theta) \end{align}$$

In all cases, the log-likelihood function is a scalar that represents a summary measure of the data for given \(\theta\).
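To make the iid case concrete, here is a minimal sketch, assuming (purely for illustration) that \(f\) is the normal density with \(\theta = (\mu, \sigma^2)\):

```python
import math

def avg_loglik_iid_normal(y, mu, sigma2):
    """Average log-likelihood (1/T) * sum_t ln f(y_t; theta) for iid data,
    with f taken to be the normal density (an illustrative assumption)."""
    T = len(y)
    return sum(-0.5 * (math.log(2 * math.pi * sigma2) + (yt - mu)**2 / sigma2)
               for yt in y) / T

# For given theta the function returns a single scalar summary of the data
print(avg_loglik_iid_normal([1.0, 2.0, 3.0], mu=2.0, sigma2=1.0))
```

Whatever the sample size, the output is one number for each candidate \(\theta\), which is what makes the log-likelihood a summary measure of the data.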

The maximum likelihood estimator of \(\theta\) is defined as the value \(\hat{\theta}\) that maximizes the log-likelihood function:

$$\hat{\theta} = \arg\max_{\theta} \ln L_T(\theta)$$
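For instance, in the iid normal case the maximizer is available in closed form; the sketch below (an illustrative assumption, with hypothetical data, not a result stated above) computes \(\hat{\theta}\) and checks that no nearby parameter value attains a higher average log-likelihood.

```python
import math

def avg_loglik(y, mu, sigma2):
    """Average log-likelihood for iid normal data (illustrative choice of f)."""
    T = len(y)
    return sum(-0.5 * (math.log(2 * math.pi * sigma2) + (yt - mu)**2 / sigma2)
               for yt in y) / T

def normal_mle(y):
    """Closed-form ML estimates for iid normal data:
    sample mean and (uncorrected) sample variance."""
    T = len(y)
    mu_hat = sum(y) / T
    sigma2_hat = sum((yt - mu_hat)**2 for yt in y) / T
    return mu_hat, sigma2_hat

y = [1.2, 0.7, 2.1, 1.6, 0.4]            # hypothetical data
mu_hat, sigma2_hat = normal_mle(y)
best = avg_loglik(y, mu_hat, sigma2_hat)
# No perturbation of the estimates improves the log-likelihood
assert all(best >= avg_loglik(y, mu_hat + d, sigma2_hat + s)
           for d in (-0.05, 0.0, 0.05) for s in (-0.05, 0.0, 0.05))
```

In models without a closed-form maximizer, the same definition applies but \(\hat{\theta}\) must be found numerically.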