# Linear least squares

Linear least squares is a mathematical optimization technique for finding an approximate solution to a system of linear equations that has no exact solution. This situation typically arises when the number of equations (m) exceeds the number of variables (n). (See also linear regression.)

## Definition

In mathematical terms, we want to find a solution for the "equation"

$$A\mathbf{x} \approx \mathbf{b},$$

where A is an m-by-n matrix (with m > n) and **x** and **b** are n- and m-dimensional column vectors, respectively. More precisely, we want to minimize the squared Euclidean norm of the residual $A\mathbf{x} - \mathbf{b}$, that is, the quantity

$$\|A\mathbf{x}-\mathbf{b}\|^2 = \left([A\mathbf{x}]_1-\mathbf{b}_1\right)^2 +\left([A\mathbf{x}]_2-\mathbf{b}_2\right)^2 +\dots+ \left([A\mathbf{x}]_m-\mathbf{b}_m\right)^2,$$

where $[A\mathbf{x}]_i$ denotes the i-th component of the vector $A\mathbf{x}$. Hence the name "least squares".

It turns out that a vector x that minimizes this expression also solves the normal equation

$$A^T \! A \mathbf{x} = A^T \mathbf{b},$$

where AT stands for the transpose of A. Note that this corresponds to a system of linear equations. The matrix ATA on the left-hand side is a square matrix, which is invertible if A has full column rank (that is, if the rank of A is n). In that case, the solution of the system of linear equations is unique and given by

$$\mathbf{x} = (A^T\!A)^{-1} A^T \mathbf{b}.$$

The matrix $(A^TA)^{-1}A^T$ is called the pseudoinverse of A. We cannot use the true matrix inverse of A (that is, $A^{-1}$), because it does not exist: A is not a square matrix (m ≠ n).
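As a sketch of solving via the normal equation, the following Python snippet uses NumPy to solve $A^T A \mathbf{x} = A^T \mathbf{b}$ directly. The data here is an illustrative made-up overdetermined system, not taken from this article:

```python
import numpy as np

# Hypothetical overdetermined system: m = 4 equations, n = 2 unknowns.
# (Illustrative data, not from the article.)
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([6.0, 5.0, 7.0, 10.0])

# Form and solve the normal equation (A^T A) x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)  # least-squares solution [3.5, 1.4]
```

Note that explicitly forming $A^TA$ squares the condition number of the problem, which is one reason the QR-based approach described below is often preferred in practice.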

## Computation

The normal equation can be solved like any other system of linear equations, but a more efficient and numerically stable method is obtained by first computing the QR decomposition of the matrix A.

Then, with A = QR, where Q is an orthogonal matrix and R is an upper triangular matrix, the normal equation simplifies to

$$R\mathbf{x} = Q^T\mathbf{b}.$$

Since R is upper triangular, this system can be solved efficiently by back substitution.
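A minimal sketch of this QR-based approach in Python with NumPy (the data is illustrative, not from the article):

```python
import numpy as np

# Illustrative overdetermined system (made-up data).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([6.0, 5.0, 7.0, 10.0])

# Reduced QR decomposition: Q has orthonormal columns, R is
# a square upper triangular matrix.
Q, R = np.linalg.qr(A)

# Solve R x = Q^T b; since R is triangular, back substitution suffices.
x = np.linalg.solve(R, Q.T @ b)
print(x)  # same least-squares solution as the normal-equation approach
```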

## Applications

The linear least squares method can be used to find an affine function $\mathbb{R}^n \to \mathbb{R}$ that best fits a given set of data (see general least squares method). (It is widely and erroneously thought that the word linear in the term linear regression refers to the linear or affine nature of the fitted function, but see that article.)

We write the linear function we try to find as a 1-by-n matrix $\mathbf{x}^T$ (so $\mathbf{x}$ is actually a column vector; see also linear transformation).

The set of data consists of m (n+1)-tuples $(x_1, \dots, x_n, y)$. These can be written into an m-by-n matrix A and a vector b, where every tuple corresponds to a row of A, with the y becoming the corresponding entry of b.

Then,

$$A\mathbf{x} \approx \mathbf{b}$$

yields the function x we seek.

## Example

Consider the points (0, 3), (2, 3), (4, 4), (−1, 2). We seek a solution of the form ax + b = y, that is,

$$\begin{pmatrix} x & 1 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = y.$$

We can then form the matrix A and its transpose:

$$A=\begin{pmatrix} 0 & 1 \\ 2 & 1 \\ 4 & 1 \\ -1 & 1 \end{pmatrix}, \qquad A^T=\begin{pmatrix} 0 & 2 & 4 & -1 \\ 1 & 1 & 1 & 1 \end{pmatrix},$$

so that

$$A^TA=\begin{pmatrix} 21 & 5 \\ 5 & 4 \end{pmatrix},$$

and the vector b

$$\mathbf{b} = \begin{pmatrix} 3 \\ 3 \\ 4 \\ 2 \end{pmatrix},$$

and then

$$A^T\mathbf{b}=\begin{pmatrix} 20 \\ 12 \end{pmatrix}.$$

So the normal equation is

$$A^TA\begin{pmatrix} a \\ b \end{pmatrix} = A^T\mathbf{b},$$

that is,

$$\begin{pmatrix} 21 & 5 \\ 5 & 4 \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix}=\begin{pmatrix} 20 \\ 12 \end{pmatrix}.$$

Then

$$(A^TA)^{-1}=\frac{1}{59}\begin{pmatrix} 4 & -5 \\ -5 & 21 \end{pmatrix}$$

and

$$\begin{pmatrix} a \\ b \end{pmatrix}=\frac{1}{59}\begin{pmatrix} 4 & -5 \\ -5 & 21 \end{pmatrix}\begin{pmatrix} 20 \\ 12 \end{pmatrix}=\begin{pmatrix} 20/59 \\ 152/59 \end{pmatrix},$$

and the line of best fit is (20/59)x + 152/59 = y.
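As a check, the worked example can be reproduced with NumPy's built-in least-squares solver (a sketch; note that `np.linalg.lstsq` uses a singular value decomposition internally rather than the normal equation, but it minimizes the same residual norm):

```python
import numpy as np

# Points from the example: (0, 3), (2, 3), (4, 4), (-1, 2).
A = np.array([[0.0, 1.0],
              [2.0, 1.0],
              [4.0, 1.0],
              [-1.0, 1.0]])
b = np.array([3.0, 3.0, 4.0, 2.0])

# Least-squares solution of A [a, b]^T ≈ b.
x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=None)
print(x)  # [a, b] = [20/59, 152/59] ≈ [0.3390, 2.5763]
```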
