

Orthogonal Projections






The Closest Point in a Subspace

The orthogonal projection of a vector onto a subspace is the point in the subspace closest to the original vector. The residual — the difference between the vector and its projection — is perpendicular to the subspace. This orthogonal decomposition is the geometric engine behind least squares, the QR decomposition, and every approximation problem in linear algebra.



Projection onto a Vector

The orthogonal projection of $\mathbf{b}$ onto a nonzero vector $\mathbf{a}$ is the point on the line through $\mathbf{a}$ nearest to $\mathbf{b}$:

$$\text{proj}_{\mathbf{a}}\mathbf{b} = \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{a} \cdot \mathbf{a}}\,\mathbf{a}$$

The scalar $\hat{c} = \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{a} \cdot \mathbf{a}}$ is the component of $\mathbf{b}$ in the direction of $\mathbf{a}$. The projection $\hat{c}\,\mathbf{a}$ lies on the line through $\mathbf{a}$, and the residual $\mathbf{b} - \hat{c}\,\mathbf{a}$ is orthogonal to $\mathbf{a}$:

$$(\mathbf{b} - \hat{c}\,\mathbf{a}) \cdot \mathbf{a} = \mathbf{b} \cdot \mathbf{a} - \hat{c}(\mathbf{a} \cdot \mathbf{a}) = \mathbf{b} \cdot \mathbf{a} - \mathbf{b} \cdot \mathbf{a} = 0$$


Worked Example


Project $\mathbf{b} = (3, 4, 0)$ onto $\mathbf{a} = (1, 1, 1)$:

$$\hat{c} = \frac{3 + 4 + 0}{1 + 1 + 1} = \frac{7}{3}, \quad \text{proj}_{\mathbf{a}}\mathbf{b} = \frac{7}{3}(1, 1, 1) = \left(\frac{7}{3}, \frac{7}{3}, \frac{7}{3}\right)$$

Residual: $\mathbf{b} - \text{proj}_{\mathbf{a}}\mathbf{b} = \left(\frac{2}{3}, \frac{5}{3}, -\frac{7}{3}\right)$. Check: $\frac{2}{3}(1) + \frac{5}{3}(1) - \frac{7}{3}(1) = 0$.
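The one-dimensional projection formula is easy to check numerically. A minimal NumPy sketch reproducing the worked example (the helper name `project_onto_vector` is just for illustration):

```python
import numpy as np

def project_onto_vector(b, a):
    """Orthogonal projection of b onto the line through a (a must be nonzero)."""
    return (a @ b) / (a @ a) * a

b = np.array([3.0, 4.0, 0.0])
a = np.array([1.0, 1.0, 1.0])

p = project_onto_vector(b, a)   # (7/3, 7/3, 7/3)
residual = b - p                # (2/3, 5/3, -7/3)

print(residual @ a)             # ~0: the residual is orthogonal to a
```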

The Orthogonal Decomposition

Every vector $\mathbf{b} \in \mathbb{R}^n$ decomposes uniquely with respect to a subspace $W$ as

$$\mathbf{b} = \hat{\mathbf{b}} + \mathbf{z}$$

where $\hat{\mathbf{b}} \in W$ and $\mathbf{z} \in W^\perp$. The component $\hat{\mathbf{b}}$ is the orthogonal projection of $\mathbf{b}$ onto $W$, and $\mathbf{z} = \mathbf{b} - \hat{\mathbf{b}}$ is the perpendicular residual.

The projection $\hat{\mathbf{b}}$ is the closest point in $W$ to $\mathbf{b}$. For any other vector $\mathbf{w} \in W$:

$$\|\mathbf{b} - \mathbf{w}\|^2 = \|\mathbf{z}\|^2 + \|\hat{\mathbf{b}} - \mathbf{w}\|^2 \geq \|\mathbf{z}\|^2 = \|\mathbf{b} - \hat{\mathbf{b}}\|^2$$

The inequality follows from the Pythagorean theorem: $\mathbf{z}$ is orthogonal to $\hat{\mathbf{b}} - \mathbf{w}$ (both $\hat{\mathbf{b}}$ and $\mathbf{w}$ are in $W$, so their difference is in $W$, and $\mathbf{z} \in W^\perp$). The minimum distance $\|\mathbf{z}\|$ is achieved uniquely at $\mathbf{w} = \hat{\mathbf{b}}$.
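The closest-point property can be probed numerically: project onto a subspace, then confirm that no other point of the subspace is nearer. A sketch with an arbitrary random 2-D subspace of $\mathbb{R}^5$ (the dimensions and the seed are just illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# W = column space of A, a random 2-D subspace of R^5
A = rng.standard_normal((5, 2))
b = rng.standard_normal(5)

# Projection via the normal equations: b_hat lies in W, z lies in W-perp
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
b_hat = A @ x_hat
z = b - b_hat

print(np.abs(A.T @ z).max())    # ~0: z is orthogonal to every basis vector of W

# Any other point w = A c of W is at least as far from b as b_hat is
for _ in range(1000):
    w = A @ rng.standard_normal(2)
    assert np.linalg.norm(b - w) >= np.linalg.norm(z) - 1e-12
```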

Projection with an Orthogonal Basis

When $W = \text{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ and the basis $\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is orthogonal, the projection of $\mathbf{b}$ onto $W$ decomposes into independent vector projections:

$$\text{proj}_W \mathbf{b} = \frac{\mathbf{u}_1 \cdot \mathbf{b}}{\mathbf{u}_1 \cdot \mathbf{u}_1}\,\mathbf{u}_1 + \frac{\mathbf{u}_2 \cdot \mathbf{b}}{\mathbf{u}_2 \cdot \mathbf{u}_2}\,\mathbf{u}_2 + \cdots + \frac{\mathbf{u}_k \cdot \mathbf{b}}{\mathbf{u}_k \cdot \mathbf{u}_k}\,\mathbf{u}_k$$

Each term is the projection of $\mathbf{b}$ onto one basis vector. Orthogonality prevents interference: projecting onto $\mathbf{u}_1$ does not affect the component along $\mathbf{u}_2$, because $\mathbf{u}_1 \cdot \mathbf{u}_2 = 0$.

When the basis is orthonormal, the denominators are all $1$:

$$\text{proj}_W \mathbf{b} = (\mathbf{q}_1 \cdot \mathbf{b})\,\mathbf{q}_1 + (\mathbf{q}_2 \cdot \mathbf{b})\,\mathbf{q}_2 + \cdots + (\mathbf{q}_k \cdot \mathbf{b})\,\mathbf{q}_k$$

This is the cleanest formula in all of linear algebra — $k$ dot products and $k$ scalar multiplications.
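The sum-of-projections formula needs no matrix inverse when the basis is orthogonal. A sketch using a hand-picked orthogonal pair spanning the $xy$-plane of $\mathbb{R}^3$ (an example chosen for this illustration, not from the text):

```python
import numpy as np

def project_orthogonal_basis(b, basis):
    """Sum of 1-D projections; valid only when basis vectors are mutually orthogonal."""
    return sum((u @ b) / (u @ u) * u for u in basis)

u1 = np.array([1.0,  1.0, 0.0])
u2 = np.array([1.0, -1.0, 0.0])   # u1 . u2 = 0, so the terms don't interfere
b  = np.array([3.0,  4.0, 5.0])

p = project_orthogonal_basis(b, [u1, u2])
print(p)                          # (3, 4, 0): the projection onto the xy-plane
```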

Projection with an Arbitrary Basis

When the basis for $W$ is not orthogonal, the individual vector projection formula does not apply — projecting onto one basis vector interferes with the others. Instead, the projection requires solving a system.

If the columns of the $m \times k$ matrix $A$ form a basis for $W$, the projection of $\mathbf{b}$ onto $W$ is

$$\hat{\mathbf{b}} = A(A^TA)^{-1}A^T\mathbf{b}$$

This formula requires $A^TA$ to be invertible, which holds whenever the columns of $A$ are linearly independent.

The derivation comes from the orthogonality condition. The residual $\mathbf{b} - A\hat{\mathbf{x}}$ must be perpendicular to every column of $A$: $A^T(\mathbf{b} - A\hat{\mathbf{x}}) = \mathbf{0}$. Solving for $\hat{\mathbf{x}}$ gives $A^TA\hat{\mathbf{x}} = A^T\mathbf{b}$, so $\hat{\mathbf{x}} = (A^TA)^{-1}A^T\mathbf{b}$, and $\hat{\mathbf{b}} = A\hat{\mathbf{x}} = A(A^TA)^{-1}A^T\mathbf{b}$.

The alternative is to first orthogonalize the basis using Gram-Schmidt, then use the simpler orthogonal formula. Both approaches give the same projection.
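Both routes can be compared in a few lines of NumPy: the normal-equations formula versus orthonormalizing first, here via `np.linalg.qr`, which plays the role of Gram-Schmidt numerically. The non-orthogonal basis and target vector below are arbitrary choices for the sketch:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])     # non-orthogonal basis for a plane in R^3
b = np.array([1.0, 2.0, 4.0])

# Route 1: general formula b_hat = A (A^T A)^{-1} A^T b, via a linear solve
b_hat1 = A @ np.linalg.solve(A.T @ A, A.T @ b)

# Route 2: orthonormalize the columns first, then use the simple formula Q Q^T b
Q, _ = np.linalg.qr(A)
b_hat2 = Q @ (Q.T @ b)

print(np.allclose(b_hat1, b_hat2))   # True: both routes give the same projection
```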

The Projection Matrix

The matrix $P = A(A^TA)^{-1}A^T$ maps any vector $\mathbf{b}$ to its projection onto $\text{Col}(A)$: $\hat{\mathbf{b}} = P\mathbf{b}$.

When the basis is orthonormal ($A = Q$ with $Q^TQ = I$), the formula simplifies to $P = QQ^T$.

The projection matrix satisfies two algebraic conditions. It is symmetric: $P^T = P$. And it is idempotent: $P^2 = P$ — projecting twice gives the same result as projecting once, because vectors already in $W$ are fixed by $P$.

The complementary matrix $I - P$ projects onto $W^\perp$. It satisfies $(I - P)^T = I - P$ and $(I - P)^2 = I - P$, and for every $\mathbf{b}$: $P\mathbf{b} + (I - P)\mathbf{b} = \mathbf{b}$, decomposing $\mathbf{b}$ into its $W$-component and its $W^\perp$-component.

The eigenvalues of $P$ are $0$ and $1$: vectors in $W$ map to themselves (eigenvalue $1$) and vectors in $W^\perp$ map to zero (eigenvalue $0$). The rank of $P$ equals the trace of $P$, which equals $\dim(W)$.
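All of these properties — symmetry, idempotence, the 0/1 eigenvalues, and trace equal to $\dim(W)$ — can be verified numerically for a random basis. A sketch (the matrix size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))            # random full-rank 6x3 basis matrix
P = A @ np.linalg.inv(A.T @ A) @ A.T       # projection onto Col(A)

assert np.allclose(P, P.T)                 # symmetric: P^T = P
assert np.allclose(P @ P, P)               # idempotent: P^2 = P
assert np.isclose(np.trace(P), 3)          # trace(P) = dim(W) = rank(P)

eigs = np.sort(np.linalg.eigvalsh(P))
print(np.round(eigs, 10))                  # three 0's and three 1's

I = np.eye(6)
assert np.allclose((I - P) @ P, 0)         # I - P annihilates everything in W
```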

Properties of Orthogonal Projections

Orthogonal projections are characterized by two properties acting together.

Idempotence ($P^2 = P$): once a vector has been projected, projecting again changes nothing. Every vector in $W$ is a fixed point of $P$. This distinguishes projections from other linear transformations — most transformations continue to change vectors on repeated application.

Symmetry ($P^T = P$): the projection is self-adjoint with respect to the dot product. This means $P\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot P\mathbf{v}$ for all $\mathbf{u}, \mathbf{v}$. The symmetry condition is what makes the projection orthogonal rather than oblique — it ensures the residual is perpendicular to $W$, not merely non-parallel.

A matrix satisfying $P^2 = P$ and $P^T = P$ is an orthogonal projection. A matrix satisfying $P^2 = P$ but $P^T \neq P$ is an oblique projection — it projects onto the same subspace but along a different direction, not the perpendicular one.

The error $\|\mathbf{b} - P\mathbf{b}\|$ is the distance from $\mathbf{b}$ to $W$. It is the smallest possible value of $\|\mathbf{b} - \mathbf{w}\|$ over all $\mathbf{w} \in W$.
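The orthogonal/oblique distinction is concrete enough to exhibit with $2 \times 2$ matrices. Both matrices in the sketch below are idempotent and project onto the $x$-axis, but only the symmetric one sends the residual perpendicular to that axis; the oblique matrix is a hypothetical example constructed for this illustration:

```python
import numpy as np

P_orth = np.array([[1.0, 0.0],
                   [0.0, 0.0]])   # symmetric: orthogonal projection onto the x-axis
P_obl  = np.array([[1.0, 1.0],
                   [0.0, 0.0]])   # not symmetric: oblique projection onto the x-axis

for P in (P_orth, P_obl):
    assert np.allclose(P @ P, P)  # both satisfy P^2 = P

b  = np.array([2.0, 3.0])
e1 = np.array([1.0, 0.0])

r_orth = b - P_orth @ b           # (0, 3): perpendicular to the x-axis
r_obl  = b - P_obl @ b            # (-3, 3): slanted, not perpendicular

print(r_orth @ e1)                # 0.0
print(r_obl @ e1)                 # -3.0
```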

Projection and Least Squares

When the system $A\mathbf{x} = \mathbf{b}$ has no solution — when $\mathbf{b}$ is not in the column space of $A$ — the least-squares solution $\hat{\mathbf{x}}$ produces the projection of $\mathbf{b}$ onto $\text{Col}(A)$:

$$A\hat{\mathbf{x}} = \hat{\mathbf{b}} = P\mathbf{b}$$

The least-squares solution does not solve $A\mathbf{x} = \mathbf{b}$. It solves $A\mathbf{x} = \hat{\mathbf{b}}$, where $\hat{\mathbf{b}}$ is the closest reachable vector to $\mathbf{b}$.

The residual $\mathbf{r} = \mathbf{b} - A\hat{\mathbf{x}}$ lies in $\text{Col}(A)^\perp = \text{Null}(A^T)$ — it is orthogonal to every column of $A$. The condition $A^T\mathbf{r} = \mathbf{0}$ is exactly the normal equation $A^TA\hat{\mathbf{x}} = A^T\mathbf{b}$.

Every least-squares problem is a projection problem. Solving least squares means projecting the target b\mathbf{b} onto the column space and finding the input x^\hat{\mathbf{x}} that produces the projected output.
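The connection shows up directly in code: `np.linalg.lstsq` returns $\hat{\mathbf{x}}$, and $A\hat{\mathbf{x}}$ coincides with $P\mathbf{b}$. The inconsistent system below is an arbitrary small example chosen for the sketch:

```python
import numpy as np

# An inconsistent 3x2 system: b is not in Col(A)
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([6.0, 0.0, 0.0])

x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
b_hat = A @ x_hat                      # the projection of b onto Col(A)
r = b - b_hat

print(np.abs(A.T @ r).max())           # ~0: the normal equations A^T r = 0 hold

# The same projection from the explicit projection matrix P
P = A @ np.linalg.inv(A.T @ A) @ A.T
print(np.allclose(P @ b, b_hat))       # True
```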

Worked Example: Full Projection Computation

Project $\mathbf{b} = (1, 2, 3)$ onto the subspace $W = \text{Span}\{(1, 0, 1), (0, 1, 1)\}$ in $\mathbb{R}^3$.

The basis is not orthogonal: $(1, 0, 1) \cdot (0, 1, 1) = 0 + 0 + 1 = 1 \neq 0$. Use the general formula. Set $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{pmatrix}$.

$$A^TA = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$$

$$(A^TA)^{-1} = \frac{1}{3}\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$$

$$A^T\mathbf{b} = \begin{pmatrix} 1 + 0 + 3 \\ 0 + 2 + 3 \end{pmatrix} = \begin{pmatrix} 4 \\ 5 \end{pmatrix}$$

$$\hat{\mathbf{x}} = \frac{1}{3}\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}\begin{pmatrix} 4 \\ 5 \end{pmatrix} = \frac{1}{3}\begin{pmatrix} 3 \\ 6 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$$

$$\hat{\mathbf{b}} = A\hat{\mathbf{x}} = 1\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + 2\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$

The projection equals $\mathbf{b}$ itself — meaning $\mathbf{b}$ was already in $W$. The residual is $\mathbf{0}$, confirming $\mathbf{b} \in \text{Span}\{(1, 0, 1), (0, 1, 1)\}$. Indeed: $(1, 2, 3) = 1 \cdot (1, 0, 1) + 2 \cdot (0, 1, 1)$.
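The hand computation can be replayed step by step in NumPy, including the observation that the residual vanishes because $\mathbf{b}$ already lies in $W$:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

ATA = A.T @ A                          # [[2, 1], [1, 2]]
x_hat = np.linalg.solve(ATA, A.T @ b)  # solves the normal equations: (1, 2)
b_hat = A @ x_hat                      # (1, 2, 3) = b itself

print(x_hat)                           # [1. 2.]
print(np.linalg.norm(b - b_hat))       # 0.0: b was already in W
```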