The orthogonal projection of a vector onto a subspace is the point in the subspace closest to the original vector. The residual (the difference between the vector and its projection) is perpendicular to the subspace. This orthogonal decomposition is the geometric engine behind least squares, the QR decomposition, and a wide range of approximation problems in linear algebra.
Projection onto a Vector
The orthogonal projection of b onto a nonzero vector a is the point on the line through a nearest to b:
proj_a b = ((a⋅b)/(a⋅a)) a
The scalar c^ = (a⋅b)/(a⋅a) is the component of b in the direction of a. The projection c^a lies on the line through a, and the residual b−c^a is orthogonal to a:
a⋅(b−c^a) = a⋅b − c^(a⋅a) = a⋅b − a⋅b = 0
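The formula is a one-liner numerically. A minimal NumPy sketch (the vectors a and b are illustrative, not from the text):

```python
import numpy as np

# Projection of b onto the line through a: proj_a b = ((a.b)/(a.a)) a
a = np.array([2.0, 1.0])
b = np.array([3.0, 4.0])

c_hat = (a @ b) / (a @ a)   # scalar component of b along a
proj = c_hat * a            # the projection itself
residual = b - proj         # perpendicular to a

print(proj)                 # [4. 2.]
print(a @ residual)         # 0.0
```

The final dot product confirms the orthogonality condition a⋅(b−c^a) = 0.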
Orthogonal Decomposition
Every vector b ∈ ℝⁿ decomposes uniquely with respect to a subspace W as
b = b^ + z
where b^ ∈ W and z ∈ W⊥. The component b^ is the orthogonal projection of b onto W, and z = b − b^ is the perpendicular residual.
The projection b^ is the closest point in W to b. For any other vector w ∈ W:
∥b−w∥² = ∥z∥² + ∥b^−w∥² ≥ ∥z∥² = ∥b−b^∥²
The inequality follows from the Pythagorean theorem: z is orthogonal to b^−w (both b^ and w lie in W, so their difference is in W, and z ∈ W⊥). The minimum distance ∥z∥ is achieved uniquely at w = b^.
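The closest-point property is easy to check numerically: every other point of W is at least as far from b as the projection is. A small sketch with a random subspace (the data is illustrative; the projection is computed here with NumPy's least-squares routine, whose connection to projection is made explicit later in the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# W = column space of A, a 2-dimensional subspace of R^4 (random example)
A = rng.standard_normal((4, 2))
b = rng.standard_normal(4)

# Projection b_hat of b onto W, via least squares on A x ~ b
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
b_hat = A @ x_hat

# Any other point w = A c of W is farther from b than b_hat is
dist_to_proj = np.linalg.norm(b - b_hat)
for _ in range(5):
    w = A @ rng.standard_normal(2)
    assert np.linalg.norm(b - w) >= dist_to_proj

# The residual is orthogonal to every column of A
print(A.T @ (b - b_hat))   # ~ [0, 0]
```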
Projection with an Orthogonal Basis
When W = Span{u1,…,uk} and the basis {u1,…,uk} is orthogonal, the projection of b onto W decomposes into independent vector projections:
proj_W b = ((u1⋅b)/(u1⋅u1))u1 + ((u2⋅b)/(u2⋅u2))u2 + ⋯ + ((uk⋅b)/(uk⋅uk))uk
Each term is the projection of b onto one basis vector. Orthogonality prevents interference: projecting onto u1 does not affect the component along u2, because u1⋅u2=0.
When the basis is orthonormal, the denominators are all 1:
proj_W b = (q1⋅b)q1 + (q2⋅b)q2 + ⋯ + (qk⋅b)qk
With an orthonormal basis, the projection costs just k dot products and k scalar multiplications.
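The orthonormal formula translates directly into code. A sketch with an illustrative orthonormal basis for a plane in ℝ³ (not taken from the text):

```python
import numpy as np

# Orthonormal basis for a plane W in R^3 (illustrative choice)
q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
q2 = np.array([0.0, 0.0, 1.0])
b = np.array([3.0, 4.0, 5.0])

# proj_W b = (q1.b) q1 + (q2.b) q2  -- k dot products, k scalings
proj = (q1 @ b) * q1 + (q2 @ b) * q2
print(proj)             # ~ [3.5, 3.5, 5.0]

# The residual is orthogonal to both basis vectors
r = b - proj
print(q1 @ r, q2 @ r)   # both ~ 0
```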
Projection with an Arbitrary Basis
When the basis for W is not orthogonal, the individual vector projection formula does not apply — projecting onto one basis vector interferes with the others. Instead, the projection requires solving a system.
If the columns of the m×k matrix A form a basis for W, the projection of b onto W is
b^ = A(AᵀA)⁻¹Aᵀb
The derivation comes from the orthogonality condition. The residual b−Ax^ must be perpendicular to every column of A: Aᵀ(b−Ax^) = 0. Solving for x^ gives AᵀAx^ = Aᵀb, so x^ = (AᵀA)⁻¹Aᵀb, and b^ = Ax^ = A(AᵀA)⁻¹Aᵀb.
The alternative is to first orthogonalize the basis using Gram-Schmidt, then use the simpler orthogonal formula. Both approaches give the same projection.
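Both routes can be compared directly in code. The sketch below solves the normal equations, then orthonormalizes the same basis with NumPy's QR factorization (which plays the role of Gram-Schmidt here) and uses the orthonormal sum formula; the basis and target vector are illustrative:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [1.0, 0.0],
              [0.0, 1.0]])   # non-orthogonal basis for W, as columns
b = np.array([1.0, 2.0, 3.0])

# Route 1: normal equations  A^T A x^ = A^T b
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
proj1 = A @ x_hat

# Route 2: orthonormalize first (reduced QR), then sum simple projections
Q, _ = np.linalg.qr(A)
proj2 = sum((q @ b) * q for q in Q.T)

print(np.allclose(proj1, proj2))   # True: same projection either way
```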
The Projection Matrix
The matrix P = A(AᵀA)⁻¹Aᵀ maps any vector b to its projection onto Col(A): b^ = Pb.
When the basis is orthonormal (A = Q with QᵀQ = I), the formula simplifies to P = QQᵀ.
The projection matrix satisfies two algebraic conditions. It is symmetric: Pᵀ = P. And it is idempotent: P² = P; projecting twice gives the same result as projecting once, because vectors already in W are fixed by P.
The complementary matrix I−P projects onto W⊥. It satisfies (I−P)ᵀ = I−P and (I−P)² = I−P, and for every b: Pb + (I−P)b = b, decomposing b into its W-component and its W⊥-component.
The eigenvalues of P are 0 and 1: vectors in W map to themselves (eigenvalue 1) and vectors in W⊥ map to zero (eigenvalue 0). The rank of P equals the trace of P, which equals dim(W).
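All of these algebraic facts can be verified numerically for a concrete projection matrix. A sketch using an illustrative 3×2 basis matrix:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])                  # basis for a 2-D subspace of R^3
P = A @ np.linalg.inv(A.T @ A) @ A.T        # projection matrix onto Col(A)

print(np.allclose(P, P.T))                  # True: symmetric
print(np.allclose(P @ P, P))                # True: idempotent
print(round(np.trace(P)))                   # 2 = rank(P) = dim(W)
print(np.round(np.linalg.eigvalsh(P), 6))   # eigenvalues 0, 1, 1
```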
Properties of Orthogonal Projections
Orthogonal projections are characterized by two properties acting together.
Idempotence (P² = P): once a vector has been projected, projecting again changes nothing. Every vector in W is a fixed point of P. This distinguishes projections from other linear transformations; most transformations continue to change vectors on repeated application.
Symmetry (Pᵀ = P): the projection is self-adjoint with respect to the dot product. This means Pu⋅v = u⋅Pv for all u, v. The symmetry condition is what makes the projection orthogonal rather than oblique: it ensures the residual is perpendicular to W, not merely non-parallel.
A matrix satisfying P² = P and Pᵀ = P is an orthogonal projection. A matrix satisfying P² = P but Pᵀ ≠ P is an oblique projection: it projects onto the same subspace but along a different direction, not the perpendicular one.
The error ∥b−Pb∥ is the distance from b to W. It is the smallest possible value of ∥b−w∥ over all w∈W.
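The orthogonal/oblique distinction shows up clearly in a 2×2 example. Both matrices below are idempotent and project onto the x-axis, but only the symmetric one leaves a residual perpendicular to that axis (the matrices are illustrative):

```python
import numpy as np

P_orth = np.array([[1.0, 0.0],
                   [0.0, 0.0]])   # symmetric: orthogonal projection onto x-axis
P_obl  = np.array([[1.0, 1.0],
                   [0.0, 0.0]])   # idempotent but NOT symmetric: oblique

b = np.array([1.0, 1.0])
axis = np.array([1.0, 0.0])       # direction of the target subspace
for P in (P_orth, P_obl):
    print(np.allclose(P @ P, P), end=" ")   # both idempotent: True
    r = b - P @ b                           # residual of the projection
    print(r @ axis)                         # 0.0 for orthogonal, -1.0 for oblique
```

The oblique matrix still lands on the x-axis, but it gets there along the direction (1, −1) rather than straight down, so its residual is not perpendicular.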
Projection and Least Squares
When the system Ax = b has no solution (b is not in the column space of A), the least-squares solution x^ produces the projection of b onto Col(A):
Ax^ = b^ = Pb
The least-squares solution does not solve Ax=b. It solves Ax=b^, where b^ is the closest reachable vector to b.
The residual r = b−Ax^ lies in Col(A)⊥ = Null(Aᵀ); it is orthogonal to every column of A. The condition Aᵀr = 0 is exactly the normal equation AᵀAx^ = Aᵀb.
Every least-squares problem is a projection problem. Solving least squares means projecting the target b onto the column space and finding the input x^ that produces the projected output.
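A sketch of the projection view of least squares, using an illustrative inconsistent system (fitting a line to three points that are not collinear):

```python
import numpy as np

# An inconsistent system: b is not in Col(A)  (example data)
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)  # least-squares solution
b_hat = A @ x_hat                              # projection of b onto Col(A)
r = b - b_hat                                  # residual

print(np.allclose(A.T @ r, 0))                 # True: A^T r = 0 (normal equations)
print(np.linalg.norm(r) > 0)                   # True: b itself was unreachable
```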
Worked Example: Full Projection Computation
Project b = (1,2,3) onto the subspace W = Span{(1,0,1),(0,1,1)} in ℝ³.
The basis is not orthogonal: (1,0,1)⋅(0,1,1) = 0+0+1 = 1 ≠ 0. Use the general formula. Set A = [1 0; 0 1; 1 1] (semicolons separate rows, so the basis vectors are the columns of this 3×2 matrix).
AᵀA = [1 0 1; 0 1 1][1 0; 0 1; 1 1] = [2 1; 1 2]
(AᵀA)⁻¹ = (1/3)[2 −1; −1 2]
Aᵀb = (1+0+3, 0+2+3) = (4, 5)
x^ = (1/3)[2 −1; −1 2](4, 5) = (1/3)(3, 6) = (1, 2)
b^ = Ax^ = 1⋅(1,0,1) + 2⋅(0,1,1) = (1, 2, 3)
The projection equals b itself — meaning b was already in W. The residual is 0, confirming b∈Span{(1,0,1),(0,1,1)}. Indeed: (1,2,3)=1⋅(1,0,1)+2⋅(0,1,1).
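The same computation takes a few lines in NumPy, reproducing the worked example above via the normal equations:

```python
import numpy as np

# Worked example: project b = (1,2,3) onto W = span{(1,0,1), (0,1,1)}
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])    # basis vectors as columns
b = np.array([1.0, 2.0, 3.0])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations A^T A x^ = A^T b
b_hat = A @ x_hat

print(x_hat)   # [1. 2.]
print(b_hat)   # [1. 2. 3.]  -- equals b, so b was already in W
```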