Visual Tools
Calculators
Tables
Mathematical Keyboard
Converters
Other Tools


Gram Schmidt Process






Converting Any Basis into an Orthogonal One

The Gram-Schmidt process takes a set of linearly independent vectors and produces an orthogonal (or orthonormal) set spanning the same subspace. It works by sequentially stripping each vector of its components along previously computed directions, leaving only the perpendicular remainder. The result is a constructive proof that orthonormal bases always exist — and the matrix version of this process is the QR decomposition.



The Goal

The input is a set of linearly independent vectors {v1,v2,,vk}\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k\} in an inner product space. The output is an orthogonal set {u1,u2,,uk}\{\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_k\} satisfying two conditions: the vectors are pairwise perpendicular (uiuj=0\mathbf{u}_i \cdot \mathbf{u}_j = 0 for iji \neq j), and they span the same subspace (Span{u1,,uj}=Span{v1,,vj}\text{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_j\} = \text{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_j\} at every step jj).

Optionally, each ui\mathbf{u}_i is normalized to unit length, producing an orthonormal set {q1,,qk}\{\mathbf{q}_1, \dots, \mathbf{q}_k\}.

The process is the constructive proof that every finite-dimensional inner product space has an orthonormal basis. Given any basis, Gram-Schmidt produces an orthonormal one for the same space.

The Algorithm: Two Vectors

Start with two independent vectors v1\mathbf{v}_1 and v2\mathbf{v}_2.

Set u1=v1\mathbf{u}_1 = \mathbf{v}_1.

Subtract the projection of v2\mathbf{v}_2 onto u1\mathbf{u}_1:

u2=v2u1v2u1u1u1\mathbf{u}_2 = \mathbf{v}_2 - \frac{\mathbf{u}_1 \cdot \mathbf{v}_2}{\mathbf{u}_1 \cdot \mathbf{u}_1}\,\mathbf{u}_1


The subtracted term is the component of v2\mathbf{v}_2 along u1\mathbf{u}_1. Removing it leaves only the component perpendicular to u1\mathbf{u}_1, so u1u2=0\mathbf{u}_1 \cdot \mathbf{u}_2 = 0.

Worked Example


v1=(1,1,0)\mathbf{v}_1 = (1, 1, 0), v2=(1,0,1)\mathbf{v}_2 = (1, 0, 1).

u1=(1,1,0)\mathbf{u}_1 = (1, 1, 0). u1v2u1u1=12\frac{\mathbf{u}_1 \cdot \mathbf{v}_2}{\mathbf{u}_1 \cdot \mathbf{u}_1} = \frac{1}{2}. u2=(1,0,1)12(1,1,0)=(12,12,1)\mathbf{u}_2 = (1, 0, 1) - \frac{1}{2}(1, 1, 0) = (\frac{1}{2}, -\frac{1}{2}, 1).

Check: u1u2=1212+0=0\mathbf{u}_1 \cdot \mathbf{u}_2 = \frac{1}{2} - \frac{1}{2} + 0 = 0.

The Algorithm: General Case

For kk independent vectors v1,,vk\mathbf{v}_1, \dots, \mathbf{v}_k:

u1=v1\mathbf{u}_1 = \mathbf{v}_1


uj=vji=1j1uivjuiuiuifor j=2,3,,k\mathbf{u}_j = \mathbf{v}_j - \sum_{i=1}^{j-1} \frac{\mathbf{u}_i \cdot \mathbf{v}_j}{\mathbf{u}_i \cdot \mathbf{u}_i}\,\mathbf{u}_i \qquad \text{for } j = 2, 3, \dots, k


At each step, vj\mathbf{v}_j has its projections onto all previously computed orthogonal vectors subtracted. What remains is the component of vj\mathbf{v}_j perpendicular to Span{u1,,uj1}\text{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_{j-1}\}.

Because vj\mathbf{v}_j is independent of {v1,,vj1}\{\mathbf{v}_1, \dots, \mathbf{v}_{j-1}\} — and therefore not in Span{u1,,uj1}\text{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_{j-1}\} — this perpendicular component is nonzero. So each uj0\mathbf{u}_j \neq \mathbf{0}, and the process never breaks down.

At every stage, Span{u1,,uj}=Span{v1,,vj}\text{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_j\} = \text{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_j\}. The span is preserved because each uj\mathbf{u}_j is a linear combination of vj\mathbf{v}_j and the earlier ui\mathbf{u}_i's (which are themselves combinations of v1,,vj1\mathbf{v}_1, \dots, \mathbf{v}_{j-1}).

Normalization

After computing the orthogonal set {u1,,uk}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}, normalization produces an orthonormal set:

qi=uiui\mathbf{q}_i = \frac{\mathbf{u}_i}{\|\mathbf{u}_i\|}


Each vector is divided by its length, making qi=1\|\mathbf{q}_i\| = 1 while preserving direction. The resulting set satisfies qiqj=δij\mathbf{q}_i \cdot \mathbf{q}_j = \delta_{ij}.

Normalization can be done at each step (normalize uj\mathbf{u}_j immediately before moving to vj+1\mathbf{v}_{j+1}) or all at the end. The result is the same in exact arithmetic. In practice, normalizing at each step is slightly preferable for numerical stability, as it keeps the vectors well-scaled throughout the computation.

Worked Example: Three Vectors in R³

Orthogonalize v1=(1,1,1)\mathbf{v}_1 = (1, 1, 1), v2=(1,0,1)\mathbf{v}_2 = (1, 0, 1), v3=(0,1,1)\mathbf{v}_3 = (0, 1, 1).

u1=(1,1,1)\mathbf{u}_1 = (1, 1, 1).

For u2\mathbf{u}_2: u1v2u1u1=1+0+13=23\frac{\mathbf{u}_1 \cdot \mathbf{v}_2}{\mathbf{u}_1 \cdot \mathbf{u}_1} = \frac{1 + 0 + 1}{3} = \frac{2}{3}. u2=(1,0,1)23(1,1,1)=(13,23,13)\mathbf{u}_2 = (1, 0, 1) - \frac{2}{3}(1, 1, 1) = (\frac{1}{3}, -\frac{2}{3}, \frac{1}{3}).

Check: u1u2=1323+13=0\mathbf{u}_1 \cdot \mathbf{u}_2 = \frac{1}{3} - \frac{2}{3} + \frac{1}{3} = 0.

For u3\mathbf{u}_3: u1v3u1u1=0+1+13=23\frac{\mathbf{u}_1 \cdot \mathbf{v}_3}{\mathbf{u}_1 \cdot \mathbf{u}_1} = \frac{0 + 1 + 1}{3} = \frac{2}{3} and u2v3u2u2=023+1319+49+19=1369=12\frac{\mathbf{u}_2 \cdot \mathbf{v}_3}{\mathbf{u}_2 \cdot \mathbf{u}_2} = \frac{0 - \frac{2}{3} + \frac{1}{3}}{\frac{1}{9} + \frac{4}{9} + \frac{1}{9}} = \frac{-\frac{1}{3}}{\frac{6}{9}} = \frac{-1}{2}.

u3=(0,1,1)23(1,1,1)(12)(13,23,13)=(0,1,1)(23,23,23)+(16,13,16)=(12,0,12)\mathbf{u}_3 = (0, 1, 1) - \frac{2}{3}(1, 1, 1) - (-\frac{1}{2})(\frac{1}{3}, -\frac{2}{3}, \frac{1}{3}) = (0, 1, 1) - (\frac{2}{3}, \frac{2}{3}, \frac{2}{3}) + (\frac{1}{6}, -\frac{1}{3}, \frac{1}{6}) = (-\frac{1}{2}, 0, \frac{1}{2}).

Check: u1u3=12+0+12=0\mathbf{u}_1 \cdot \mathbf{u}_3 = -\frac{1}{2} + 0 + \frac{1}{2} = 0 and u2u3=16+0+16=0\mathbf{u}_2 \cdot \mathbf{u}_3 = -\frac{1}{6} + 0 + \frac{1}{6} = 0.

Normalizing: u1=3\|\mathbf{u}_1\| = \sqrt{3}, u2=6/3\|\mathbf{u}_2\| = \sqrt{6}/3, u3=1/2\|\mathbf{u}_3\| = 1/\sqrt{2}. The orthonormal basis is {u1/3,  3u2/6,  2u3}\{\mathbf{u}_1/\sqrt{3}, \; 3\mathbf{u}_2/\sqrt{6}, \; \sqrt{2}\,\mathbf{u}_3\}.

Why It Works

At each step, uj\mathbf{u}_j is constructed as vj\mathbf{v}_j minus everything in vj\mathbf{v}_j that lies in the subspace Wj1=Span{u1,,uj1}W_{j-1} = \text{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_{j-1}\}. Since {u1,,uj1}\{\mathbf{u}_1, \dots, \mathbf{u}_{j-1}\} is orthogonal, the projection formula decomposes cleanly into independent terms — one projection per basis vector.

What remains after subtraction is the component of vj\mathbf{v}_j orthogonal to Wj1W_{j-1}. This component is nonzero because vjWj1\mathbf{v}_j \notin W_{j-1} — guaranteed by the independence of the original set.

The span is preserved at each step. Each uj\mathbf{u}_j is a linear combination of vj\mathbf{v}_j and u1,,uj1\mathbf{u}_1, \dots, \mathbf{u}_{j-1}, and each earlier ui\mathbf{u}_i is a combination of v1,,vi\mathbf{v}_1, \dots, \mathbf{v}_i. So ujSpan{v1,,vj}\mathbf{u}_j \in \text{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_j\}, and the reverse inclusion follows because vj\mathbf{v}_j can be recovered from uj\mathbf{u}_j and the earlier ui\mathbf{u}_i's.

The QR Decomposition

Applying Gram-Schmidt to the columns a1,,an\mathbf{a}_1, \dots, \mathbf{a}_n of an m×nm \times n matrix AA (with independent columns) produces the QR decomposition:

A=QRA = QR


QQ is m×nm \times n with orthonormal columns (the normalized qi\mathbf{q}_i's). RR is n×nn \times n upper triangular with positive diagonal entries.

The entries of RR are the dot products computed during Gram-Schmidt: Rij=qiajR_{ij} = \mathbf{q}_i \cdot \mathbf{a}_j for iji \leq j, and Rij=0R_{ij} = 0 for i>ji > j (because aj\mathbf{a}_j's projection onto qi\mathbf{q}_i is zero when i>ji > j — that direction hasn't been subtracted yet).

The factorization captures two complementary pieces of information. QQ stores the orthonormal directions. RR stores the coefficients that express the original columns in terms of those directions: aj=R1jq1+R2jq2++Rjjqj\mathbf{a}_j = R_{1j}\mathbf{q}_1 + R_{2j}\mathbf{q}_2 + \cdots + R_{jj}\mathbf{q}_j.
Component of A = Q R Description Properties / formula
Q  (m × n) columns are the normalized Gram–Schmidt outputs qj = uj / ‖uj Qᵀ Q = In; Q Qᵀ is the projection matrix onto Col(A)
R  (n × n) upper triangular matrix of coefficients tying the original columns back to the qi's Rij = qi · aj for i ≤ j; Rij = 0 for i > j
Diagonal entry Rjj the length of the j-th Gram–Schmidt output Rjj = ‖uj‖ > 0; positivity gives uniqueness of QR
Off-diagonal entry Rij  (i < j) dot products computed during Gram–Schmidt Rij = qi · aj — coefficient of qi in the expansion of aj
Column-recovery formula how each original column aj is recovered from Q and R aj = R1j q1 + R2j q2 + … + Rjj qj

QR and Least Squares

The QR decomposition provides a numerically superior method for solving least-squares problems.

The normal equations ATAx^=ATbA^TA\hat{\mathbf{x}} = A^T\mathbf{b} can be rewritten using A=QRA = QR. Since ATA=RTQTQR=RTRA^TA = R^TQ^TQR = R^TR and ATb=RTQTbA^T\mathbf{b} = R^TQ^T\mathbf{b}, the normal equations become RTRx^=RTQTbR^TR\hat{\mathbf{x}} = R^TQ^T\mathbf{b}. Canceling RTR^T (which is invertible since RR has positive diagonal):

Rx^=QTbR\hat{\mathbf{x}} = Q^T\mathbf{b}


This is an upper triangular system, solved by back substitution. The computation QTbQ^T\mathbf{b} is just nn dot products (one per column of QQ).

The QR approach avoids forming ATAA^TA explicitly. This matters because the condition number of ATAA^TA is the square of the condition number of AA — squaring amplifies rounding errors. Working with QQ and RR directly preserves the original conditioning and is the standard method for least squares in numerical software.

Numerical Stability

Classical Gram-Schmidt (as presented above) can lose orthogonality in floating-point arithmetic. When the input vectors are nearly dependent, the computed uj\mathbf{u}_j's may fail to be perpendicular to machine precision, and the errors accumulate with each step.

Modified Gram-Schmidt addresses this by reorganizing the computation. Instead of computing all projections using the original vj\mathbf{v}_j, modified Gram-Schmidt updates vj\mathbf{v}_j in place after each projection is subtracted. At step jj, first subtract the projection onto u1\mathbf{u}_1 from vj\mathbf{v}_j, then subtract the projection onto u2\mathbf{u}_2 from the updated vj\mathbf{v}_j, and so on. The mathematical result is identical in exact arithmetic, but the modified version is significantly more stable numerically.

Householder reflections provide an even more robust alternative for computing the QR factorization. Householder-based QR achieves backward stability — the gold standard in numerical linear algebra — and is the default algorithm in most software libraries.
Method How it computes the orthogonal set / QR Numerical stability Typical use
Classical Gram–Schmidt at step j, all projections of the original vj onto u1, …, uj−1 are computed and subtracted at once poor — orthogonality can be lost when input vectors are nearly dependent; errors accumulate textbook / illustrative use; abstract inner product spaces where exact arithmetic is available
Modified Gram–Schmidt vj is updated in place after each individual projection is subtracted significantly better than classical; identical result in exact arithmetic educational software, moderate-precision floating-point work
Householder reflections a sequence of orthogonal reflections is applied to A to zero out below-diagonal entries column by column backward stable — the gold standard in numerical linear algebra default QR routine in LAPACK, NumPy, MATLAB, and most numerical libraries
Givens rotations a sequence of 2 × 2 plane rotations zeros out individual entries one at a time backward stable; lets the algorithm target nonzero entries only sparse matrices with structured zeros; updating an existing QR factorization

Gram-Schmidt on Abstract Inner Product Spaces

The algorithm works in any inner product space — the dot product is replaced by the general inner product ,\langle \cdot, \cdot \rangle, and the formulas are otherwise identical:

uj=vji=1j1ui,vjui,uiui\mathbf{u}_j = \mathbf{v}_j - \sum_{i=1}^{j-1} \frac{\langle \mathbf{u}_i, \mathbf{v}_j \rangle}{\langle \mathbf{u}_i, \mathbf{u}_i \rangle}\,\mathbf{u}_i


Orthogonalizing {1,x,x2}\{1, x, x^2\} in the polynomial space P2\mathcal{P}_2 with the inner product p,q=11p(x)q(x)dx\langle p, q \rangle = \int_{-1}^{1} p(x)q(x)\,dx produces the Legendre polynomials (up to normalization): P0(x)=1P_0(x) = 1, P1(x)=xP_1(x) = x, P2(x)=12(3x21)P_2(x) = \frac{1}{2}(3x^2 - 1). The polynomial xx is already orthogonal to 11 under this inner product (the integral of an odd function over a symmetric interval is zero), so the first subtraction has no effect.

On the function space C[0,2π]C[0, 2\pi] with f,g=02πf(x)g(x)dx\langle f, g \rangle = \int_0^{2\pi} f(x)g(x)\,dx, orthogonalizing appropriate function sets produces Fourier bases. The algorithm is identical in structure to the Rn\mathbb{R}^n version — only the inner product changes.

Summary: Gram-Schmidt Across Settings

Gram-Schmidt operates uniformly across many settings — ℝⁿ with the standard dot product, polynomial spaces with integral inner products, function spaces in Fourier analysis, and abstract inner product spaces in general. The table below collects the main contexts in which the process is applied, alongside the input, the inner product used, and what the algorithm produces in each case.
Input Inner product / setting Output of Gram–Schmidt What is obtained
Independent vectors v1, …, vk in ℝⁿ standard dot product u · v orthogonal set {u1, …, uk} with same span at every step orthogonal basis for Span{v1, …, vk}
Same input, plus normalization step qj = uj / ‖uj standard dot product orthonormal set {q1, …, qk} with qi · qj = δij orthonormal basis for the same subspace
Columns of a full-column-rank matrix A (m × n) standard dot product on column vectors Q (m × n, orthonormal columns) and R (n × n, upper triangular, positive diagonal) thin QR decomposition A = Q R
{1, x, x²} in 𝒫2 ⟨p, q⟩ = ∫−11 p(x) q(x) dx P0(x) = 1, P1(x) = x, P2(x) = ½(3x² − 1) (up to normalization) the first three Legendre polynomials
Selected functions in C[0, 2π] ⟨f, g⟩ = ∫0 f(x) g(x) dx sines and cosines remain orthogonal across the chosen frequencies orthogonal basis underlying Fourier analysis
Independent v1, …, vk in any inner product space V general ⟨·, ·⟩ orthonormal {q1, …, qk} with same span constructive proof that every finite-dimensional inner product space has an orthonormal basis