

Cholesky Decomposition






The Square Root of a Symmetric Positive Definite Matrix

The Cholesky decomposition factors a symmetric positive definite matrix as A = LLᵀ, where L is lower triangular with positive diagonal entries. It exploits symmetry to achieve half the cost of LU, requires no pivoting, and doubles as a positive definiteness test. It is the fastest direct solver for the class of matrices that appears most often in optimization, statistics, and simulation.



What Cholesky Decomposition Is

The Cholesky decomposition writes a symmetric positive definite matrix $A$ as

$$A = LL^T$$

where $L$ is lower triangular with strictly positive diagonal entries. The matrix $L$ is the Cholesky factor — the matrix "square root" of $A$ in the sense that $A$ is reconstructed by multiplying $L$ by its own transpose.

The factorization is unique: for a given symmetric positive definite $A$, there is exactly one lower triangular $L$ with positive diagonal satisfying $A = LL^T$. The factorization can also be written $A = R^TR$ with $R = L^T$ upper triangular — the choice between lower and upper is a matter of convention.

Existence: Symmetric Positive Definite Matrices

The Cholesky factorization exists if and only if $A$ is symmetric and positive definite. Symmetry means $A = A^T$. Positive definiteness means $\mathbf{x}^T A \mathbf{x} > 0$ for every nonzero vector $\mathbf{x}$.

Several equivalent conditions characterize positive definiteness: all eigenvalues of $A$ are strictly positive; all leading principal minors (the determinants of the upper-left $k \times k$ submatrices) are positive; all pivots in Gaussian elimination (without pivoting) are positive.
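As a quick numerical sketch (using NumPy, with a small example matrix chosen here for illustration), the eigenvalue and leading-principal-minor conditions can be checked directly:

```python
import numpy as np

# A small symmetric matrix to test (chosen for illustration).
A = np.array([[ 4.0, 2.0, -2.0],
              [ 2.0, 5.0,  1.0],
              [-2.0, 1.0,  6.0]])

# Condition 1: all eigenvalues strictly positive
# (eigvalsh is the symmetric-matrix eigenvalue routine).
print(np.all(np.linalg.eigvalsh(A) > 0))                    # True

# Condition 2: all leading principal minors positive.
minors = [np.linalg.det(A[:k, :k]) for k in range(1, 4)]
print(all(m > 0 for m in minors))                           # True
```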

If either symmetry or positive definiteness fails, the Cholesky algorithm breaks down. A non-symmetric matrix has no $LL^T$ form. A symmetric matrix that is only positive semi-definite (some eigenvalue is zero) produces a zero on the diagonal of $L$. An indefinite matrix (eigenvalues of both signs) produces a negative number under the square root, halting the algorithm.

The Algorithm

The Cholesky factor $L$ is computed column by column, from left to right.

For column $j$, the diagonal entry is

$$l_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1} l_{jk}^2}$$

and the sub-diagonal entries (for $i > j$) are

$$l_{ij} = \frac{1}{l_{jj}}\left(a_{ij} - \sum_{k=1}^{j-1} l_{ik} l_{jk}\right)$$


The square root in the diagonal formula is always real and positive because positive definiteness guarantees that the expression under the root is strictly positive at every step.
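The column-by-column recurrence translates almost directly into code. The following is a minimal sketch in Python/NumPy (the function name `cholesky` and the error handling are choices made here, not a standard API):

```python
import numpy as np

def cholesky(A):
    """Compute the lower triangular factor L with A = L @ L.T.

    A direct transcription of the column-by-column formulas; raises
    ValueError when A is not positive definite.
    """
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for j in range(n):
        # Diagonal entry: subtract the squares of the entries already
        # computed in row j, then take the square root.
        d = A[j, j] - np.dot(L[j, :j], L[j, :j])
        if d <= 0:
            raise ValueError("matrix is not positive definite")
        L[j, j] = np.sqrt(d)
        # Sub-diagonal entries of column j.
        for i in range(j + 1, n):
            L[i, j] = (A[i, j] - np.dot(L[i, :j], L[j, :j])) / L[j, j]
    return L
```

In practice one would call `np.linalg.cholesky` (backed by LAPACK) rather than a hand-written loop; this sketch only mirrors the formulas above.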

Worked Example


For

$$A = \begin{pmatrix} 4 & 2 & -2 \\ 2 & 5 & 1 \\ -2 & 1 & 6 \end{pmatrix}$$

the factor is computed column by column:

Column 1: $l_{11} = \sqrt{4} = 2$, $l_{21} = 2/2 = 1$, $l_{31} = -2/2 = -1$.

Column 2: $l_{22} = \sqrt{5 - 1^2} = \sqrt{4} = 2$, $l_{32} = (1 - (-1)(1))/2 = 2/2 = 1$.

Column 3: $l_{33} = \sqrt{6 - (-1)^2 - 1^2} = \sqrt{4} = 2$.

$$L = \begin{pmatrix} 2 & 0 & 0 \\ 1 & 2 & 0 \\ -1 & 1 & 2 \end{pmatrix}$$

Verification: $LL^T = \begin{pmatrix} 4 & 2 & -2 \\ 2 & 5 & 1 \\ -2 & 1 & 6 \end{pmatrix} = A$.
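The hand computation can be confirmed in a couple of lines with NumPy, whose `cholesky` routine returns the lower triangular factor:

```python
import numpy as np

A = np.array([[ 4.0, 2.0, -2.0],
              [ 2.0, 5.0,  1.0],
              [-2.0, 1.0,  6.0]])

L = np.linalg.cholesky(A)        # lower triangular Cholesky factor
print(L)                         # matches the hand computation above
print(np.allclose(L @ L.T, A))   # True
```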

Solving Systems with Cholesky

Given $A = LL^T$, the system $A\mathbf{x} = \mathbf{b}$ is solved in two steps:

Forward substitution: solve $L\mathbf{y} = \mathbf{b}$ for $\mathbf{y}$. Cost: $O(n^2)$.

Back substitution: solve $L^T\mathbf{x} = \mathbf{y}$ for $\mathbf{x}$. Cost: $O(n^2)$.

The structure is identical to LU, but only one triangular factor needs to be stored. The other is its transpose, which costs nothing extra — the same array of numbers is read in reverse order.
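A minimal sketch of the two triangular solves in Python/NumPy (the helper name `solve_cholesky` is a choice made here; SciPy's `cho_solve` provides the same functionality in production code). Note that no transposed copy of $L$ is ever formed — the back substitution simply reads column $i$ of $L$ as row $i$ of $L^T$:

```python
import numpy as np

def solve_cholesky(L, b):
    """Solve A x = b given the Cholesky factor L (A = L @ L.T)."""
    n = len(b)
    # Forward substitution: L y = b.
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - np.dot(L[i, :i], y[:i])) / L[i, i]
    # Back substitution: L^T x = y. Row i of L^T is column i of L,
    # so L[i+1:, i] supplies the entries to the right of the diagonal.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - np.dot(L[i + 1:, i], x[i + 1:])) / L[i, i]
    return x
```

With the factor from the worked example and any right-hand side $\mathbf{b}$, `solve_cholesky(L, b)` returns the solution of $A\mathbf{x} = \mathbf{b}$.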

Computational Cost

The Cholesky factorization requires roughly $\frac{1}{3}n^3$ arithmetic operations — exactly half the cost of LU factorization ($\frac{2}{3}n^3$). The saving comes from symmetry: the lower triangle of $A$ determines the upper triangle ($a_{ij} = a_{ji}$), so only half the entries need processing.

Each triangular solve costs $O(n^2)$. For a single system, the total is $\frac{1}{3}n^3 + O(n^2)$. For $k$ systems sharing the same coefficient matrix, the cost is $\frac{1}{3}n^3 + 2kn^2$.

For large symmetric positive definite systems, Cholesky is the fastest direct solver available. It is roughly twice as fast as LU and requires roughly half the storage (only the lower triangular factor $L$ is stored).

No Pivoting Needed

Unlike LU, the Cholesky algorithm never requires row swaps. Positive definiteness guarantees that the quantity under the square root is strictly positive at every step — no zero or negative pivots can occur.

This makes the algorithm simpler (no permutation matrix to track), more stable (no near-zero pivots to amplify errors), and more predictable (the algorithm either runs to completion or breaks down, with no ambiguity).

If the algorithm encounters a non-positive value under the square root, the matrix is not positive definite. The breakdown happens at the first index $j$ where the leading $j \times j$ principal submatrix fails to be positive definite. This makes Cholesky an efficient positive definiteness test: attempt the factorization and check whether it succeeds.

Cholesky as a Positive Definiteness Test

The Cholesky algorithm tests positive definiteness as a side effect of factoring. If $A = LL^T$ completes with all diagonal entries of $L$ strictly positive, $A$ is positive definite. If the algorithm breaks down at any step, $A$ is not positive definite.

This is often more practical than computing all eigenvalues (also $O(n^3)$ but with a larger constant) or checking all leading principal minors (which requires $n$ determinant computations).

The test is binary: the factorization either succeeds or fails. When it succeeds, $L$ is available for immediate use in system solving. When it fails, the index at which it breaks down indicates which leading submatrix is the first to lose positive definiteness.
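The try-and-see test is a few lines in most environments. A sketch with NumPy, whose `cholesky` raises `LinAlgError` when the factorization breaks down:

```python
import numpy as np

def is_positive_definite(A):
    """Test positive definiteness by attempting a Cholesky factorization."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

print(is_positive_definite(np.array([[4.0, 2.0], [2.0, 5.0]])))  # True
print(is_positive_definite(np.array([[1.0, 2.0], [2.0, 1.0]])))  # False (indefinite)
```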

Where Cholesky Appears

Symmetric positive definite matrices appear throughout applied mathematics, and Cholesky is the default solver for all of them.

The normal equations $A^TA\hat{\mathbf{x}} = A^T\mathbf{b}$ produce a symmetric positive definite matrix $A^TA$ (when $A$ has full column rank). Cholesky on $A^TA$ solves least squares, though QR is preferred for numerical stability.

Covariance matrices in statistics are symmetric positive semi-definite (positive definite when no variable is a linear combination of the others). Cholesky is used to sample from multivariate normal distributions: if $\Sigma = LL^T$, then $L\mathbf{z}$ (with $\mathbf{z}$ standard normal) has distribution $\mathcal{N}(\mathbf{0}, \Sigma)$.
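A sketch of the sampling trick with NumPy (the $2 \times 2$ covariance matrix here is a hypothetical example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Target covariance (a hypothetical example, positive definite).
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

L = np.linalg.cholesky(Sigma)        # Sigma = L @ L.T

# Draw standard normal vectors z and transform: x = L z has
# covariance L I L^T = Sigma.
z = rng.standard_normal((2, 100_000))
x = L @ z

print(np.cov(x))   # close to Sigma
```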

Stiffness matrices in finite element analysis are symmetric positive definite. Newton's method in optimization solves $H\mathbf{d} = -\nabla f$, where $H$ (the Hessian) is positive definite at a strict local minimum. In both cases, Cholesky is the standard solver.

Cholesky and LU: The Relationship

For a symmetric positive definite matrix, LU factorization without pivoting always succeeds and produces $A = L_U D L_U^T$, where $L_U$ is unit lower triangular and $D$ is diagonal with positive entries. This is the $LDL^T$ decomposition.

The Cholesky factor is $L_{\text{Chol}} = L_U D^{1/2}$ — the unit lower triangular factor with $\sqrt{D}$ absorbed into it. Equivalently, $A = (L_U D^{1/2})(L_U D^{1/2})^T = LL^T$.

Cholesky is the symmetric-positive-definite specialization of LU. It exploits $A = A^T$ to halve the work and the storage, and it exploits positive definiteness to eliminate the need for pivoting. The $LDL^T$ variant avoids square roots entirely (computing $D$ and $L_U$ separately) and is sometimes preferred when square roots are expensive or when the matrix is only positive semi-definite.
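The relationship can be checked numerically by splitting the Cholesky factor into a unit triangular part and a diagonal scaling. A sketch with NumPy, reusing the worked example matrix:

```python
import numpy as np

A = np.array([[ 4.0, 2.0, -2.0],
              [ 2.0, 5.0,  1.0],
              [-2.0, 1.0,  6.0]])

L = np.linalg.cholesky(A)

# Split L = L_U @ sqrt(D): dividing column j by l_jj gives a unit
# diagonal, and D collects the squared diagonal entries.
sqrt_d = np.diag(L)
L_U = L / sqrt_d                 # broadcast divides each column j by l_jj
D = np.diag(sqrt_d ** 2)

# A = L_U @ D @ L_U.T is exactly the LDL^T form.
print(np.allclose(L_U @ D @ L_U.T, A))   # True
```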