

Lower-Upper Decomposition






Gaussian Elimination as a Matrix Factorization

The LU decomposition factors a square matrix into a lower triangular factor L and an upper triangular factor U. The upper factor is the row echelon form; the lower factor stores the multipliers used to get there. Once computed, the factorization converts every subsequent system solve into two cheap triangular substitutions — making LU the workhorse of direct linear system solvers.



What LU Decomposition Is

The LU decomposition writes an $n \times n$ matrix $A$ as

$$A = LU$$

where $L$ is lower triangular with ones on the diagonal (unit lower triangular) and $U$ is upper triangular. The matrix $U$ is the row echelon form of $A$, and $L$ stores the multipliers that Gaussian elimination used to produce it.

The factorization captures the entire elimination process in a reusable form. Instead of performing elimination from scratch for every new right-hand side $\mathbf{b}$, the work is done once (producing $L$ and $U$) and reused cheaply for each solve.

Construction from Gaussian Elimination

Forward elimination on $A$ applies a sequence of row-addition operations, each represented by an elementary matrix $E_i$. The product $E_k \cdots E_2 E_1 A = U$ reduces $A$ to upper triangular form.

Rearranging: $A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} U$. Each $E_i$ is a lower triangular matrix that adds a multiple of one row to a row below. Its inverse simply negates the multiplier. The product $L = E_1^{-1} E_2^{-1} \cdots E_k^{-1}$ is lower triangular, with the multipliers sitting in the sub-diagonal positions.
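This construction translates directly into code. The sketch below is a minimal unpivoted version (assuming every pivot turns out nonzero, and using `lu_nopivot` as an illustrative name): each row addition updates a working copy toward $U$, while the multiplier lands in the corresponding sub-diagonal slot of $L$.

```python
import numpy as np

def lu_nopivot(A):
    """Unpivoted LU: returns unit lower triangular L and upper triangular U
    with A = L @ U. Assumes every pivot encountered is nonzero."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    for j in range(n):                  # eliminate below pivot (j, j)
        for i in range(j + 1, n):
            m = U[i, j] / U[j, j]       # multiplier m_ij
            U[i, :] -= m * U[j, :]      # the row addition E_i
            L[i, j] = m                 # E_i^{-1} contributes +m to L
    return L, U

A = np.array([[2., 1., -1.], [4., 0., 2.], [-2., 5., 3.]])
L, U = lu_nopivot(A)
print(np.allclose(L @ U, A))  # True
```

The double loop makes the $\frac{2}{3}n^3$ cost visible: each of the $n$ pivot columns triggers $O(n)$ row updates of length $O(n)$.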

Worked Example


$$A = \begin{pmatrix} 2 & 1 & -1 \\ 4 & 0 & 2 \\ -2 & 5 & 3 \end{pmatrix}$$


Subtract 2 times row 1 from row 2 (multiplier $m_{21} = 2$) and add row 1 to row 3 (multiplier $m_{31} = -1$):

$$\begin{pmatrix} 2 & 1 & -1 \\ 0 & -2 & 4 \\ 0 & 6 & 2 \end{pmatrix}$$


Add 3 times row 2 to row 3 (multiplier $m_{32} = -3$):

$$U = \begin{pmatrix} 2 & 1 & -1 \\ 0 & -2 & 4 \\ 0 & 0 & 14 \end{pmatrix}$$


The multipliers fill $L$: $L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -3 & 1 \end{pmatrix}$.

Verification: $LU = \begin{pmatrix} 2 & 1 & -1 \\ 4 & 0 & 2 \\ -2 & 5 & 3 \end{pmatrix} = A$.
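The verification is easy to reproduce numerically (a quick sketch; the factors are typed in from the hand computation above):

```python
import numpy as np

# factors computed by hand in the worked example
L = np.array([[1., 0., 0.], [2., 1., 0.], [-1., -3., 1.]])
U = np.array([[2., 1., -1.], [0., -2., 4.], [0., 0., 14.]])
A = np.array([[2., 1., -1.], [4., 0., 2.], [-2., 5., 3.]])

print(np.allclose(L @ U, A))  # True
```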

The Structure of L and U

The lower factor $L$ is unit lower triangular: ones on the diagonal and multipliers below. Entry $l_{ij}$ (with $i > j$) is the multiplier used to eliminate position $(i, j)$ during forward elimination. The diagonal is always ones because no row scaling is performed, only row additions.

The upper factor $U$ is the echelon form: pivots on the diagonal, zeros below, and the result of all elimination steps. The diagonal entries of $U$ are the pivots, and their product gives the determinant: $\det(A) = \det(L)\det(U) = 1 \cdot u_{11}u_{22}\cdots u_{nn}$.

The factorization stores $L$ and $U$ compactly. Since $L$ has ones on the diagonal (which need not be stored) and $U$ has zeros below the diagonal (which need not be stored), both factors fit in a single $n \times n$ array: the lower triangle holds the multipliers and the upper triangle holds $U$.
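The packed layout can be sketched as an in-place variant of elimination (`lu_packed` is an illustrative name; this is the storage scheme used by LAPACK-style routines):

```python
import numpy as np

def lu_packed(A):
    """In-place LU: on return, the strict lower triangle of M holds the
    multipliers of L, and the upper triangle (with diagonal) holds U."""
    M = A.astype(float).copy()
    n = M.shape[0]
    for j in range(n):
        for i in range(j + 1, n):
            M[i, j] /= M[j, j]                    # store multiplier in place
            M[i, j+1:] -= M[i, j] * M[j, j+1:]    # update the rest of the row
    return M

A = np.array([[2., 1., -1.], [4., 0., 2.], [-2., 5., 3.]])
M = lu_packed(A)
L = np.tril(M, -1) + np.eye(3)   # unpack: unit diagonal is implied
U = np.triu(M)
print(np.allclose(L @ U, A))  # True
```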

When LU Exists Without Pivoting

The factorization $A = LU$ without row swaps exists if and only if every leading principal submatrix of $A$ is nonsingular. The $k$-th leading principal submatrix is the upper-left $k \times k$ block of $A$, and its determinant must be nonzero for $k = 1, 2, \dots, n$.
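The condition can be tested directly, if inefficiently, by checking the determinants of the leading blocks (a sketch for illustration only; numerical codes do not test existence this way, they simply pivot):

```python
import numpy as np

def lu_exists_without_pivoting(A, tol=1e-12):
    """True when every leading principal submatrix of A is nonsingular."""
    n = A.shape[0]
    return all(abs(np.linalg.det(A[:k, :k])) > tol for k in range(1, n + 1))

print(lu_exists_without_pivoting(np.array([[2., 1.], [4., 3.]])))  # True
print(lu_exists_without_pivoting(np.array([[0., 1.], [1., 0.]])))  # False
```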

When a zero appears in a pivot position during elimination, no row-addition operation can produce a nonzero pivot; a row swap is required. The simple $A = LU$ factorization breaks down, and the pivoted version $PA = LU$ is needed.

For example, $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ has no $LU$ factorization without pivoting: the $(1,1)$ entry is zero, and no multiple of row 1 can create a nonzero pivot. But after swapping the two rows, elimination proceeds immediately.

Partial Pivoting: PA = LU

Partial pivoting modifies the elimination process: at each step, the row with the largest absolute value in the current pivot column (among rows at or below the pivot position) is swapped into the pivot position. All row swaps are recorded in a permutation matrix $P$.

The factorization becomes $PA = LU$, where $P$ is the product of all row-swap permutation matrices. This factorization exists for every invertible matrix: partial pivoting eliminates the restriction on leading principal submatrices.

Partial pivoting also improves numerical stability. Small pivots amplify rounding errors (dividing by a number near zero magnifies imprecision), and selecting the largest available pivot keeps the multipliers in $L$ bounded by 1 in absolute value, limiting error accumulation.

In numerical software, LU with partial pivoting is the default; the unpivoted version $A = LU$ is a theoretical simplification that is rarely used in practice.
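A minimal sketch of elimination with partial pivoting, returning $P$, $L$, $U$ with $PA = LU$ (illustrative only; production code would call a library routine such as LAPACK's `getrf`). The subtlety is that a row swap must also swap the already-computed multipliers in $L$:

```python
import numpy as np

def lu_partial_pivot(A):
    """Return P, L, U with P @ A = L @ U, choosing the largest-magnitude
    pivot candidate in each column (partial pivoting)."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    P = np.eye(n)
    for j in range(n):
        p = j + np.argmax(np.abs(U[j:, j]))   # row of largest candidate
        if p != j:
            U[[j, p], :] = U[[p, j], :]       # swap rows of U and P, and
            P[[j, p], :] = P[[p, j], :]       # the finished columns of L
            L[[j, p], :j] = L[[p, j], :j]
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]       # |multiplier| <= 1 by choice of pivot
            U[i, :] -= L[i, j] * U[j, :]
    return P, L, U

A = np.array([[2., 1., -1.], [4., 0., 2.], [-2., 5., 3.]])
P, L, U = lu_partial_pivot(A)
print(np.allclose(P @ A, L @ U))  # True
```

Because every pivot is the largest entry in its column, every stored multiplier satisfies $|l_{ij}| \le 1$, which is exactly the stability property described above.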

Solving Systems with LU

Given $PA = LU$, the system $A\mathbf{x} = \mathbf{b}$ is solved in two steps.

Forward substitution: solve $L\mathbf{y} = P\mathbf{b}$ for $\mathbf{y}$. Since $L$ is lower triangular, this starts at the top and works downward; each equation involves one new unknown plus previously solved values. Cost: $O(n^2)$.

Back substitution: solve $U\mathbf{x} = \mathbf{y}$ for $\mathbf{x}$. Since $U$ is upper triangular, this starts at the bottom and works upward. Cost: $O(n^2)$.

The factorization itself costs about $\frac{2}{3}n^3$ operations. Each subsequent solve costs $O(n^2)$. For $k$ systems with the same coefficient matrix but different right-hand sides $\mathbf{b}_1, \dots, \mathbf{b}_k$: factor once at cost $\frac{2}{3}n^3$, then solve $k$ times at about $2n^2$ operations each (one forward and one back substitution). When $k$ is large, the per-system cost drops to essentially $O(n^2)$, far cheaper than $k$ independent eliminations at $\frac{2}{3}n^3$ each.
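The two substitution passes can be sketched as follows (a minimal version; `solve_lu` is an illustrative name, and the hand-computed factors of the worked example are reused with $P = I$ since that elimination needed no swaps):

```python
import numpy as np

def solve_lu(P, L, U, b):
    """Solve A x = b given P A = L U: forward then back substitution."""
    n = len(b)
    Pb = P @ b
    y = np.zeros(n)
    for i in range(n):                        # forward: top to bottom
        y[i] = Pb[i] - L[i, :i] @ y[:i]       # diagonal of L is 1
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):            # back: bottom to top
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

# factors of the 3x3 worked example; no pivoting was needed, so P = I
L = np.array([[1., 0., 0.], [2., 1., 0.], [-1., -3., 1.]])
U = np.array([[2., 1., -1.], [0., -2., 4.], [0., 0., 14.]])
A = L @ U
b = np.array([1., 2., 3.])
x = solve_lu(np.eye(3), L, U, b)
print(np.allclose(A @ x, b))  # True
```

Each loop body touches one row of a triangular matrix, so both passes together cost about $2n^2$ operations, matching the count above.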

LU and the Determinant

The determinant of $A$ is a free byproduct of the LU factorization. Since $\det(L) = 1$ (the diagonal is all ones) and $\det(U) = u_{11}u_{22}\cdots u_{nn}$ (the product of the diagonal for a triangular matrix):

$$\det(A) = \det(L)\det(U) = u_{11}u_{22}\cdots u_{nn}$$


With pivoting, each row swap flips the sign: $\det(A) = (-1)^s u_{11}u_{22}\cdots u_{nn}$, where $s$ is the number of row swaps recorded in $P$.

This makes determinant computation essentially free once LU is available: just multiply the diagonal of $U$ and account for the sign. It is far more efficient than cofactor expansion, and it is the method every numerical library uses.
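A sketch of this computation, using the 3×3 worked example (no swaps occurred there, so $P = I$ and $s = 0$; for a permutation matrix, $\det(P)$ is exactly $(-1)^s$):

```python
import numpy as np

def det_from_lu(P, U):
    """det(A) = (-1)^s * prod(diag(U)) given P A = L U."""
    sign = round(np.linalg.det(P))   # +1 or -1 for a permutation matrix
    return sign * np.prod(np.diag(U))

# U from the 3x3 worked example: det(A) = 2 * (-2) * 14 = -56
U = np.array([[2., 1., -1.], [0., -2., 4.], [0., 0., 14.]])
print(det_from_lu(np.eye(3), U))  # -56.0
```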

LU and the Inverse

To compute $A^{-1}$, solve $A\mathbf{x}_j = \mathbf{e}_j$ for each standard basis vector $\mathbf{e}_j$, $j = 1, \dots, n$. The solutions $\mathbf{x}_1, \dots, \mathbf{x}_n$ are the columns of $A^{-1}$.

With LU: factor $A$ once ($\frac{2}{3}n^3$), then solve $n$ systems ($n \times O(n^2) = O(n^3)$). Total cost: roughly $\frac{8}{3}n^3$, cheaper than $n$ independent eliminations.

In practice, computing $A^{-1}$ explicitly is rarely the right approach. Solving $A\mathbf{x} = \mathbf{b}$ via LU is faster than multiplying $A^{-1}\mathbf{b}$, and more numerically stable. The inverse formula is primarily a theoretical tool; for computation, LU solves are preferred.
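The factor-once, solve-per-column pattern can be sketched with SciPy's pivoted interface (assuming `scipy` is available; `lu_factor` does the $O(n^3)$ factorization once, and each `lu_solve` is a cheap pair of triangular substitutions):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2., 1., -1.], [4., 0., 2.], [-2., 5., 3.]])
lu, piv = lu_factor(A)            # one pivoted factorization of A
inv = np.column_stack([lu_solve((lu, piv), e) for e in np.eye(3)])
print(np.allclose(A @ inv, np.eye(3)))  # True
```

The same `(lu, piv)` pair would serve any other right-hand side, which is why libraries expose the factorization as a reusable object rather than computing the inverse.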

Worked Example: 4×4 with Pivoting

$$A = \begin{pmatrix} 0 & 2 & 1 & 3 \\ 1 & 0 & 3 & -1 \\ 3 & 1 & 0 & 2 \\ 2 & 3 & 1 & 0 \end{pmatrix}$$


The $(1,1)$ entry is zero; a row swap is needed. The largest entry in column 1 is 3, in row 3. Swap rows 1 and 3:

$$\begin{pmatrix} 3 & 1 & 0 & 2 \\ 1 & 0 & 3 & -1 \\ 0 & 2 & 1 & 3 \\ 2 & 3 & 1 & 0 \end{pmatrix}$$


Eliminate below the $(1,1)$ pivot. Multipliers: $m_{21} = 1/3$, $m_{31} = 0$, $m_{41} = 2/3$:

$$\begin{pmatrix} 3 & 1 & 0 & 2 \\ 0 & -1/3 & 3 & -5/3 \\ 0 & 2 & 1 & 3 \\ 0 & 7/3 & 1 & -4/3 \end{pmatrix}$$


The largest entry in column 2 below position $(2,2)$ is $7/3$, in row 4. Swap rows 2 and 4. Continue elimination on columns 2 and 3 with the appropriate multipliers.

After completing all steps, the factorization $PA = LU$ is assembled with $P$ recording both row swaps, $L$ storing all multipliers, and $U$ storing the final upper triangular result. Verification: $LU = PA$.
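The assembled factorization can be checked with SciPy (assuming `scipy` is available; note that `scipy.linalg.lu` returns its permutation in the convention $A = PLU$, so its transpose plays the role of the $PA = LU$ permutation used here):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[0., 2., 1., 3.], [1., 0., 3., -1.],
              [3., 1., 0., 2.], [2., 3., 1., 0.]])
P, L, U = lu(A)                     # SciPy convention: A = P @ L @ U
print(np.allclose(P @ L @ U, A))    # True
print(np.allclose(P.T @ A, L @ U))  # True: P.T gives the PA = LU form
```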

Computational Cost

The LU factorization of an $n \times n$ matrix requires roughly $\frac{2}{3}n^3$ arithmetic operations (multiplications and additions). Each triangular solve (forward or back substitution) requires roughly $n^2$ operations.

For a single system $A\mathbf{x} = \mathbf{b}$, LU costs about the same as Gaussian elimination applied directly. The advantage appears with multiple right-hand sides: $k$ systems sharing the same $A$ cost $\frac{2}{3}n^3 + 2kn^2$, versus $\frac{2}{3}kn^3$ for $k$ independent eliminations.

Compared to alternatives: Cramer's rule costs $O(n \cdot n!)$, absurdly expensive. Explicit inverse computation costs roughly $2n^3$. The Cholesky factorization costs $\frac{1}{3}n^3$ but requires symmetry and positive definiteness. LU is the general-purpose baseline: the standard direct solver for dense linear systems in scientific computing.