Matrices support a family of operations — addition, scalar multiplication, matrix multiplication, transposition, and exponentiation — each with its own rules and dimension requirements. Matrix multiplication stands apart from the rest: it is not commutative, it demands compatible dimensions, and it admits several geometric and algebraic interpretations that make it one of the richest operations in all of mathematics.
Matrix Addition
Two matrices of the same size can be added entry by entry. If A and B are both m×n, their sum is the m×n matrix with entries
$$(A+B)_{ij} = a_{ij} + b_{ij}$$
For example,
$$\begin{pmatrix} 1 & 0 & 3 \\ -2 & 4 & 5 \end{pmatrix} + \begin{pmatrix} 3 & 2 & 0 \\ 6 & -1 & -4 \end{pmatrix} = \begin{pmatrix} 4 & 2 & 3 \\ 4 & 3 & 1 \end{pmatrix}$$
If the dimensions do not match, the sum is undefined — there is no way to add a 2×3 matrix to a 3×2 matrix.
Addition is commutative ($A+B=B+A$) and associative ($(A+B)+C=A+(B+C)$). The zero matrix $O$ of the same size serves as the additive identity ($A+O=A$), and the additive inverse of $A$ is $-A=(-a_{ij})$, so $A+(-A)=O$.
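As a quick numerical check, here is a minimal sketch, assuming NumPy is available (an assumption; any array library works), that verifies the sum above entry by entry along with commutativity.

```python
import numpy as np

A = np.array([[1, 0, 3],
              [-2, 4, 5]])
B = np.array([[3, 2, 0],
              [6, -1, -4]])

print(A + B)                          # [[4 2 3]
                                      #  [4 3 1]]
print(np.array_equal(A + B, B + A))   # True: addition is commutative
```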
Matrix Subtraction
Subtraction is defined as addition of the negative:
$$A - B = A + (-B)$$
Entry by entry, $(A-B)_{ij} = a_{ij} - b_{ij}$. The same dimension requirement applies — both matrices must have identical shapes. There is nothing deeper here than combining addition and negation, but it appears often enough to warrant its own notation.
Scalar Multiplication
Multiplying a matrix by a scalar c scales every entry:
$$(cA)_{ij} = c \cdot a_{ij}$$
For example,
$$-2\begin{pmatrix} 1 & 0 & 3 \\ 5 & -4 & 2 \end{pmatrix} = \begin{pmatrix} -2 & 0 & -6 \\ -10 & 8 & -4 \end{pmatrix}$$
Scalar multiplication distributes over matrix addition (c(A+B)=cA+cB), distributes over scalar addition ((c+d)A=cA+dA), associates with itself (c(dA)=(cd)A), and has 1 as its identity (1⋅A=A). Multiplying by 0 produces the zero matrix.
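The scaling example and one of the distributive laws are easy to confirm numerically; a brief sketch follows, again assuming NumPy, with the scalars chosen arbitrarily for illustration.

```python
import numpy as np

A = np.array([[1, 0, 3],
              [5, -4, 2]])
c, d = -2, 3

print(c * A)                                        # [[ -2   0  -6]
                                                    #  [-10   8  -4]]
print(np.array_equal((c + d) * A, c * A + d * A))   # True: (c+d)A = cA + dA
```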
Linear Combinations of Matrices
Given matrices $A_1, A_2, \ldots, A_k$ of the same size and scalars $c_1, c_2, \ldots, c_k$, the expression
$$c_1 A_1 + c_2 A_2 + \cdots + c_k A_k$$
is a linear combination of matrices. Addition and scalar multiplication together give the set of all m×n matrices the structure of a vector space. The dimension of this space is mn — one degree of freedom for each entry. The standard basis consists of the mn matrices that have a single 1 in one position and zeros everywhere else.
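To make the standard-basis claim concrete, the sketch below (a NumPy illustration; the helper name `basis` is hypothetical, introduced only here) rebuilds a 2×3 matrix as a linear combination of the six basis matrices, weighted by its own entries.

```python
import numpy as np

def basis(i, j, m=2, n=3):
    """Standard basis matrix E_ij: a single 1 in position (i, j), zeros elsewhere."""
    E = np.zeros((m, n))
    E[i, j] = 1.0
    return E

A = np.array([[1.0, 0.0, 3.0],
              [-2.0, 4.0, 5.0]])

# A equals the linear combination  sum over i,j of a_ij * E_ij.
recon = sum(A[i, j] * basis(i, j) for i in range(2) for j in range(3))
print(np.array_equal(recon, A))   # True
```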
Matrix Multiplication — Definition
For A of size m×n and B of size n×p, the product AB is an m×p matrix whose (i,j) entry is the dot product of row i of A with column j of B:
$$(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$
The number of columns of A must equal the number of rows of B. If this compatibility condition fails, the product is undefined.
Worked Example
$$\begin{pmatrix} 1 & 0 & 3 \\ 2 & -1 & 4 \end{pmatrix} \begin{pmatrix} 5 & 1 \\ 2 & -3 \\ 0 & 6 \end{pmatrix}$$
The left matrix is 2×3 and the right is 3×2, so the product is 2×2. Computing each entry:
$$(AB)_{11} = (1)(5)+(0)(2)+(3)(0) = 5, \qquad (AB)_{12} = (1)(1)+(0)(-3)+(3)(6) = 19$$
$$(AB)_{21} = (2)(5)+(-1)(2)+(4)(0) = 8, \qquad (AB)_{22} = (2)(1)+(-1)(-3)+(4)(6) = 29$$
$$AB = \begin{pmatrix} 5 & 19 \\ 8 & 29 \end{pmatrix}$$
Each entry required n=3 multiplications and n−1=2 additions. The full product required m×p=4 such computations.
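The same product can be checked mechanically. Here is a minimal sketch, assuming NumPy, that reproduces the worked example and pulls out one entry as an explicit row-times-column dot product.

```python
import numpy as np

A = np.array([[1, 0, 3],
              [2, -1, 4]])      # 2x3
B = np.array([[5, 1],
              [2, -3],
              [0, 6]])          # 3x2

print(A @ B)                    # [[ 5 19]
                                #  [ 8 29]]

# Entry (1,2) of AB: row 1 of A dotted with column 2 of B (0-indexed below).
print(np.dot(A[0, :], B[:, 1])) # 19
```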
Matrix Multiplication — Properties
Matrix multiplication obeys several familiar algebraic rules and violates one that is deeply ingrained from scalar arithmetic.
Associativity holds: (AB)C=A(BC) whenever all products are defined. Distribution holds on both sides: A(B+C)=AB+AC and (A+B)C=AC+BC. Scalars pass through freely: c(AB)=(cA)B=A(cB). The identity matrix satisfies AI=IA=A whenever the dimensions are compatible.
Commutativity, however, fails. In general, $AB \neq BA$, even when both products happen to be defined. For a concrete counterexample, take $A = \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 0 \\ 3 & 4 \end{pmatrix}$. Then $AB = \begin{pmatrix} 6 & 8 \\ 0 & 0 \end{pmatrix}$ while $BA = \begin{pmatrix} 0 & 0 \\ 3 & 6 \end{pmatrix}$.
Two further properties distinguish matrix multiplication from scalar multiplication. The product of two nonzero matrices can be zero: if $A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & -4 \\ -1 & 2 \end{pmatrix}$, then $AB = O$ even though neither $A$ nor $B$ is zero. Cancellation also fails: $AB = AC$ does not imply $B = C$ unless $A$ is invertible.
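Both counterexamples are easy to verify numerically; a minimal sketch, assuming NumPy, follows.

```python
import numpy as np

# Non-commutativity: AB and BA differ.
A = np.array([[1, 2],
              [0, 0]])
B = np.array([[0, 0],
              [3, 4]])
print(A @ B)    # [[6 8]
                #  [0 0]]
print(B @ A)    # [[0 0]
                #  [3 6]]

# Zero divisors: two nonzero matrices whose product is the zero matrix.
C = np.array([[1, 2],
              [2, 4]])
D = np.array([[2, -4],
              [-1, 2]])
print(C @ D)    # [[0 0]
                #  [0 0]]
```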
Matrix Multiplication — Column and Row Interpretations
The entry-by-entry formula is the most common way to define matrix multiplication, but two alternative viewpoints often provide sharper insight.
The column interpretation says that column j of AB is obtained by multiplying A times column j of B:
$$AB = \begin{pmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_p \end{pmatrix}$$
Each column of the product is a linear combination of the columns of A, with weights given by the corresponding column of B. This is the view that connects matrix multiplication to linear transformations: the product AB applies the transformation A to each column of B independently.
The row interpretation says that row i of AB equals row i of A times the entire matrix B. Each row of the product is a linear combination of the rows of B, weighted by the entries in the corresponding row of A.
A third perspective writes the product as a sum of rank-one outer products:
$$AB = \sum_{k=1}^{n} (\text{column } k \text{ of } A)(\text{row } k \text{ of } B)$$
Each term is an m×p matrix of rank at most one, and their sum is the full product. This decomposition appears in low-rank approximation theory and in the analysis of the singular value decomposition.
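All three viewpoints compute the same product. The sketch below, assuming NumPy and reusing the worked example from earlier, builds AB column by column, row by row, and as a sum of outer products, then checks that the results agree.

```python
import numpy as np

A = np.array([[1, 0, 3],
              [2, -1, 4]])
B = np.array([[5, 1],
              [2, -3],
              [0, 6]])

# Column view: column j of AB is A times column j of B.
by_cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

# Row view: row i of AB is row i of A times B.
by_rows = np.vstack([A[i, :] @ B for i in range(A.shape[0])])

# Outer-product view: sum over k of (column k of A)(row k of B).
by_outer = sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))

print(np.array_equal(by_cols, A @ B),
      np.array_equal(by_rows, A @ B),
      np.array_equal(by_outer, A @ B))   # True True True
```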
The Transpose
The transpose of an m×n matrix $A$ is the n×m matrix $A^T$ obtained by converting rows into columns:
$$(A^T)_{ij} = a_{ji}$$
For example,
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \quad\Longrightarrow\quad A^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}$$
The transpose satisfies $(A^T)^T = A$, distributes over addition ($(A+B)^T = A^T + B^T$), and commutes with scalar multiplication ($(cA)^T = cA^T$). The product rule reverses the order:
$$(AB)^T = B^T A^T$$
This reversal is a frequent source of errors and is worth memorizing as a pattern: transposing a product is like reading it backward.
A matrix satisfying $A = A^T$ is called symmetric. For any matrix $A$ of any shape, the products $A^T A$ and $A A^T$ are both symmetric — this is immediate from the product rule, since $(A^T A)^T = A^T (A^T)^T = A^T A$.
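A short sketch, assuming NumPy (the matrix B is an arbitrary 3×4 example chosen only for illustration), checks the reversal rule and the symmetry of $A^T A$.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.arange(12).reshape(3, 4)               # any 3x4 matrix works here

print(A.T)                                    # [[1 4]
                                              #  [2 5]
                                              #  [3 6]]
print(np.array_equal((A @ B).T, B.T @ A.T))   # True: (AB)^T = B^T A^T
print(np.array_equal(A.T @ A, (A.T @ A).T))   # True: A^T A is symmetric
```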
Matrix Powers
For a square matrix A, powers are defined by repeated multiplication:
$$A^0 = I, \qquad A^1 = A, \qquad A^k = \underbrace{A \cdot A \cdots A}_{k \text{ factors}}$$
The usual exponent laws hold: $A^j A^k = A^{j+k}$ and $(A^j)^k = A^{jk}$. When $A$ is invertible, negative powers are defined as $A^{-k} = (A^{-1})^k$, extending the exponent laws to all integers.
One rule from scalar arithmetic does not carry over. Since matrix multiplication is not commutative, the identity $(AB)^k = A^k B^k$ is false in general. Expanding $(AB)^2 = ABAB$, there is no way to rearrange this into $A^2 B^2 = AABB$ without commutativity.
Powers of specific matrix types are particularly well-behaved. For a diagonal matrix $D = \operatorname{diag}(d_1, \ldots, d_n)$, the k-th power is $D^k = \operatorname{diag}(d_1^k, \ldots, d_n^k)$ — each diagonal entry is raised to the k-th power independently. This simplicity is one of the main reasons diagonalization is so useful: writing $A = PDP^{-1}$ gives $A^k = PD^kP^{-1}$, reducing an expensive matrix power to a cheap diagonal power.
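The diagonal case and the diagonalization shortcut can both be checked directly. The sketch below assumes NumPy; the matrix P is an arbitrary invertible choice made only for this illustration.

```python
import numpy as np

D = np.diag([2.0, 3.0, 5.0])
print(np.linalg.matrix_power(D, 4))        # diag(16, 81, 625)

P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])            # any invertible matrix works
A = P @ D @ np.linalg.inv(P)

lhs = np.linalg.matrix_power(A, 4)         # repeated multiplication
rhs = P @ np.linalg.matrix_power(D, 4) @ np.linalg.inv(P)
print(np.allclose(lhs, rhs))               # True: A^k = P D^k P^{-1}
```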
Elementary Matrices
An elementary matrix is the result of performing a single row operation on the identity matrix. There are three types, corresponding to the three row operations: swapping two rows, multiplying a row by a nonzero scalar, and adding a multiple of one row to another.
The key property is that left-multiplying a matrix A by an elementary matrix E performs the corresponding row operation on A. If E swaps rows 2 and 3 of the identity, then EA swaps rows 2 and 3 of A. If E scales row 1 of the identity by 5, then EA scales row 1 of A by 5.
Every elementary matrix is invertible, and its inverse is another elementary matrix of the same type: the inverse of a row swap is the same row swap, the inverse of scaling by k is scaling by 1/k, and the inverse of adding c times row i to row j is subtracting c times row i from row j.
This leads to a structural result: every invertible matrix can be written as a product of elementary matrices. Since Gaussian elimination reduces an invertible matrix to the identity through a sequence of row operations, each operation corresponds to an elementary matrix, and reversing the sequence expresses the original matrix as their product. This factorization is more conceptual than computational, but it underpins the theoretical foundations of the determinant and the inverse.
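The row-operation effect and the inverse relationship are easy to see numerically; a minimal sketch, assuming NumPy and using 0-indexed rows in the code, follows.

```python
import numpy as np

A = np.array([[1, 0, 3],
              [2, -1, 4],
              [0, 5, 6]])

# E1: swap rows 2 and 3 of the identity (indices 1 and 2 below).
E1 = np.eye(3)
E1[[1, 2]] = E1[[2, 1]]
print(E1 @ A)               # rows 2 and 3 of A are swapped

# E2: add -2 times row 1 to row 2.
E2 = np.eye(3)
E2[1, 0] = -2.0
print(E2 @ A)               # row 2 becomes (row 2) - 2*(row 1)

# The inverse elementary matrix undoes the operation: it adds +2 times row 1 to row 2.
print(np.linalg.inv(E2))
```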
Matrix Decompositions
A matrix decomposition (or factorization) expresses a matrix as a product of simpler matrices with known structure. Decompositions are among the most powerful tools in computational linear algebra, converting hard problems into sequences of easy ones.
The LU decomposition writes A=LU where L is lower triangular and U is upper triangular. It captures the essence of Gaussian elimination in matrix form and makes solving linear systems with multiple right-hand sides efficient: once L and U are known, each system reduces to two triangular solves.
The QR decomposition writes A=QR where Q is orthogonal and R is upper triangular. It is the foundation of least-squares computation and several eigenvalue algorithms.
The Cholesky decomposition writes $A = LL^T$ for symmetric positive definite matrices, achieving the work of LU in roughly half the computation by exploiting symmetry.
The eigendecomposition writes $A = PDP^{-1}$ where D is diagonal, placing the eigenvalues on the diagonal and the eigenvectors in the columns of P. It applies only to diagonalizable matrices.
The singular value decomposition writes $A = U\Sigma V^T$ where U and V are orthogonal and $\Sigma$ is diagonal with nonnegative entries. Unlike the eigendecomposition, the SVD exists for every matrix of every shape. It reveals the rank, the fundamental subspaces, and the best low-rank approximation to A, making it one of the most broadly applicable tools in the subject.
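As a quick illustration of how these factorizations are computed in practice, here is a sketch that assumes NumPy and SciPy are available (SciPy is an assumption; NumPy alone covers QR, Cholesky, and the SVD). It factors a small symmetric positive definite matrix each way and verifies that the factors multiply back to A.

```python
import numpy as np
from scipy.linalg import lu, qr, cholesky

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])                  # symmetric positive definite

P, L, U = lu(A)                             # A = P L U  (LU with partial pivoting)
Q, R = qr(A)                                # A = Q R
C = cholesky(A, lower=True)                 # A = C C^T
U_svd, s, Vt = np.linalg.svd(A)             # A = U diag(s) V^T

print(np.allclose(P @ L @ U, A),
      np.allclose(Q @ R, A),
      np.allclose(C @ C.T, A),
      np.allclose(U_svd @ np.diag(s) @ Vt, A))   # True True True True
```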
Each of these decompositions has its own page with full derivations and worked examples.