Linear Algebra Formulas

257 formulas

Representation

Operations

Norm & Distance

Dot Product

Cross Product

Triple Products

Vector Space

Subspaces

Span

Linear Independence

Basis & Coordinates

Dimension & Rank

Transpose & Symmetry

Special Matrix Types

Inverse

Rank

Trace

Definitions

Cofactor Structure

Cofactor Expansion

Row Operation Effects

Algebraic Properties

Special Determinants

Adjugate & Inverse

Linear Systems

Geometric Interpretation

Eigenvalue Connection

Standard Forms

Echelon Forms

Elementary Row Operations

Solvability

Homogeneous Systems

Definition & Properties

Matrix Representation

Image & Kernel

Similarity & Basis Change

Geometric Transformations

Foundation

Characteristic Polynomial

Multiplicities

Eigenvalue Algebra

Diagonalizability

Special Spectra

Spectral

Applications

Complex

Inner Product

Orthogonal Complement

Orthogonal Sets

Projection

Gram-Schmidt

Least Squares

LU

Cholesky

QR

SVD

Cross-Decomposition

Representation

(4 formulas)

Vector Component Form

\mathbf{v} = (v_1, v_2, \ldots, v_n) \in \mathbb{R}^n

What Is a Vector

An $n$-dimensional vector is specified as an ordered tuple of $n$ real numbers. Each entry $v_i$ is a component, encoding signed displacement along the $i$-th coordinate axis. The component form bridges the geometric picture (an arrow with magnitude and direction) and the algebraic object on which all subsequent operations act.

Standard Basis Decomposition

\mathbf{v} = v_1 \mathbf{e}_1 + v_2 \mathbf{e}_2 + \cdots + v_n \mathbf{e}_n

Notation and Representation

Every vector in $\mathbb{R}^n$ is a linear combination of the standard basis vectors $\mathbf{e}_i$, with the components $v_i$ themselves serving as the coefficients. The decomposition makes the link between coordinates and basis explicit: coordinates are the weights in the expansion against a fixed reference frame.

Direction Cosines

\cos\alpha = \frac{v_1}{\|\mathbf{v}\|}, \quad \cos\beta = \frac{v_2}{\|\mathbf{v}\|}, \quad \cos\gamma = \frac{v_3}{\|\mathbf{v}\|}

Direction Cosines

The direction cosines are the cosines of the angles $\alpha$, $\beta$, $\gamma$ that a nonzero vector makes with the three positive coordinate axes in $\mathbb{R}^3$. They package directional information into three scalars and depend only on the unit vector $\hat{\mathbf{v}} = \mathbf{v}/\|\mathbf{v}\|$, not on the length of $\mathbf{v}$.

Direction Cosines Identity

\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1

Direction Cosines Identity

The three direction cosines of a vector in $\mathbb{R}^3$ are not independent: their squares sum to $1$. This identity follows directly from $\|\hat{\mathbf{v}}\|^2 = 1$ when $\hat{\mathbf{v}}$ is expressed in components, because the components of $\hat{\mathbf{v}}$ are exactly $(\cos\alpha, \cos\beta, \cos\gamma)$.
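As a quick numerical check of the identity, the sketch below computes the direction cosines of one sample vector. The function name `direction_cosines` and the sample vector are illustrative choices, not part of the original page.

```python
import math

def direction_cosines(v):
    """Return (cos alpha, cos beta, cos gamma) for a nonzero vector in R^3."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("direction cosines are undefined for the zero vector")
    # Each cosine is the matching component of the unit vector v / ||v||.
    return tuple(x / norm for x in v)

cosines = direction_cosines((1.0, 2.0, 2.0))   # this vector has norm 3
identity = sum(c * c for c in cosines)          # equals 1 up to rounding
```

Scaling the input by any positive constant leaves `cosines` unchanged, which matches the claim that direction cosines depend only on direction.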

Operations

(15 formulas)

Vector Addition

\mathbf{a} + \mathbf{b} = (a_1 + b_1,\ a_2 + b_2,\ \ldots,\ a_n + b_n)

Vector Addition

Addition pairs corresponding components and sums them. The result is again a vector in the same $\mathbb{R}^n$. Geometrically, the tip-to-tail and parallelogram constructions produce the same sum.

Vector Subtraction

\mathbf{a} - \mathbf{b} = \mathbf{a} + (-\mathbf{b}) = (a_1 - b_1,\ a_2 - b_2,\ \ldots,\ a_n - b_n)

Vector Subtraction

Subtraction is addition combined with negation. Geometrically, with $\mathbf{a}$ and $\mathbf{b}$ drawn from a common tail, the difference $\mathbf{a} - \mathbf{b}$ runs from the tip of $\mathbf{b}$ to the tip of $\mathbf{a}$. This connects subtraction directly to distance: $d(\mathbf{a}, \mathbf{b}) = \|\mathbf{a} - \mathbf{b}\|$.

Scalar Multiplication of Vectors

c\mathbf{a} = (ca_1,\ ca_2,\ \ldots,\ ca_n)

Scalar Multiplication

Multiplying a vector by a scalar $c$ scales every component by $c$. Geometrically, the result has length $|c|\,\|\mathbf{a}\|$ and points in the same direction as $\mathbf{a}$ when $c > 0$, the opposite direction when $c < 0$, and collapses to $\mathbf{0}$ when $c = 0$.

Linear Combination

c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k \mathbf{v}_k

Linear Combinations

A linear combination is a weighted sum of vectors, with scalar weights $c_i \in \mathbb{R}$. Vector addition and scalar multiplication are both special cases. Asking whether a vector $\mathbf{b}$ is a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_k$ is equivalent to asking whether the system $A\mathbf{x} = \mathbf{b}$ has a solution, where $A$ has the $\mathbf{v}_i$ as columns.

Span

\text{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\} = \{c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k \mid c_i \in \mathbb{R}\}

Span

The span is the set of every linear combination of the given vectors. It is always a subspace of $\mathbb{R}^n$ containing the zero vector. The geometric shape reflects the input: a line for one nonzero vector, a plane for two non-parallel vectors, all of $\mathbb{R}^n$ for $n$ linearly independent vectors.

Matrix Equality

A = B \iff a_{ij} = b_{ij} \text{ for all } i, j

Matrix Equality and the Zero Matrix

Two matrices are equal precisely when they share the same dimensions and every corresponding entry matches. A single mismatched entry breaks equality. Matrices of different shapes are never equal regardless of contents: a $2 \times 3$ matrix cannot equal a $3 \times 2$ matrix.

Matrix Addition

(A + B)_{ij} = a_{ij} + b_{ij}

Matrix Addition

Matrix addition is performed entry by entry. The sum has the same shape as the operands. Equipped with addition and scalar multiplication, the set of $m \times n$ matrices forms a vector space of dimension $mn$.

Matrix Subtraction

(A - B)_{ij} = a_{ij} - b_{ij}

Matrix Subtraction

Subtraction is defined as addition of the additive inverse: $A - B = A + (-B)$. Entry by entry, this corresponds to subtracting matching components.

Scalar Multiplication of Matrices

(cA)_{ij} = c \cdot a_{ij}

Scalar Multiplication

Multiplying a matrix by a scalar scales every entry by that scalar. Together with addition, this operation gives the matrix space its vector space structure.

Matrix Multiplication

(AB)_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}

Matrix Multiplication — Definition

The $(i,j)$ entry of the product is the dot product of row $i$ of $A$ with column $j$ of $B$. The number of columns of $A$ must equal the number of rows of $B$: if $A$ is $m \times n$ and $B$ is $n \times p$, the result has dimensions $m \times p$.
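The entry formula translates almost directly into code. The following is a minimal sketch using plain lists of rows; the helper name `matmul` and the sample matrices are assumptions made for illustration.

```python
def matmul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2, 3],
     [4, 5, 6]]        # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]           # 3 x 2
C = matmul(A, B)       # 2 x 2
```

Here the shared inner dimension is 3, so the product of a 2 x 3 and a 3 x 2 matrix is 2 x 2.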

Matrix-Vector Product (Column Form)

A\mathbf{x} = x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n

Matrices as Collections of Vectors

The product $A\mathbf{x}$ is a linear combination of the columns of $A$, weighted by the entries of $\mathbf{x}$. This single observation underlies the theory of linear systems, transformations, and the column space.

Matrix Multiplication Associativity

(AB)C = A(BC)

Matrix Multiplication — Properties

Matrix multiplication is associative whenever the products involved are defined. The grouping of three or more factors does not affect the result, allowing chains such as $ABCD$ to be written without parentheses.

Matrix Multiplication Distributivity

A(B + C) = AB + AC, \qquad (A + B)C = AC + BC

Matrix Multiplication — Properties

Matrix multiplication distributes over matrix addition from both sides. Because multiplication is non-commutative, the left- and right-distributive laws are stated separately, but both hold whenever dimensions align.

Matrix Multiplication Non-Commutativity

AB \neq BA \quad \text{in general}

Matrix Multiplication — Properties

Matrix multiplication does not generally commute, even when both products are defined and have matching dimensions. This asymmetry distinguishes matrix algebra from scalar arithmetic and has far-reaching consequences for inverses, powers, and transformations.
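A single pair of $2 \times 2$ matrices is enough to see the failure of commutativity. The sketch below uses an illustrative helper `matmul` and two sample shear matrices chosen for this example.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 1],
     [0, 1]]
B = [[1, 0],
     [1, 1]]

AB = matmul(A, B)   # [[2, 1], [1, 1]]
BA = matmul(B, A)   # [[1, 1], [1, 2]]
```

Both products are defined and have the same shape, yet their entries differ.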

Matrix Power

A^0 = I, \qquad A^k = \underbrace{A \cdot A \cdots A}_{k \text{ factors}}, \qquad A^{-k} = (A^{-1})^k

Matrix Powers

Powers are defined for square matrices through repeated multiplication, with $A^0 = I$ by convention. Negative powers exist only when $A$ is invertible. The standard exponent laws $A^j A^k = A^{j+k}$ and $(A^j)^k = A^{jk}$ hold for all integers when $A$ is invertible, and for non-negative integers in general.

Norm & Distance

(6 formulas)

Euclidean Norm

\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} = \sqrt{\sum_{i=1}^{n} v_i^2}

The General Norm

The norm assigns a single non-negative real number, the length, to every vector in $\mathbb{R}^n$. In $\mathbb{R}^2$ and $\mathbb{R}^3$ this matches the geometric length via the Pythagorean theorem; the formula extends the concept algebraically to any dimension.

Distance Formula

d(\mathbf{a}, \mathbf{b}) = \|\mathbf{a} - \mathbf{b}\| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}

Distance Between Vectors

The Euclidean distance between two vectors is the norm of their difference. For position vectors, this is the straight-line distance between the points they identify. The distance is symmetric, non-negative, and zero only when $\mathbf{a} = \mathbf{b}$.

Vector Normalization

\hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|}

Normalization

Dividing a nonzero vector by its norm produces the unit vector pointing in the same direction. Normalization extracts pure direction by stripping out length information.

Norm Scaling Property

\|c\mathbf{v}\| = |c|\,\|\mathbf{v}\|

Properties of the Norm

Multiplying a vector by a scalar $c$ multiplies its norm by $|c|$. The absolute value is required because a negative scalar reverses direction without producing a negative length.

Triangle Inequality

\|\mathbf{a} + \mathbf{b}\| \leq \|\mathbf{a}\| + \|\mathbf{b}\|

Properties of the Norm

The length of a sum never exceeds the sum of the lengths. Geometrically, the direct path from start to finish in the tip-to-tail construction is no longer than the path that traces both vectors end to end.

Cauchy-Schwarz Inequality

|\mathbf{a} \cdot \mathbf{b}| \leq \|\mathbf{a}\|\,\|\mathbf{b}\|

The Cauchy–Schwarz Inequality

The absolute value of the dot product is bounded by the product of the norms. This bound makes the angle formula well-posed: for nonzero vectors, the ratio $\frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|}$ always lies in $[-1, 1]$, where $\arccos$ is defined.
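In floating-point arithmetic the ratio can drift slightly outside $[-1, 1]$ through rounding, so a common precaution is to clamp it before calling `acos`. The sketch below assumes nonzero inputs; the helper names `dot`, `norm`, and `angle` are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def angle(a, b):
    """Angle in radians between two nonzero vectors."""
    ratio = dot(a, b) / (norm(a) * norm(b))
    # Cauchy-Schwarz puts the exact ratio in [-1, 1]; clamp to absorb rounding.
    return math.acos(max(-1.0, min(1.0, ratio)))

a, b = (1.0, 0.0), (1.0, 1.0)
theta = angle(a, b)                                  # close to pi / 4
bound_holds = abs(dot(a, b)) <= norm(a) * norm(b)    # Cauchy-Schwarz check
```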

Dot Product

(8 formulas)

Dot Product (Algebraic)

\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n = \sum_{i=1}^{n} a_i b_i

Algebraic Definition

The dot product collapses two vectors into a single scalar by multiplying corresponding components and summing the results. Unlike addition or scalar multiplication, the output is not a vector — it is a number that measures how the two vectors relate.

Dot Product (Geometric)

\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|\,\|\mathbf{b}\|\cos\theta

Geometric Definition

The dot product equals the product of the norms scaled by the cosine of the angle between the vectors. This form reveals what the dot product measures — directional alignment. It is positive for acute angles, zero for perpendicular vectors, and negative for obtuse angles.

Angle Between Vectors

\cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\|\,\|\mathbf{b}\|}

The Angle Between Vectors

Solving the geometric form of the dot product for $\cos\theta$ extracts the angle between two vectors from their components. The right side equals $\hat{\mathbf{a}} \cdot \hat{\mathbf{b}}$, the dot product of the normalized vectors, confirming that the angle depends only on direction, not magnitude.

Self Dot Product

\mathbf{v} \cdot \mathbf{v} = v_1^2 + v_2^2 + \cdots + v_n^2 = \|\mathbf{v}\|^2

Properties of the Dot Product

The dot product of a vector with itself equals the square of its norm. This identity ties the algebraic operation directly to length: squared length is not a separate concept but a special case of the dot product. It also gives an alternative norm formula: $\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}}$.

Orthogonality Condition

\mathbf{a} \cdot \mathbf{b} = 0 \iff \mathbf{a} \perp \mathbf{b}

Orthogonality

Two vectors are orthogonal precisely when their dot product is zero. In $\mathbb{R}^2$ and $\mathbb{R}^3$ this matches geometric perpendicularity; in higher dimensions, the algebraic condition serves as the definition. Under this definition the zero vector is orthogonal to every vector, since $\mathbf{0} \cdot \mathbf{v} = 0$.

Scalar Projection

\text{comp}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|}

Orthogonal Projection

The scalar projection is the signed length of the projection of $\mathbf{a}$ onto the direction of $\mathbf{b}$. It is positive when $\mathbf{a}$ leans toward $\mathbf{b}$, negative when it leans away, and zero when the two are orthogonal. The result is a scalar carrying signed-distance information along the direction of $\mathbf{b}$.

Vector Projection

\text{proj}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|^2}\,\mathbf{b} = \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{b} \cdot \mathbf{b}}\,\mathbf{b}

Orthogonal Projection

The vector projection is the component of $\mathbf{a}$ in the direction of $\mathbf{b}$, expressed as an actual vector parallel to $\mathbf{b}$. It is the scalar projection multiplied by the unit vector $\mathbf{b}/\|\mathbf{b}\|$, which produces a vector with the appropriate length and orientation.

Orthogonal Decomposition

\mathbf{a} = \text{proj}_{\mathbf{b}}\,\mathbf{a} + \mathbf{a}_{\perp}, \quad \mathbf{a}_{\perp} = \mathbf{a} - \text{proj}_{\mathbf{b}}\,\mathbf{a}

Orthogonal Projection

For a nonzero $\mathbf{b}$, every vector $\mathbf{a}$ splits uniquely into a component along $\mathbf{b}$ and a component perpendicular to $\mathbf{b}$. The perpendicular part satisfies $\mathbf{a}_{\perp} \cdot \mathbf{b} = 0$. This decomposition underlies Gram–Schmidt orthogonalization and least-squares fitting.
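The projection and decomposition formulas can be checked exactly with rational arithmetic. The sketch below assumes integer components so that `Fraction` can form the ratio directly; `project` and `decompose` are illustrative names.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(a, b):
    """Vector projection of a onto a nonzero vector b (integer components)."""
    scale = Fraction(dot(a, b), dot(b, b))    # (a . b) / (b . b)
    return tuple(scale * x for x in b)

def decompose(a, b):
    """Split a into a part parallel to b and a part perpendicular to b."""
    parallel = project(a, b)
    perpendicular = tuple(x - p for x, p in zip(a, parallel))
    return parallel, perpendicular

a, b = (2, 3), (1, 1)
parallel, perpendicular = decompose(a, b)
```

For this sample pair the parallel part is $(5/2, 5/2)$ and the perpendicular part is $(-1/2, 1/2)$, whose dot product with $\mathbf{b}$ is zero.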

Cross Product

(8 formulas)

Cross Product (Component Form)

\mathbf{a} \times \mathbf{b} = (a_2 b_3 - a_3 b_2,\ a_3 b_1 - a_1 b_3,\ a_1 b_2 - a_2 b_1)

Algebraic Definition

The cross product takes two vectors in $\mathbb{R}^3$ and returns a third vector built from cyclic differences of pairwise products. Unlike the dot product, the output is a vector, perpendicular to both inputs, and the operation is defined exclusively in three dimensions.
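The component formula is short enough to write out directly. The sketch below uses illustrative helper names and sample vectors, and confirms perpendicularity with dot products.

```python
def cross(a, b):
    """Cross product of two vectors in R^3, from the component formula."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a, b = (1, 2, 3), (4, 5, 6)
n = cross(a, b)              # (-3, 6, -3)
checks = (dot(n, a), dot(n, b))   # both zero: n is perpendicular to a and b
```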

Cross Product (Determinant Form)

\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}

Algebraic Definition

This is a symbolic $3 \times 3$ determinant with the standard basis vectors in the first row. Cofactor expansion along that row reproduces the component formula term by term. Placing vectors in the top row makes this a notational device rather than a true determinant, but it is the standard mnemonic for organizing the six products and their signs.

Cross Product Magnitude

\|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\|\,\|\mathbf{b}\|\sin\theta

Geometric Interpretation

The length of $\mathbf{a} \times \mathbf{b}$ equals the area of the parallelogram spanned by $\mathbf{a}$ and $\mathbf{b}$. The magnitude peaks at $\theta = \pi/2$ (rectangle, maximum area) and vanishes at $\theta = 0$ or $\pi$ (parallel vectors, no area). This complements the dot product, which involves $\cos\theta$ rather than $\sin\theta$.

Standard Basis Cross Products

\mathbf{i} \times \mathbf{j} = \mathbf{k}, \quad \mathbf{j} \times \mathbf{k} = \mathbf{i}, \quad \mathbf{k} \times \mathbf{i} = \mathbf{j}

Standard Basis Cross Products

The cross products of the standard basis vectors follow a cyclic loop $\mathbf{i} \to \mathbf{j} \to \mathbf{k} \to \mathbf{i}$. Going forward around the loop yields a positive basis vector; reversing the order negates the result. Any cross product in $\mathbb{R}^3$ reduces to these nine cases via distributivity.

Cross Product Anti-Commutativity

\mathbf{a} \times \mathbf{b} = -(\mathbf{b} \times \mathbf{a})

Properties of the Cross Product

Swapping the operands reverses the output. In the right-hand rule, exchanging which vector the fingers follow and which they curl toward sends the thumb the other way. This contrasts sharply with the dot product, where order does not matter.

Parallelism Test (Cross Product)

\mathbf{a} \times \mathbf{b} = \mathbf{0} \iff \mathbf{a} \parallel \mathbf{b}

Parallel Vectors and the Cross Product

Two vectors in $\mathbb{R}^3$ are parallel exactly when their cross product is the zero vector. The geometric reason: parallel vectors form an angle of $0$ or $\pi$, making $\sin\theta = 0$ and collapsing the cross product magnitude. The zero vector is parallel to every vector by convention.

Vector Triple Product

\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = (\mathbf{a} \cdot \mathbf{c})\,\mathbf{b} - (\mathbf{a} \cdot \mathbf{b})\,\mathbf{c}

Properties of the Cross Product

The triple cross product reduces to a linear combination of the inner two vectors, with coefficients given by dot products. The mnemonic "BAC minus CAB" captures the order: the middle vector $\mathbf{b}$ comes first, scaled by $\mathbf{a} \cdot \mathbf{c}$. The identity is useful for simplifying nested cross products without expanding components.

Lagrange Identity

(\mathbf{a} \times \mathbf{b}) \cdot (\mathbf{c} \times \mathbf{d}) = (\mathbf{a} \cdot \mathbf{c})(\mathbf{b} \cdot \mathbf{d}) - (\mathbf{a} \cdot \mathbf{d})(\mathbf{b} \cdot \mathbf{c})

Properties of the Cross Product

The dot product of two cross products expands into a $2 \times 2$ determinant of dot products. Setting $\mathbf{c} = \mathbf{a}$ and $\mathbf{d} = \mathbf{b}$ recovers the magnitude identity $\|\mathbf{a} \times \mathbf{b}\|^2 = \|\mathbf{a}\|^2 \|\mathbf{b}\|^2 - (\mathbf{a} \cdot \mathbf{b})^2$, which is equivalent to $\sin^2\theta = 1 - \cos^2\theta$.
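A numerical spot check of the identity, using integer vectors so that both sides can be compared exactly. The four sample vectors are arbitrary choices for illustration, and a single check is of course not a proof.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a, b, c, d = (1, 2, 3), (4, 5, 6), (7, 8, 10), (1, 0, 2)

# General identity: (a x b) . (c x d) = (a.c)(b.d) - (a.d)(b.c)
lhs = dot(cross(a, b), cross(c, d))
rhs = dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c)

# Special case c = a, d = b: ||a x b||^2 = ||a||^2 ||b||^2 - (a.b)^2
area_sq = dot(cross(a, b), cross(a, b))
area_sq_from_dots = dot(a, a) * dot(b, b) - dot(a, b) ** 2
```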

Triple Products

(4 formulas)

Scalar Triple Product

\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}

The Scalar Triple Product

Three vectors in $\mathbb{R}^3$ combine into a single scalar via a cross product nested inside a dot product. The result is the signed volume of the parallelepiped with the three vectors as edges. The determinant arrangement of the nine components computes the same number.

Parallelogram Area

\text{Area} = \|\mathbf{a} \times \mathbf{b}\|

Geometric Interpretation

The magnitude of the cross product equals the area of the parallelogram with adjacent sides $\mathbf{a}$ and $\mathbf{b}$. The magnitude formula $\|\mathbf{a}\|\,\|\mathbf{b}\|\sin\theta$ is the standard base-times-height formula for parallelogram area, where $\|\mathbf{a}\|$ is the base and $\|\mathbf{b}\|\sin\theta$ is the height perpendicular to $\mathbf{a}$.

Parallelepiped Volume

V = |\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})|

The Scalar Triple Product

The absolute value of the scalar triple product equals the volume of the parallelepiped with edges $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$. The signed version of the triple product encodes orientation: positive for right-handed triples, negative for left-handed. Taking the absolute value strips that information to leave the geometric volume.

Pyramid Volume

V = \tfrac{1}{6}|\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})|

The Scalar Triple Product

The volume of the tetrahedron (triangular pyramid) with edges $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ from a common vertex is one-sixth of the corresponding parallelepiped volume. The factor $\tfrac{1}{6}$ comes from $\tfrac{1}{3}$ (general pyramid: $V = \tfrac{1}{3} \cdot \text{base} \cdot \text{height}$) times $\tfrac{1}{2}$ (the triangular base is half the parallelogram base).
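The two volume formulas share the same scalar triple product, as the sketch below illustrates. The helper name `scalar_triple` and the three edge vectors are illustrative assumptions.

```python
from fractions import Fraction

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scalar_triple(a, b, c):
    """a . (b x c): signed volume of the parallelepiped with edges a, b, c."""
    return sum(x * y for x, y in zip(a, cross(b, c)))

a, b, c = (1, 2, 0), (0, 1, 3), (2, 0, 1)

signed = scalar_triple(a, b, c)               # 13 for these edges
parallelepiped = abs(signed)                  # volume ignores orientation
tetrahedron = Fraction(parallelepiped, 6)     # one-sixth of the box volume
```

Swapping two of the edges flips the sign of `signed` but leaves both volumes unchanged.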

Vector Space

(3 formulas)

Vector Space Axioms

$$\begin{aligned}
&\text{For all } \mathbf{u}, \mathbf{v}, \mathbf{w} \in V \text{ and all } c, d \in \mathbb{F}: \\
&(1)\ \mathbf{u} + \mathbf{v} \in V \quad (2)\ \mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u} \\
&(3)\ (\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}) \\
&(4)\ \exists\, \mathbf{0} \in V: \mathbf{v} + \mathbf{0} = \mathbf{v} \\
&(5)\ \exists\, -\mathbf{v} \in V: \mathbf{v} + (-\mathbf{v}) = \mathbf{0} \\
&(6)\ c\mathbf{v} \in V \quad (7)\ c(d\mathbf{v}) = (cd)\mathbf{v} \\
&(8)\ c(\mathbf{u} + \mathbf{v}) = c\mathbf{u} + c\mathbf{v} \\
&(9)\ (c + d)\mathbf{v} = c\mathbf{v} + d\mathbf{v} \quad (10)\ 1\mathbf{v} = \mathbf{v}
\end{aligned}$$

The Ten Axioms

A vector space over a field $\mathbb{F}$ is a set $V$ with two operations, vector addition and scalar multiplication, satisfying ten axioms. Five govern addition (closure, commutativity, associativity, zero, inverse), and five govern scalar multiplication (closure, scalar associativity, two distributive laws, multiplicative identity). Any structure satisfying all ten inherits the entire theory of linear algebra; any structure violating even one is disqualified.

Scalar-Zero Property

0\mathbf{v} = \mathbf{0}, \quad c\mathbf{0} = \mathbf{0}

Immediate Consequences of the Axioms

The scalar zero annihilates every vector, and any scalar applied to the zero vector returns the zero vector. Both facts are theorems derived from the axioms, not separate assumptions. They establish that the two distinct "zeros" (scalar $0$ in the field, vector $\mathbf{0}$ in $V$) interact through scalar multiplication in the expected way.

Negative One Scalar

(-1)\mathbf{v} = -\mathbf{v}

Immediate Consequences of the Axioms

Scaling a vector by $-1$ produces its additive inverse. This identifies the additive inverse $-\mathbf{v}$ (introduced by axiom 5) with a specific scalar product, removing any ambiguity about what negation means in a vector space.

Subspaces

(2 formulas)

Subspace Test

W \subseteq V \text{ is a subspace} \iff \begin{cases} W \neq \emptyset \\ \mathbf{u}, \mathbf{v} \in W \Rightarrow \mathbf{u} + \mathbf{v} \in W \\ \mathbf{v} \in W,\ c \in \mathbb{F} \Rightarrow c\mathbf{v} \in W \end{cases}

The Subspace Test

A nonempty subset of a vector space is a subspace exactly when it is closed under addition and under scalar multiplication. The other axioms (commutativity, associativity, distributivity) hold automatically because vectors in $W$ are vectors in $V$, and the zero vector and additive inverses follow from scalar closure with $c = 0$ and $c = -1$. Closure is the only thing that can fail.

Subspace Test Combined

W \subseteq V \text{ is a subspace} \iff W \neq \emptyset \text{ and } c\mathbf{u} + d\mathbf{v} \in W \text{ for all } \mathbf{u}, \mathbf{v} \in W,\ c, d \in \mathbb{F}

The Subspace Test

The two closure conditions can be compressed into a single closure-under-linear-combinations condition. Setting $d = 0$ recovers scalar closure; setting $c = d = 1$ recovers additive closure. The combined form is more efficient in proofs and in algorithmic verification.

Span

(3 formulas)

Span (Set Definition)

\text{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\} = \left\{c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k \mid c_i \in \mathbb{F}\right\}

Definition

The span of a finite set of vectors is the set of all linear combinations of those vectors. As the scalar coefficients range over $\mathbb{F}$, the span sweeps out an entire subspace. By convention, $\text{Span}\,\emptyset = \{\mathbf{0}\}$.

Span Membership Criterion

\mathbf{b} \in \text{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\} \iff A\mathbf{c} = \mathbf{b} \text{ is consistent}

Testing Whether a Vector Is in a Span

Testing whether $\mathbf{b}$ lies in the span of given vectors reduces to a linear-system solvability question. Arrange the spanning vectors as columns of a matrix $A$; then $\mathbf{b}$ is in the span if and only if the system $A\mathbf{c} = \mathbf{b}$ has at least one solution. Row-reducing the augmented matrix $[A \mid \mathbf{b}]$ decides the question: a contradiction row $[0\,\cdots\,0 \mid d]$ with $d \neq 0$ means $\mathbf{b} \notin \text{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$, while the absence of such a row means $\mathbf{b} \in \text{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$.
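The row-reduction test can be sketched with exact rational arithmetic, which avoids deciding when a floating-point entry counts as zero. The function name `in_span` and the sample vectors are illustrative; this is a plain forward elimination, not an optimized routine.

```python
from fractions import Fraction

def in_span(vectors, b):
    """Decide whether b is a linear combination of the given vectors."""
    rows, cols = len(b), len(vectors)
    # Augmented matrix [A | b], with the vectors as the columns of A.
    M = [[Fraction(vectors[j][i]) for j in range(cols)] + [Fraction(b[i])]
         for i in range(rows)]
    pivot_row = 0
    for col in range(cols):
        # Look for a nonzero entry in this column at or below pivot_row.
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        # Eliminate the entries below the pivot.
        for r in range(pivot_row + 1, rows):
            factor = M[r][col] / M[pivot_row][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    # A contradiction row has zero coefficients and a nonzero right-hand side.
    return not any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in M)

v1, v2 = (1, 0, 1), (0, 1, 1)
inside = in_span([v1, v2], (2, 3, 5))    # 2*v1 + 3*v2, so consistent
outside = in_span([v1, v2], (2, 3, 4))   # elimination yields a contradiction row
```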

Span Is Smallest Subspace

\text{Span}(K) = \bigcap_{\substack{W \text{ subspace} \\ K \subseteq W}} W

Definition

The span of a set $K$ is the smallest subspace containing $K$. Equivalently, it is the intersection of all subspaces that contain $K$, which is itself a subspace because intersections of subspaces are subspaces. Any subspace containing $K$ must also contain every linear combination of vectors in $K$, so $\text{Span}(K)$ is contained in every such subspace. Since $\text{Span}(K)$ is itself a subspace containing $K$, it is the minimal one.

Linear Independence

(5 formulas)

Linear Independence Equation

\{\mathbf{v}_1, \ldots, \mathbf{v}_k\} \text{ is independent} \iff \bigl(c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0} \Rightarrow c_1 = \cdots = c_k = 0\bigr)

Definition

A set is linearly independent when the only linear combination producing the zero vector is the trivial one — every coefficient must be zero. Any nontrivial relation (some coefficient nonzero) means at least one vector is redundant: it can be expressed as a combination of the others.

Linear Independence Matrix Test

\{\mathbf{v}_1, \ldots, \mathbf{v}_k\} \subset \mathbb{R}^m \text{ is independent} \iff A\mathbf{c} = \mathbf{0} \text{ has only the trivial solution}

Testing Independence: The Homogeneous System

For column vectors in $\mathbb{R}^m$, independence is equivalent to triviality of the homogeneous system whose coefficient matrix has those vectors as columns. Row-reduce $A = [\mathbf{v}_1\ \cdots\ \mathbf{v}_k]$: independence holds if and only if every column has a pivot (no free variables). If any column is free, a nontrivial null-space vector gives a dependence relation.
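The pivot criterion can be sketched by counting pivots during forward elimination. The helper names `pivot_count` and `is_independent` are illustrative, and exact fractions are used so that zero tests are reliable.

```python
from fractions import Fraction

def pivot_count(vectors):
    """Count pivots after forward elimination on the matrix with these columns."""
    rows, cols = len(vectors[0]), len(vectors)
    M = [[Fraction(vectors[j][i]) for j in range(cols)] for i in range(rows)]
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        pivot = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue          # free column: no pivot here
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        for r in range(pivot_row + 1, rows):
            factor = M[r][col] / M[pivot_row][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]
        pivot_row += 1
    return pivot_row

def is_independent(vectors):
    """Independent exactly when every column carries a pivot."""
    return pivot_count(vectors) == len(vectors)

dependent = is_independent([(1, 2, 3), (4, 5, 6), (7, 8, 9)])   # third = 2*second - first
independent = is_independent([(1, 0, 0), (1, 1, 0)])
```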

Linear Independence Determinant Test

\{\mathbf{v}_1, \ldots, \mathbf{v}_n\} \subset \mathbb{R}^n \text{ is independent} \iff \det[\mathbf{v}_1\ \cdots\ \mathbf{v}_n] \neq 0

Testing Independence: The Determinant

When the number of vectors equals the dimension of the ambient space, independence reduces to a single number: the determinant of the matrix whose columns are the vectors. Nonzero determinant means independence; zero determinant means dependence. This follows from the invertible matrix theorem: $A$ is invertible if and only if its columns are independent, if and only if $\det(A) \neq 0$.
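For three vectors in $\mathbb{R}^3$ the test is a single $3 \times 3$ determinant. The sketch below expands along the first row; `det3` and `independent_in_r3` are illustrative names, and integer entries are assumed so the comparison with zero is exact.

```python
def det3(M):
    """Determinant of a 3 x 3 matrix (list of rows), expanded along row 1."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def independent_in_r3(v1, v2, v3):
    """Place the vectors as columns and test for a nonzero determinant."""
    M = [[v1[r], v2[r], v3[r]] for r in range(3)]
    return det3(M) != 0

dependent = independent_in_r3((1, 2, 3), (4, 5, 6), (7, 8, 9))    # determinant 0
independent = independent_in_r3((1, 0, 0), (1, 1, 0), (1, 1, 1))  # determinant 1
```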

Max Independent Set Size

|S| > \dim V \Rightarrow S \text{ is dependent}

Dimension and Independence

See details

explanationconditionsrelated formulasrelated definitions

In an

n

-dimensional vector space, no independent set can have more than

n

elements. Any collection of

n+1

or more vectors is automatically dependent — independence imposes a hard ceiling at the dimension. Conversely, any independent set with exactly

\dim V

elements is automatically a basis: the spanning condition comes for free at the magic count.

Wronskian Test

W(f_1, \ldots, f_n)(x) = \det\begin{pmatrix} f_1(x) & \cdots & f_n(x) \\ f_1'(x) & \cdots & f_n'(x) \\ \vdots & \ddots & \vdots \\ f_1^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x) \end{pmatrix}

The Wronskian Test for Functions

The Wronskian is a determinant built from successive derivatives of $n$ functions, each differentiable at least $n-1$ times. If $W(f_1, \ldots, f_n)(x_0) \neq 0$ at some point $x_0$ of the interval, the functions are linearly independent on the entire interval. The converse does not hold in general: a Wronskian that vanishes everywhere does not by itself prove dependence. The Wronskian provides a usable independence test in function spaces, where the column-matrix approach does not apply directly.
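As a small worked example (the pair of functions is chosen here for illustration), take $f_1(x) = e^{x}$ and $f_2(x) = e^{2x}$. The Wronskian is nonzero at every point, so the two functions are linearly independent on any interval.

```latex
W(e^{x}, e^{2x})(x)
  = \det\begin{pmatrix} e^{x} & e^{2x} \\ e^{x} & 2e^{2x} \end{pmatrix}
  = 2e^{3x} - e^{3x}
  = e^{3x} \neq 0
```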

Basis & Coordinates

(7 formulas)

Basis Definition

\mathcal{B} = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\} \text{ is a basis for } V \iff \begin{cases} \mathcal{B} \text{ is linearly independent} \\ \text{Span}(\mathcal{B}) = V \end{cases}

Basis: Definition

A basis is a set that is independent and spans the space; the two conditions hold simultaneously. Independence guarantees no vector is wasted; spanning guarantees no vector in $V$ is unreachable. Equivalently, a basis is a maximal independent set, or a minimal spanning set.

Unique Basis Representation

\forall\, \mathbf{v} \in V,\ \exists!\, (c_1, \ldots, c_n): \mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n

Unique Representation

See details

explanationderivationrelated formulasrelated definitions

Every vector in

V

admits exactly one representation as a linear combination of basis vectors. Existence follows from the spanning condition, uniqueness from independence: if two coefficient sets both produced

\mathbf{v}

, their difference would give a nontrivial relation equal to

\mathbf{0}

. This unique representation is what makes coordinates well-defined.

Coordinate Vector

[\mathbf{v}]_\mathcal{B} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix} \quad \text{where } \mathbf{v} = c_1\mathbf{v}_1 + \cdots + c_n\mathbf{v}_n

Coordinates

See details

explanationconditionsrelated formulasrelated definitions

The coordinate vector of

\mathbf{v}

relative to basis

\mathcal{B}

packages the unique expansion coefficients into a column in

\mathbb{F}^n

. Coordinates depend on the choice of basis — the same vector has different coordinates in different bases. For the standard basis of

\mathbb{R}^n

, coordinates coincide with components.

Standard Basis (Rn)

\mathbf{e}_i = (0, \ldots, 0, \underset{i\text{-th}}{1}, 0, \ldots, 0), \quad i = 1, \ldots, n

The Standard Basis for Rⁿ

See details

explanationnotationrelated formulasrelated definitions

The standard basis of

\mathbb{R}^n

consists of

n

vectors, each with a

1

in one position and zeros elsewhere. Independence is immediate: no vector is a combination of the others. Spanning follows from

(v_1, \ldots, v_n) = v_1\mathbf{e}_1 + \cdots + v_n\mathbf{e}_n

. Coordinates relative to this basis coincide with the components themselves — the reason it is the default choice.

Change of Basis Formula

[\mathbf{v}]_\mathcal{C} = P_{\mathcal{C} \leftarrow \mathcal{B}}\, [\mathbf{v}]_\mathcal{B}

Change of Basis

See details

explanationnotationrelated formulasrelated definitions

Coordinates of the same vector in two different bases are related by left-multiplication by the change-of-basis matrix. The columns of

P_{\mathcal{C} \leftarrow \mathcal{B}}

are the

\mathcal{C}

-coordinate vectors of each

\mathcal{B}

-basis vector. Knowing

[\mathbf{v}]_\mathcal{B}

, this matrix produces

[\mathbf{v}]_\mathcal{C}

in one multiplication.
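A minimal sketch of this construction in the plane, using exact fractions. The bases `B` and `C` and the helper names (`coords_2d`, `change_of_basis`, `apply_columns`) are hypothetical choices for illustration, and matrices are stored as lists of columns to mirror the description above.

```python
from fractions import Fraction

def coords_2d(basis, v):
    """Return (x, y) with x*basis[0] + y*basis[1] == v, via Cramer's rule."""
    (a, c), (b, d) = basis            # basis vectors are the columns (a, c) and (b, d)
    det = a * d - b * c
    if det == 0:
        raise ValueError("the two vectors do not form a basis of R^2")
    x = Fraction(v[0] * d - b * v[1], det)
    y = Fraction(a * v[1] - v[0] * c, det)
    return (x, y)

def change_of_basis(B, C):
    """Columns of P_{C<-B} are the C-coordinates of the B-basis vectors."""
    return [coords_2d(C, b) for b in B]

def apply_columns(cols, x):
    """Multiply the matrix with the given columns by the coordinate vector x."""
    return tuple(sum(xj * col[i] for xj, col in zip(x, cols)) for i in range(2))

B = [(1, 1), (1, -1)]                 # example basis B (assumption)
C = [(2, 0), (0, 1)]                  # example basis C (assumption)
P = change_of_basis(B, C)

v_B = (3, 1)                          # coordinates of v relative to B
v = apply_columns(B, v_B)             # the vector in standard components
v_C = apply_columns(P, v_B)           # coordinates relative to C, one multiplication
```

Computing the C-coordinates of `v` directly with `coords_2d(C, v)` gives the same result as multiplying by `P`, which is the content of the formula.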

Change of Basis Inverse

P_{\mathcal{B} \leftarrow \mathcal{C}} = \bigl(P_{\mathcal{C} \leftarrow \mathcal{B}}\bigr)^{-1}

Change of Basis

See details

explanationrelated formulasrelated definitions

Reversing the direction of a basis change inverts the matrix. The two change-of-basis matrices in opposite directions are inverses of each other, so once one is computed, the other is obtained by matrix inversion. Both matrices are invertible because basis vectors are linearly independent.

Coordinate Map Linearity

[\mathbf{u} + \mathbf{v}]_\mathcal{B} = [\mathbf{u}]_\mathcal{B} + [\mathbf{v}]_\mathcal{B}, \qquad [c\mathbf{v}]_\mathcal{B} = c\,[\mathbf{v}]_\mathcal{B}

Coordinates and Isomorphism

See details

explanationrelated formulasrelated definitions

The coordinate map

\mathbf{v} \mapsto [\mathbf{v}]_\mathcal{B}

preserves both vector operations. Together with bijectivity, this makes it an isomorphism

V \to \mathbb{F}^n

. Every

n

-dimensional vector space over

\mathbb{F}

is therefore structurally identical to

\mathbb{F}^n

— polynomials, matrices, and ODE solution spaces all reduce to coordinate computations once a basis is fixed.

Dimension & Rank

(8 formulas)

Dimension Definition

\dim(V) = |\mathcal{B}| \quad \text{for any basis } \mathcal{B} \text{ of } V

Definition

See details

explanationconditionsrelated formulasrelated definitions

The dimension of

V

is the number of vectors in any basis. The basis size theorem guarantees that all bases have the same cardinality, so the count is intrinsic to

V

, not an artifact of basis choice. By convention,

\dim\{\mathbf{0}\} = 0

(the empty set is a basis for the zero space).

Subspace Dimension Inequality

W \subseteq V \Rightarrow \dim(W) \leq \dim(V), \quad \text{with equality} \iff W = V

Dimension of Subspaces

See details

explanationconditionsrelated formulasrelated definitions

A subspace cannot exceed its parent space in dimension, and matches it only when the subspace is the whole space. Any basis of

W

is an independent set in

V

, hence has size at most

\dim V

. Equality means the basis of

W

already spans

V

, forcing

W = V

Dimension Sum Formula

\dim(W_1 + W_2) = \dim(W_1) + \dim(W_2) - \dim(W_1 \cap W_2)

The Dimension Formula for Subspace Sums

See details

explanationderivationrelated formulasrelated definitions

The dimension of a sum of subspaces is the sum of their dimensions minus the dimension of their intersection — the linear-algebra analogue of inclusion-exclusion for set sizes. Vectors in

W_1 \cap W_2

are counted once in each summand, so the intersection is subtracted to avoid double-counting.
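A small worked example, with two coordinate planes in three-dimensional space chosen purely for illustration:

```latex
\begin{aligned}
W_1 &= \operatorname{Span}(\mathbf{e}_1, \mathbf{e}_2), \qquad
W_2 = \operatorname{Span}(\mathbf{e}_2, \mathbf{e}_3) \\
W_1 \cap W_2 &= \operatorname{Span}(\mathbf{e}_2), \qquad
W_1 + W_2 = \mathbb{R}^3 \\
\dim(W_1 + W_2) &= 2 + 2 - 1 = 3
\end{aligned}
```

The shared axis is counted in both planes, and subtracting its dimension once gives the correct total.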

Direct Sum Criterion

V = W_1 \oplus W_2 \iff V = W_1 + W_2 \text{ and } W_1 \cap W_2 = \{\mathbf{0}\}

The Dimension Formula for Subspace Sums

See details

explanationvariantsrelated formulasrelated definitions

A sum of two subspaces is direct exactly when the intersection is trivial. The trivial-intersection condition is what guarantees uniqueness of the decomposition

\mathbf{v} = \mathbf{w}_1 + \mathbf{w}_2

: two distinct decompositions would differ by a nonzero vector lying in both

W_1

and

W_2

, contradicting

W_1 \cap W_2 = \{\mathbf{0}\}

Direct Sum Dimension

V = W_1 \oplus W_2 \Rightarrow \dim(V) = \dim(W_1) + \dim(W_2)

The Dimension Formula for Subspace Sums

See details

explanationderivationrelated formulasrelated definitions

For a direct sum, dimensions add cleanly. The dimension formula for sums reduces to plain addition because the intersection has dimension zero. Conversely, if

V = W_1 + W_2

and

\dim V = \dim W_1 + \dim W_2

, the sum is automatically direct.

Rank-Nullity Theorem (Matrix Form)

\dim(\text{Col}\,A) + \dim(\text{Null}\,A) = n

The Rank-Nullity Theorem as a Dimension Statement

See details

explanationnotationvariantsrelated formulasrelated definitions

For an

m \times n

matrix

A

, the dimensions of the column space and the null space sum to

n

— the dimension of the domain

\mathbb{R}^n

. The

n

degrees of freedom in the input split between what survives the map (column space) and what is annihilated (null space). No dimensions are created or destroyed.
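The count can be checked on a concrete matrix. The sketch below is one possible approach, not a reference implementation: it row-reduces with exact fractions, takes the rank as the number of pivot columns, and builds one null-space vector per free column. The matrix `A` and the function names are assumptions made for the example.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, pivot column indices), exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue                   # no pivot here: this is a free column
        m[r], m[p] = m[p], m[r]
        pv = m[r][c]
        m[r] = [x / pv for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots

def null_space_basis(rows):
    """One basis vector of Null(A) per free column, read off from the RREF."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for row_index, pc in enumerate(pivots):
            x[pc] = -R[row_index][free]
        basis.append(x)
    return basis

A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
_, pivot_cols = rref(A)
rank = len(pivot_cols)
nullity = len(null_space_basis(A))
```

Here the second row is twice the first, so one of the three input dimensions is annihilated and the remaining two survive.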

Four Fundamental Subspaces Dimensions

\begin{aligned} \dim(\text{Col}\,A) &= r & \dim(\text{Row}\,A) &= r \\ \dim(\text{Null}\,A) &= n - r & \dim(\text{Null}\,A^T) &= m - r \end{aligned}

Dimension Accounting

See details

explanationconditionsrelated formulasrelated definitions

For an

m \times n

matrix of rank

r

, all four fundamental subspaces have dimensions determined by

r

. The column space and row space share dimension

r

(despite living in different ambient spaces). The null space takes the remaining

n - r

dimensions of the domain, the left null space takes the remaining

m - r

dimensions of the codomain. Domain dimensions sum to

n

, codomain dimensions sum to

m

Row Rank Equals Column Rank

\dim(\text{Row}\,A) = \dim(\text{Col}\,A) = \text{rank}(A)

Overview

See details

explanationderivationrelated formulasrelated definitions

The row space and column space of any matrix have the same dimension, despite living in different ambient spaces (

\mathbb{R}^n

and

\mathbb{R}^m

respectively). This common value is the rank. The result is unexpected — the rows and columns of a matrix encode different information, yet the number of independent rows always equals the number of independent columns.

Transpose & Symmetry

(9 formulas)

Transpose Definition

(A^T)_{ij} = a_{ji}

The Transpose

See details

explanationrelated formulasrelated definitions

The transpose flips a matrix across its main diagonal, converting rows into columns. An

m \times n

matrix becomes

n \times m

, with the

(i,j)

entry of

A^T

taken from the

(j,i)

entry of

A

Transpose Involution

(A^T)^T = A

The Transpose

See details

explanationrelated formulasrelated definitions

Transposing twice returns the original matrix. The transpose is its own inverse operation.

Transpose of Sum

(A + B)^T = A^T + B^T

The Transpose

See details

explanationconditionsrelated formulasrelated definitions

Transposition distributes over matrix addition. Together with

(cA)^T = cA^T

, this makes transposition a linear operation on the space of matrices.

Transpose of Scalar Multiple

(cA)^T = c\, A^T

The Transpose

See details

explanationrelated formulasrelated definitions

Scalar multiplication commutes with transposition. The scalar passes through unchanged.

Transpose of Product

(AB)^T = B^T A^T

The Transpose

See details

explanationderivationvariantsrelated formulasrelated definitions

The transpose of a product is the product of the transposes in reversed order. The order reversal is essential — it accommodates the dimension matching that the product requires.

Symmetric Matrix Definition

A = A^T \iff a_{ij} = a_{ji} \text{ for all } i, j

Symmetric Matrices

See details

explanationconditionsvariantsrelated formulasrelated definitions

A symmetric matrix equals its own transpose. The matrix is mirrored across its main diagonal, fully determined by entries on and above the diagonal. Real symmetric matrices have only real eigenvalues and admit orthogonal diagonalization.

Skew-Symmetric Matrix Definition

A^T = -A \iff a_{ij} = -a_{ji}

Skew-Symmetric Matrices

See details

explanationconditionsrelated formulasrelated definitions

A skew-symmetric matrix negates under transposition. Setting

i = j

gives

a_{ii} = -a_{ii}

, which forces every diagonal entry to zero. Real skew-symmetric matrices have eigenvalues that are zero or purely imaginary.

Symmetric Skew Decomposition

A = \tfrac{1}{2}(A + A^T) + \tfrac{1}{2}(A - A^T)

Skew-Symmetric Matrices

See details

explanationconditionsderivationrelated formulasrelated definitions

Every square matrix splits uniquely into a symmetric part

\tfrac{1}{2}(A + A^T)

and a skew-symmetric part

\tfrac{1}{2}(A - A^T)

. The decomposition mirrors how every function of two variables splits into symmetric and antisymmetric components.
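A brief sketch of the split on a concrete matrix, using exact fractions for the factor of one half. The matrix `A` and the helper names are illustrative assumptions.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def symmetric_skew_parts(A):
    """Split a square matrix into S = (A + A^T)/2 and K = (A - A^T)/2."""
    At = transpose(A)
    n = len(A)
    half = Fraction(1, 2)
    S = [[half * (A[i][j] + At[i][j]) for j in range(n)] for i in range(n)]
    K = [[half * (A[i][j] - At[i][j]) for j in range(n)] for i in range(n)]
    return S, K

A = [[1, 2],
     [4, 3]]
S, K = symmetric_skew_parts(A)
# S is symmetric, K is skew-symmetric, and S + K reproduces A entry by entry.
```

The off-diagonal entries 2 and 4 average to 3 in the symmetric part, while half of their difference appears with opposite signs in the skew-symmetric part.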

Gram Matrix Symmetry

(A^T A)^T = A^T A, \qquad (A A^T)^T = A A^T

The Transpose

See details

explanationderivationrelated formulasrelated definitions

For any matrix

A

of any shape, the products

A^T A

and

A A^T

are symmetric. These Gram matrices appear in least squares, the singular value decomposition, and inner-product computations.

Special Matrix Types

(14 formulas)

Identity Matrix Definition

I_n = [\delta_{ij}], \qquad \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}

The Identity Matrix

See details

explanationnotationrelated formulasrelated definitions

The identity matrix has ones on the main diagonal and zeros elsewhere. The Kronecker delta

\delta_{ij}

is shorthand for this pattern. The subscript

n

is dropped when size is clear from context.

Identity Matrix Property

AI = IA = A

The Identity Matrix

See details

explanationconditionsrelated formulasrelated definitions

The identity matrix is the multiplicative identity for matrix multiplication. Multiplying any matrix by

I

on either side returns the original matrix unchanged, provided dimensions are compatible.

Diagonal Matrix Definition

D = \operatorname{diag}(d_1, d_2, \ldots, d_n) \iff d_{ij} = 0 \text{ for } i \neq j

Diagonal Matrices

See details

explanationconditionsvariantsrelated formulasrelated definitions

A diagonal matrix has zeros everywhere off the main diagonal; the diagonal entries themselves may be zero or nonzero. Diagonal matrices are the easiest to compute with: products, powers, and inverses all reduce to operations on the diagonal entries alone.

Diagonal Matrix Power

D^k = \operatorname{diag}(d_1^k, d_2^k, \ldots, d_n^k)

Diagonal Matrices

See details

explanationrelated formulasrelated definitions

Each diagonal entry is raised to the

k

-th power independently. This trivial behavior is the principal reason diagonalization is so useful: writing

A = PDP^{-1}

converts an expensive matrix power into a cheap diagonal power,

A^k = PD^kP^{-1}
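The shortcut can be compared against repeated multiplication on a small case. In this sketch the matrix `A`, its eigenvector matrix `P`, the eigenvalues `d`, and the helper names are all assumptions chosen so the arithmetic stays exact.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def power_by_repeated_multiplication(A, k):
    result = [[int(i == j) for j in range(len(A))] for i in range(len(A))]
    for _ in range(k):
        result = matmul(result, A)
    return result

def power_by_diagonalization(P, d, P_inv, k):
    """Compute P D^k P^{-1}, where D = diag(d) and only the entries of d are powered."""
    Dk = [[d[i] ** k if i == j else 0 for j in range(len(d))] for i in range(len(d))]
    return matmul(matmul(P, Dk), P_inv)

A = [[2, 1],
     [1, 2]]                      # eigenvalues 3 and 1
P = [[1, 1],
     [1, -1]]                     # columns are eigenvectors of A
d = [3, 1]
half = Fraction(1, 2)
P_inv = [[half, half],
         [half, -half]]

A5_direct = power_by_repeated_multiplication(A, 5)
A5_diag = power_by_diagonalization(P, d, P_inv, 5)
```

Both routes give the same matrix; the diagonal route needs only two scalar powers and two matrix products regardless of the exponent.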

Diagonal Matrix Determinant

\det\bigl(\operatorname{diag}(d_1, \ldots, d_n)\bigr) = d_1 d_2 \cdots d_n

Diagonal Matrices

See details

explanationrelated formulasrelated definitions

The determinant of a diagonal matrix is the product of its diagonal entries. The matrix is invertible precisely when every diagonal entry is nonzero.

Triangular Matrix Determinant

\det(T) = t_{11} \, t_{22} \cdots t_{nn}

Triangular Matrices

See details

explanationconditionsrelated formulasrelated definitions

For an upper or lower triangular matrix, the determinant is the product of its diagonal entries. The eigenvalues are the diagonal entries themselves, so both can be read off directly without further computation.

Orthogonal Matrix Definition

Q^T Q = Q Q^T = I

Orthogonal Matrices

See details

explanationconditionsvariantsrelated formulasrelated definitions

An orthogonal matrix has its transpose as its inverse. Equivalently, the columns form an orthonormal set, and so do the rows. Orthogonal matrices preserve lengths and angles — they are the linear isometries.

Orthogonal Matrix Determinant

\det(Q) = \pm 1

Orthogonal Matrices

See details

explanationderivationrelated formulasrelated definitions

Every orthogonal matrix has determinant

+1

or

-1

. The value

+1

corresponds to a rotation and

-1

to an orientation-reversing transformation involving a reflection.

Idempotent Matrix Definition

A^2 = A

Nilpotent and Idempotent Matrices

See details

explanationconditionsrelated formulasrelated definitions

An idempotent matrix is unchanged by squaring — applying it twice equals applying it once. Idempotent matrices are precisely the projections: they project

\mathbb{R}^n

onto their column space along their null space. The eigenvalues are restricted to

0

and

1

Idempotent Rank Trace

A^2 = A \implies \operatorname{rank}(A) = \operatorname{tr}(A)

Nilpotent and Idempotent Matrices

See details

explanationderivationrelated formulasrelated definitions

For an idempotent matrix, the rank equals the trace. Both quantities count the number of eigenvalues equal to

1

— the trace by the eigenvalue-sum identity, the rank as the dimension of the image.
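A quick check on the simplest nontrivial projection (an example chosen here, projecting the plane onto the line spanned by the vector with both components equal to 1):

```latex
P = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
P^2 = \frac{1}{4}\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} = P, \qquad
\operatorname{tr}(P) = \tfrac{1}{2} + \tfrac{1}{2} = 1 = \operatorname{rank}(P)
```

The two rows are equal, so the rank is one, matching the trace.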

Nilpotent Matrix Definition

A^k = O \quad \text{for some } k \geq 1

Nilpotent and Idempotent Matrices

See details

explanationconditionsrelated formulasrelated definitions

A nilpotent matrix becomes the zero matrix at some power. The smallest such

k

is the index of nilpotency. Every eigenvalue of a nilpotent matrix is zero, forcing both the determinant and the trace to vanish.

Neumann Series Nilpotent

A^k = O \implies (I - A)^{-1} = I + A + A^2 + \cdots + A^{k-1}

Nilpotent and Idempotent Matrices

See details

explanationderivationrelated formulasrelated definitions

When

A

is nilpotent of index

k

, the matrix

I - A

is invertible, with inverse equal to a finite geometric series. The series stops at the power

A^{k-1}

because

A^k

and every higher power are zero.
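The finite series can be verified numerically. The sketch below assumes a strictly upper triangular example matrix, which is nilpotent with index 3; the helper names are hypothetical.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def neumann_inverse(A, k):
    """Return I + A + ... + A^(k-1); this equals (I - A)^(-1) when A^k is zero."""
    n = len(A)
    total = identity(n)
    power = identity(n)
    for _ in range(k - 1):
        power = matmul(power, A)
        total = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(total, power)]
    return total

A = [[0, 1, 2],
     [0, 0, 3],
     [0, 0, 0]]                   # strictly upper triangular, so A^3 = O
I_minus_A = [[int(i == j) - A[i][j] for j in range(3)] for i in range(3)]
inv = neumann_inverse(A, 3)
```

Multiplying `I_minus_A` by `inv` on either side returns the identity, with no division or row reduction involved.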

Involutory Matrix Definition

A^2 = I

Involutory and Permutation Matrices

See details

explanationconditionsvariantsrelated formulasrelated definitions

An involutory matrix is its own inverse. Applying it twice returns every vector to its starting point. The eigenvalues are restricted to

\pm 1

. Reflections are the prototypical examples: reflecting twice across the same line or plane returns the identity.

Cross Product Skew Matrix

\mathbf{a} \times \mathbf{b} = [\mathbf{a}]_\times \mathbf{b}, \qquad [\mathbf{a}]_\times = \begin{pmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{pmatrix}

Skew-Symmetric Matrices

See details

explanationconditionsrelated formulasrelated definitions

The cross product in

\mathbb{R}^3

can be expressed as a matrix-vector multiplication. The

3 \times 3

skew-symmetric matrix

[\mathbf{a}]_\times

, built from the components of

\mathbf{a}

, acts on

\mathbf{b}

to produce the cross product. This reformulation lets cross-product computations participate in the algebra of matrix products.
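A minimal check that the matrix form agrees with the component formula for the cross product. The vectors and function names are illustrative assumptions.

```python
def skew(a):
    """Build the 3x3 skew-symmetric matrix [a]_x from the components of a."""
    a1, a2, a3 = a
    return [[0, -a3, a2],
            [a3, 0, -a1],
            [-a2, a1, 0]]

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def cross(a, b):
    """Component formula for the cross product in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

a = [1, 2, 3]
b = [4, 5, 6]
via_matrix = matvec(skew(a), b)
via_formula = cross(a, b)
```

Applying `skew(a)` to `a` itself gives the zero vector, which reflects the fact that a vector crossed with itself vanishes.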

Inverse

(15 formulas)

Inverse Definition

A A^{-1} = A^{-1} A = I

Definition of the Inverse

See details

explanationconditionsderivationrelated formulasrelated definitions

The inverse of a square matrix

A

— when it exists — is the unique matrix

A^{-1}

that produces the identity from both sides. A matrix possessing an inverse is called invertible (or nonsingular); a matrix without one is singular. Uniqueness follows from a short associativity argument.

Inverse 2x2 Formula

\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}

The 2×2 Inverse Formula

See details

explanationconditionsrelated formulasrelated definitions

For a

2 \times 2

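matrix with

ad - bc \neq 0

, the inverse is obtained by swapping the diagonal entries, negating the off-diagonal entries, and dividing by the determinant. If the determinant is zero, the matrix is singular and the formula does not apply. This is the smallest nontrivial case, and the closed form is short enough to apply by hand.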
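The formula translates directly into code. This sketch assumes integer entries and uses exact fractions; the function name `inverse_2x2` is a hypothetical choice.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Invert [[a, b], [c, d]] by swapping a and d, negating b and c, dividing by det."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular: ad - bc = 0")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

M = [[4, 7],
     [2, 6]]                      # determinant 4*6 - 7*2 = 10
M_inv = inverse_2x2(M)
```

The explicit check on the determinant mirrors the condition under which the formula is valid.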

Inverse via Adjugate

A^{-1} = \frac{1}{\det(A)}\, \operatorname{adj}(A)

Computing the Inverse via the Adjugate

See details

explanationconditionsnotationrelated formulasrelated definitions

When

A

is invertible, the inverse equals the adjugate divided by the determinant. The adjugate is the transpose of the cofactor matrix, so each entry of

A^{-1}

is a signed minor of

A

divided by

\det(A)

. The formula is exact and fully symbolic, but expensive — for numerical work, row reduction is far cheaper.

Inverse via Row Reduction

[A \mid I] \;\xrightarrow{\text{row ops}}\; [I \mid A^{-1}]

Computing the Inverse by Row Reduction

See details

explanationconditionsderivationrelated formulasrelated definitions

Form the augmented matrix

[A \mid I]

and apply row operations until the left half becomes the identity. The right half then holds

A^{-1}

. If the left half develops a row of zeros at any stage,

A

is singular and no inverse exists.
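The procedure can be sketched as a short Gauss-Jordan routine. This is an illustrative version with exact fractions and no attention to numerical pivoting strategy; the function name and the example matrix are assumptions.

```python
from fractions import Fraction

def inverse_by_row_reduction(A):
    """Row-reduce [A | I] to [I | A^{-1}]; raise ValueError if A is singular."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        p = next((i for i in range(c, n) if aug[i][c] != 0), None)
        if p is None:
            raise ValueError("matrix is singular")
        aug[c], aug[p] = aug[p], aug[c]              # row swap
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]            # scale the pivot row to get a 1
        for i in range(n):
            if i != c:
                f = aug[i][c]
                aug[i] = [a - f * b for a, b in zip(aug[i], aug[c])]  # clear column c
    return [row[n:] for row in aug]                  # right half holds the inverse

A = [[2, 1],
     [5, 3]]
A_inv = inverse_by_row_reduction(A)
```

A missing pivot in some column is what a zero row on the left half looks like in code, and it is the point at which the routine reports a singular matrix.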

Inverse Involution

(A^{-1})^{-1} = A

Properties of the Inverse

See details

explanationconditionsrelated formulasrelated definitions

Inverting twice returns the original matrix. Inversion is its own inverse operation, mirroring the analogous property of transposition.

Inverse of Product

(AB)^{-1} = B^{-1} A^{-1}

Properties of the Inverse

See details

explanationconditionsderivationvariantsrelated formulasrelated definitions

The inverse of a product is the product of the inverses in reversed order. The reversal mirrors the rule for transpose of a product: to undo "first

B

, then

A

," undo

A

first, then

B

Inverse of Transpose

(A^T)^{-1} = (A^{-1})^T

Properties of the Inverse

See details

explanationconditionsderivationrelated formulasrelated definitions

Transposing and inverting commute. The two operations can be applied in either order with the same result. A useful corollary: the inverse of a symmetric invertible matrix is symmetric.

Inverse of Scalar Multiple

(cA)^{-1} = \frac{1}{c}\, A^{-1}

Properties of the Inverse

See details

explanationconditionsrelated formulasrelated definitions

Scaling a matrix by a nonzero scalar scales its inverse by the reciprocal. The scalar passes through inversion in the natural way.

Inverse of Power

(A^k)^{-1} = (A^{-1})^k = A^{-k}

Properties of the Inverse

See details

explanationconditionsrelated formulasrelated definitions

The inverse of a power is the power of the inverse, and both equal the corresponding negative power. With this identity, the exponent laws

A^j A^k = A^{j+k}

and

(A^j)^k = A^{jk}

extend to all integers.

Determinant of Inverse

\det(A^{-1}) = \frac{1}{\det(A)}

Properties of the Inverse

See details

explanationconditionsderivationrelated formulasrelated definitions

The determinant of the inverse is the reciprocal of the determinant. The identity provides a quick consistency check on inverse computations.

Diagonal Matrix Inverse

\operatorname{diag}(d_1, \ldots, d_n)^{-1} = \operatorname{diag}\!\left(\frac{1}{d_1}, \ldots, \frac{1}{d_n}\right)

Inverses of Special Matrix Types

See details

explanationconditionsrelated formulasrelated definitions

The inverse of an invertible diagonal matrix is obtained by reciprocating each diagonal entry. The matrix is invertible if and only if every diagonal entry is nonzero.

Orthogonal Matrix Inverse

Q^{-1} = Q^T

Inverses of Special Matrix Types

See details

explanationconditionsrelated formulasrelated definitions

For an orthogonal matrix, the inverse equals the transpose. This is the cheapest matrix inverse to compute: no arithmetic is required, only a re-indexing of entries.

Solve System via Inverse

A\mathbf{x} = \mathbf{b} \implies \mathbf{x} = A^{-1}\mathbf{b}

Solving Systems with the Inverse

See details

explanationconditionsderivationrelated formulasrelated definitions

When the coefficient matrix is invertible, the linear system has the unique solution

A^{-1}\mathbf{b}

. The formula is the matrix analogue of dividing both sides by

A

. Computationally, however, row reduction or LU factorization is preferred — computing

A^{-1}

explicitly is roughly three times more expensive and less numerically stable.

Invertible Matrix Theorem

$$\begin{aligned}
&\text{For } A \in \mathbb{R}^{n \times n}, \text{ the following are equivalent:} \\
&(1)\ A \text{ is invertible} \quad (2)\ \det(A) \neq 0 \quad (3)\ \operatorname{rank}(A) = n \\
&(4)\ \text{columns of } A \text{ are linearly independent} \\
&(5)\ \text{rows of } A \text{ are linearly independent} \\
&(6)\ \text{columns of } A \text{ span } \mathbb{R}^n \quad (7)\ \text{columns form a basis of } \mathbb{R}^n \\
&(8)\ A\mathbf{x} = \mathbf{0} \text{ has only the trivial solution} \\
&(9)\ A\mathbf{x} = \mathbf{b} \text{ has a unique solution for every } \mathbf{b} \\
&(10)\ \operatorname{Null}(A) = \{\mathbf{0}\} \\
&(11)\ \operatorname{rref}(A) = I \quad (12)\ A \text{ is a product of elementary matrices} \\
&(13)\ 0 \text{ is not an eigenvalue of } A
\end{aligned}$$

When Does the Inverse Exist?

See details

explanationconditionsrelated formulasrelated definitions

The Invertible Matrix Theorem collects equivalent characterizations of invertibility, each approaching it from a different angle — algebraic, geometric, computational, spectral. Proving any one condition automatically establishes all the others. Checking the determinant is often the fastest hand test; row reduction is preferred for large-scale computation.

Singular Matrix Definition

A \text{ singular} \iff \det(A) = 0 \iff \operatorname{rank}(A) < n

Singular and Nonsingular Matrices

See details

explanationconditionsrelated formulasrelated definitions

A singular matrix has determinant zero, equivalently rank less than

n

, equivalently no inverse. Its columns are linearly dependent, and as a transformation it collapses at least one dimension — its image is a proper subspace of

\mathbb{R}^n

. Singularity is the negation of invertibility.

Rank

(8 formulas)

Rank Bounds

0 \leq \operatorname{rank}(A) \leq \min(m, n)

What Rank Measures

See details

explanationconditionsvariantsrelated formulasrelated definitions

The rank is bounded above by the smaller dimension of the matrix. When equality holds, the matrix has full rank: the rows are linearly independent if

m \leq n

, and the columns are linearly independent if

n \leq m

. Only a square full-rank matrix has both properties. The lower bound zero is achieved only by the zero matrix.

Rank of Transpose

\operatorname{rank}(A^T) = \operatorname{rank}(A)

Properties of Rank

See details

explanationrelated formulasrelated definitions

Transposition preserves rank. This is a restatement of the deeper theorem that the column rank and row rank of any matrix are equal — transposition swaps the roles of rows and columns but leaves the common value unchanged.

Rank Product Inequality

\operatorname{rank}(AB) \leq \min\bigl(\operatorname{rank}(A), \operatorname{rank}(B)\bigr)

Properties of Rank

See details

explanationconditionsrelated formulasrelated definitions

Multiplying two matrices cannot create independent directions that were not already present in both factors. The rank of a product is bounded by the smaller of the two factor ranks.

Sylvester Rank Inequality

\operatorname{rank}(A) + \operatorname{rank}(B) - n \leq \operatorname{rank}(AB)

Properties of Rank

See details

explanationconditionsrelated formulasrelated definitions

Sylvester's inequality bounds the rank of a product from below. The rank can fall below the sum of the factor ranks by at most

n

, the inner dimension. Combined with the upper bound, this constrains

\operatorname{rank}(AB)

from both sides.

Rank Sum Inequality

\operatorname{rank}(A + B) \leq \operatorname{rank}(A) + \operatorname{rank}(B)

Properties of Rank

See details

explanationconditionsrelated formulasrelated definitions

The rank of a sum is bounded by the sum of the ranks. Equality holds exactly when the column spaces of

A

and

B

intersect only at the origin and their row spaces do as well.

Rank Invariance Invertible

\operatorname{rank}(PAQ) = \operatorname{rank}(A)

Properties of Rank

See details

explanationconditionsrelated formulasrelated definitions

Multiplying

A

by invertible matrices on either side preserves rank exactly. Invertible matrices neither collapse dimensions nor create new ones — they reshape the row and column spaces without altering their dimension.

Gram Rank Identity

\operatorname{rank}(A^T A) = \operatorname{rank}(A A^T) = \operatorname{rank}(A)

Rank of Special Matrices

See details

explanationderivationrelated formulasrelated definitions

The Gram matrices

A^T A

and

A A^T

have the same rank as

A

. This identity underlies least squares: even when

A

is rectangular,

A^T A

has the same rank, and is invertible precisely when

A

has full column rank.

Rank-One Outer Product

A = \mathbf{u}\mathbf{v}^T \implies \operatorname{rank}(A) = 1 \quad (\mathbf{u}, \mathbf{v} \neq \mathbf{0})

Rank of Special Matrices

See details

explanationconditionsrelated formulasrelated definitions

Every rank-one matrix is an outer product

\mathbf{u}\mathbf{v}^T

of two nonzero vectors. Each column of

A

is a scalar multiple of

\mathbf{u}

, so the column space is the one-dimensional line through

\mathbf{u}

. The rows are scalar multiples of

\mathbf{v}^T

. Rank-one matrices are the building blocks of the outer-product expansion of matrix multiplication and of low-rank approximation.

Trace

(11 formulas)

Trace Definition

\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii} = a_{11} + a_{22} + \cdots + a_{nn}

Definition

See details

explanationconditionsrelated formulasrelated definitions

The trace of a square matrix is the sum of its diagonal entries. Off-diagonal entries play no role. Defined only for square matrices, the trace encodes spectral information that is not obvious from its simple definition: it equals the sum of eigenvalues and is invariant under similarity.

Trace Linearity

\operatorname{tr}(cA + dB) = c\operatorname{tr}(A) + d\operatorname{tr}(B)

Linearity

See details

explanationconditionsrelated formulasrelated definitions

The trace is a linear function from the space of

n \times n

matrices to the scalars. Both additivity (

\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)

) and scalar homogeneity (

\operatorname{tr}(cA) = c\operatorname{tr}(A)

) follow immediately from summing diagonal entries.

Trace of Transpose

\operatorname{tr}(A^T) = \operatorname{tr}(A)

Linearity

See details

explanationrelated formulasrelated definitions

Transposition leaves the diagonal entries fixed, so the trace is unaffected.

Trace Cyclic Property

\operatorname{tr}(AB) = \operatorname{tr}(BA)

The Cyclic Property

See details

explanationderivationvariantsrelated formulasrelated definitions

The trace is invariant under cyclic permutations of a product. Notably,

AB

and

BA

need not have the same dimensions — if

A

is

m \times n

and

B

is

n \times m

, the products are

m \times m

and

n \times n

respectively, yet share the same trace.
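A small check with rectangular factors, where the two products have different sizes. The matrices and helper names are assumptions made for the example.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2, 3],
     [4, 5, 6]]                   # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]                    # 3 x 2

AB = matmul(A, B)                 # 2 x 2
BA = matmul(B, A)                 # 3 x 3
```

The two products share no entries in any obvious way, yet their diagonal sums agree.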

Trace Sum of Eigenvalues

\operatorname{tr}(A) = \sum_{i=1}^{n} \lambda_i

Trace and Eigenvalues

See details

explanationconditionsderivationrelated formulasrelated definitions

The trace equals the sum of the eigenvalues, counted with algebraic multiplicity. This identity links a trivially computable quantity to spectral information that ordinarily requires solving a degree-

n

polynomial. The companion identity

\det(A) = \prod \lambda_i

relates the determinant to the product of eigenvalues.

Trace Similarity Invariance

\operatorname{tr}(P^{-1}AP) = \operatorname{tr}(A)

Trace and Similarity

See details

explanationconditionsderivationrelated formulasrelated definitions

Similar matrices have equal trace. The trace is a property of the linear transformation itself, not of any particular matrix representation — the value is independent of the chosen basis.

Trace of Commutator

\operatorname{tr}(AB - BA) = 0

Trace of Commutators

See details

explanationderivationrelated formulasrelated definitions

The commutator

[A, B] = AB - BA

always has trace zero, regardless of the matrices involved. A consequence: the identity matrix

I

can never be written as a commutator over

\mathbb{R}

or

\mathbb{C}

, since

\operatorname{tr}(I) = n \neq 0

Trace Symmetric Skew Product

\operatorname{tr}(SK) = 0 \quad (S^T = S, \; K^T = -K)

Trace Identities

See details

explanationderivationrelated formulasrelated definitions

The trace of a product of a symmetric matrix and a skew-symmetric matrix vanishes. The identity reflects an orthogonality between symmetric and skew-symmetric subspaces under the Frobenius inner product.

Trace Orthonormal Basis

\operatorname{tr}(A) = \sum_{i=1}^{n} \mathbf{q}_i^T A \mathbf{q}_i

Trace Identities

See details

explanationconditionsrelated formulasrelated definitions

For any orthonormal basis

\{\mathbf{q}_1, \ldots, \mathbf{q}_n\}

of

\mathbb{R}^n

, the trace can be computed as the sum of quadratic forms

\mathbf{q}_i^T A \mathbf{q}_i

. The result is independent of which orthonormal basis is used — another manifestation of similarity invariance, since change of orthonormal basis is an orthogonal similarity.

Frobenius Inner Product

\langle A, B \rangle_F = \operatorname{tr}(A^T B) = \sum_{i,j} a_{ij} b_{ij}

The Frobenius Inner Product

See details

explanationconditionsrelated formulasrelated definitions

The trace defines an inner product on the space of matrices. It is the dot product of

A

and

B

viewed as vectors of

n^2

entries. It is symmetric, bilinear, and positive definite — bringing geometric concepts (angles, orthogonality, projections) to bear on matrices themselves.

Frobenius Norm

\|A\|_F = \sqrt{\operatorname{tr}(A^T A)} = \sqrt{\sum_{i,j} a_{ij}^2}

The Frobenius Inner Product

See details

explanationrelated formulasrelated definitions

The Frobenius norm measures the total size of a matrix as the square root of the sum of squares of all entries — the matrix analogue of Euclidean length. Induced by the Frobenius inner product, it is one of several common matrix norms (alongside the operator norm and nuclear norm) and is the simplest to compute.
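Both descriptions of the inner product, and the norm it induces, can be compared on a small case. The matrices and function names below are illustrative assumptions.

```python
import math

def frobenius_inner(A, B):
    """Sum of entrywise products a_ij * b_ij."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def trace_of_At_B(A, B):
    """Form A^T B explicitly, then sum its diagonal."""
    At = [list(col) for col in zip(*A)]
    product = [[sum(At[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
               for i in range(len(At))]
    return sum(product[i][i] for i in range(len(product)))

def frobenius_norm(A):
    return math.sqrt(frobenius_inner(A, A))

A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
entrywise = frobenius_inner(A, B)
via_trace = trace_of_At_B(A, B)
norm_A = frobenius_norm(A)        # square root of 1 + 4 + 9 + 16
```

The entrywise sum is the cheaper route in practice, since it avoids forming the matrix product at all.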

Definitions

(4 formulas)

Determinant 2x2

\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc

The 2×2 Formula

See details

explanationconditionsvariantsrelated formulasrelated definitions

For a

2 \times 2

matrix, the determinant is the product of the main diagonal minus the product of the anti-diagonal. The matrix is invertible exactly when this number is nonzero.

Determinant 3x3

\det(A) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})

The 3×3 Formula

The expansion along the first row: each entry

a_{1j}

multiplies the

2 \times 2

determinant of the submatrix obtained by deleting row

1

and column

j

. The signs alternate

+, -, +

Determinant Recursive Definition

\det(A) = \begin{cases} a_{11} & n = 1 \\ \displaystyle\sum_{j=1}^{n} (-1)^{1+j} \, a_{1j} \, M_{1j} & n \geq 2 \end{cases}

The General n×n Determinant

The recursive definition: an

n \times n

determinant reduces to

n

determinants of size

(n-1) \times (n-1)

, each of which reduces further until reaching

1 \times 1

matrices. The expansion above uses row

1

, but any row or column gives the same value.
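
The recursion can be sketched directly in code. This is a minimal illustration assuming matrices stored as lists of rows; the names `minor_matrix` and `det_recursive` are hypothetical, and the approach is only sensible for small matrices because the work grows factorially.

```python
def minor_matrix(A, i, j):
    """Submatrix of A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det_recursive(A):
    n = len(A)
    if n == 1:
        return A[0][0]  # base case: 1x1 matrix
    # Expansion along the first row. With 0-based j, (-1)**j equals
    # the sign (-1)**(1 + j) of the 1-based formula.
    return sum(
        (-1) ** j * A[0][j] * det_recursive(minor_matrix(A, 0, j))
        for j in range(n)
    )
```

For instance, `det_recursive([[1, 2], [3, 4]])` evaluates `1*4 - 2*3`.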

Determinant Permutation Formula

\det(A) = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, a_{\sigma(1),1} \, a_{\sigma(2),2} \cdots a_{\sigma(n),n}

Transpose Invariance

The closed-form non-recursive definition: sum over all

n!

permutations of

\{1, 2, \ldots, n\}

, adding for each permutation the product of the

n

entries it selects (one from each row and one from each column), multiplied by the permutation's sign.
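
A minimal sketch of the permutation sum, assuming a list-of-rows matrix and computing the sign by counting inversions (one of several equivalent ways to obtain it). The function names are illustrative.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation (tuple of 0-based images), via its inversion count."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def det_leibniz(A):
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for col in range(n):
            term *= A[sigma[col]][col]  # entry a_{sigma(col), col}
        total += term
    return total
```

The loop visits all `n!` permutations, so this form is mainly of theoretical interest beyond very small `n`.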

Cofactor Structure

(4 formulas)

Minor Definition

M_{ij} = \det\!\left(A^{(i,j)}\right)

Minors

The

(i,j)

minor of

A

is the determinant of the

(n-1) \times (n-1)

submatrix obtained by deleting row

i

and column

j

. The minor is itself a determinant — a scalar, not a matrix.

Cofactor Definition

C_{ij} = (-1)^{i+j} \, M_{ij}

Cofactors and the Sign Pattern

The

(i,j)

cofactor is the signed minor. The factor

(-1)^{i+j}

produces a checkerboard pattern starting with

+

at position

(1,1)

and alternating from there. The position alone determines the sign — the entries of

A

play no role.

Cofactor Matrix Definition

\operatorname{cof}(A) = \bigl[C_{ij}\bigr]_{n \times n}

The Cofactor Matrix

The cofactor matrix has entry

C_{ij}

at position

(i,j)

. It is not the matrix of minors — the alternating sign factors are already incorporated. Each row of

\operatorname{cof}(A)

contains the cofactors needed for Laplace expansion along the corresponding row of

A

, and each column contains those needed for column expansion.

Adjugate Definition

\operatorname{adj}(A) = \operatorname{cof}(A)^T

The Adjugate

The adjugate (also called the classical adjoint) is the transpose of the cofactor matrix. Equivalently,

[\operatorname{adj}(A)]_{ij} = C_{ji}

— the

(i,j)

entry of the adjugate is the

(j,i)

cofactor of

A

Cofactor Expansion

(2 formulas)

Laplace Row Expansion

\det(A) = \sum_{j=1}^{n} a_{ij} \, C_{ij} \qquad \text{for any fixed row } i

Laplace Expansion Along a Row

The determinant equals the sum of each entry in row

i

multiplied by its cofactor, regardless of which row is chosen. The freedom to pick any row makes the formula practical: a row with many zeros eliminates entire sub-determinants from the sum.

Laplace Column Expansion

\det(A) = \sum_{i=1}^{n} a_{ij} \, C_{ij} \qquad \text{for any fixed column } j

Laplace Expansion Along a Column

The determinant equals the sum of each entry in column

j

multiplied by its cofactor, for any choice of column. Column expansion gives the same result as row expansion — a consequence of

\det(A^T) = \det(A)

, which lets every column expansion be reinterpreted as a row expansion on the transpose.
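
The claim that every row and every column gives the same value can be checked numerically. The sketch below assumes a list-of-rows matrix of size at least 2; all function names are illustrative.

```python
def minor_matrix(A, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    if len(A) == 1:
        return A[0][0]
    return expand_along_row(A, 0)

def cofactor(A, i, j):
    # With 0-based i, j the sign (-1)**(i + j) matches the 1-based checkerboard.
    return (-1) ** (i + j) * det(minor_matrix(A, i, j))

def expand_along_row(A, i):
    return sum(A[i][j] * cofactor(A, i, j) for j in range(len(A)))

def expand_along_column(A, j):
    return sum(A[i][j] * cofactor(A, i, j) for i in range(len(A)))

# Illustrative matrix (hypothetical data).
A = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
values = [expand_along_row(A, i) for i in range(3)]
values += [expand_along_column(A, j) for j in range(3)]
```

All six entries of `values` coincide; in hand computation one would pick the row or column with the most zeros (here the first row or second column).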

Row Operation Effects

(3 formulas)

Determinant Row Swap

\det(B) = -\det(A)

Effect of Row Swaps

If

B

is obtained from

A

by swapping two rows, the determinant changes sign. The same rule holds for column swaps, by transpose invariance.

Determinant Row Scaling

\det(B) = k \, \det(A)

Effect of Row Scaling

If

B

is obtained from

A

by multiplying a single row by a scalar

k

, the determinant is multiplied by

k

. The same rule applies to columns. A common factor in any one row can be pulled out in front of the determinant — useful for hand simplification before further computation.

Determinant Row Addition

\det(B) = \det(A)

Effect of Row Addition

If

B

is obtained from

A

by adding a scalar multiple of one row to a different row, the determinant is unchanged. This operation does the heavy lifting in Gaussian elimination and costs nothing in determinant terms, which makes row reduction the practical method for computing determinants of large matrices.
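
The three rules combine into a practical algorithm: reduce to triangular form using only swaps and row additions, track the sign, and multiply the diagonal. The following is a minimal sketch using exact rational arithmetic; the function name `det_by_elimination` is an assumption made for illustration.

```python
from fractions import Fraction

def det_by_elimination(A):
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot_row = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot_row is None:
            return Fraction(0)  # no pivot in this column: determinant is zero
        if pivot_row != c:
            M[c], M[pivot_row] = M[pivot_row], M[c]
            sign = -sign  # a row swap flips the sign
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            # Row addition: leaves the determinant unchanged.
            M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]  # triangular matrix: product of the diagonal
    return result
```

Row scaling is deliberately never used here, so no compensating factor of `k` needs to be tracked.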

Algebraic Properties

(5 formulas)

Determinant of Transpose

\det(A^T) = \det(A)

Transpose Invariance

Transposing a matrix does not change its determinant. The practical consequence is that every row-based property of the determinant has a column-based counterpart: column swap flips sign, column scaling scales the determinant, column expansion equals row expansion.

Determinant of Product

\det(AB) = \det(A) \, \det(B)

The Multiplicative Property

The determinant of a product equals the product of determinants — one of the most powerful structural facts about determinants. Geometrically, composing linear maps multiplies their volume-scaling factors; algebraically, this identity unlocks corollaries for inverses, powers, and similar matrices.

Determinant of Scalar Multiple

\det(kA) = k^n \, \det(A)

Effect of Row Scaling

Scaling the entire matrix by

k

scales the determinant by

k^n

, not by

k

. The factor passes through each of the

n

rows independently — a common error is to forget the exponent. Distinct from row scaling, which scales only one row and contributes a single factor of

k
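
A small worked instance (the matrix is chosen here purely for illustration) shows the exponent in action for a 2 × 2 matrix scaled by 3:

```latex
A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad \det(A) = -2, \qquad
\det(3A) = \det\begin{pmatrix} 3 & 6 \\ 9 & 12 \end{pmatrix} = 36 - 54 = -18 = 3^2 \cdot (-2)
```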

Determinant of Power

\det(A^k) = \bigl(\det(A)\bigr)^k

The Multiplicative Property

The determinant of a matrix power is the corresponding power of the determinant. This is a direct corollary of the multiplicative property, applied repeatedly to the

k

factors of the product.

Determinant of Identity

\det(I_n) = 1

Triangular and Diagonal Matrices

The identity matrix has determinant

1

. Follows directly from the diagonal-product rule for triangular matrices:

I_n

is diagonal with every diagonal entry equal to

1

Special Determinants

(2 formulas)

Block Triangular Determinant

\det\begin{pmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{pmatrix} = \det(A_{11}) \, \det(A_{22})

Block Triangular Matrices

For a block upper triangular matrix with square diagonal blocks, the determinant factors as the product of the diagonal-block determinants. The off-diagonal block

A_{12}

contributes nothing — only the triangular placement of the zero block matters.

Vandermonde Determinant

\det(V) = \prod_{1 \leq i < j \leq n} (x_j - x_i)

Vandermonde and Structured Determinants

The determinant of a Vandermonde matrix is the product of all pairwise differences of the nodes. It is nonzero precisely when all nodes are distinct — the algebraic foundation guaranteeing that a polynomial of degree at most

n-1

is uniquely determined by its values at

n

distinct points.
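
The product formula can be compared against a direct determinant computation. This sketch assumes the convention in which row i of the Vandermonde matrix is (1, x_i, x_i^2, ...); the transposed convention has the same determinant. Function names are illustrative.

```python
from itertools import combinations

def det(A):
    # Cofactor expansion along the first row (fine for small matrices).
    if len(A) == 1:
        return A[0][0]
    return sum(
        (-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
        for j in range(len(A))
    )

def vandermonde(nodes):
    n = len(nodes)
    return [[x ** k for k in range(n)] for x in nodes]

def pairwise_product(nodes):
    result = 1
    for i, j in combinations(range(len(nodes)), 2):  # all pairs with i < j
        result *= nodes[j] - nodes[i]
    return result

nodes = [1, 2, 4]            # distinct nodes: nonzero determinant
repeated = [1, 2, 2]         # repeated node: determinant vanishes
```

With a repeated node two rows of the matrix coincide, which is the matrix-level reason the product contains a zero factor.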

Adjugate & Inverse

(1 formula)

Adjugate Identity

A \cdot \operatorname{adj}(A) = \operatorname{adj}(A) \cdot A = \det(A) \, I

The Adjugate

The product of

A

with its adjugate equals the determinant times the identity. The identity holds for every square matrix, invertible or not, and yields the explicit inverse formula

A^{-1} = \operatorname{adj}(A) / \det(A)

whenever

\det(A) \neq 0

.
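
The identity can be verified on both an invertible and a singular matrix. The sketch below assumes list-of-rows matrices of size at least 2 and builds the adjugate from cofactors exactly as defined earlier; the function names are illustrative.

```python
def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum(
        (-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
        for j in range(len(A))
    )

def cofactor(A, i, j):
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]
    return (-1) ** (i + j) * det(minor)

def adjugate(A):
    n = len(A)
    # Entry (i, j) of adj(A) is the (j, i) cofactor of A (transpose of cof(A)).
    return [[cofactor(A, j, i) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [3, 4]]   # invertible, det = -2
S = [[1, 2], [2, 4]]   # singular, det = 0
```

For the singular matrix `S` the product `S · adj(S)` is the zero matrix, which is still `det(S) · I`.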

Linear Systems

(1 formula)

Cramer's Rule

x_i = \frac{\det(A_i)}{\det(A)}

Cramer

For a square linear system

A\mathbf{x} = \mathbf{b}

with nonzero determinant, each component of the solution is a ratio of determinants. The numerator uses a modified version of

A

with column

i

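replaced by the right-hand side. The rule is of theoretical importance, since it shows that the solution is a rational function of the data, but it is computationally expensive: it requires

n+1

determinant evaluations, each costing about as much as a full elimination, whereas Gaussian elimination solves the whole system in

O(n^3)

operations.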
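
A minimal sketch of the rule for small systems with integer coefficients, using exact fractions. The name `cramer_solve` and the cofactor-based `det` helper are assumptions made for illustration; this is not an efficient solver.

```python
from fractions import Fraction

def det(A):
    if len(A) == 1:
        return A[0][0]
    return sum(
        (-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
        for j in range(len(A))
    )

def cramer_solve(A, b):
    d = det(A)
    if d == 0:
        raise ValueError("Cramer's rule requires det(A) != 0")
    solution = []
    for i in range(len(A)):
        # A_i: copy of A with column i replaced by the right-hand side b.
        A_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(A_i), d))
    return solution

# Illustrative system: 2x + y = 3, x + 3y = 5.
x = cramer_solve([[2, 1], [1, 3]], [3, 5])
```

Substituting the result back into both equations is a quick sanity check on the column replacement.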

Geometric Interpretation

(5 formulas)

Determinant Signed Area 2D

\text{signed area}(\mathbf{u}, \mathbf{v}) = \det\begin{pmatrix} \mathbf{u} & \mathbf{v} \end{pmatrix} = ad - bc

Signed Area in Two Dimensions

The determinant of a

2 \times 2

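matrix equals the signed area of the parallelogram spanned by its columns, here

\mathbf{u} = (a, c)

and

\mathbf{v} = (b, d)

. A positive value means the second column lies counterclockwise from the first, a negative value means it lies clockwise, and zero means the columns are parallel.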

Determinant Signed Volume 3D

\text{signed volume}(\mathbf{a}, \mathbf{b}, \mathbf{c}) = \det\begin{pmatrix} \mathbf{a} & \mathbf{b} & \mathbf{c} \end{pmatrix} = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})

Signed Volume in Three Dimensions

The

3 \times 3

determinant equals the signed volume of the parallelepiped spanned by its column vectors, identical to the scalar triple product. Positive value means right-handed system, negative means left-handed, zero means coplanar.

Determinant Volume Scaling Factor

\operatorname{vol}\bigl(A(S)\bigr) = |\det(A)| \cdot \operatorname{vol}(S)

The General Case: n-Dimensional Volume Scaling

The linear map

\mathbf{x} \mapsto A\mathbf{x}

scales the volume of every

n

-dimensional region by the factor

|\det(A)|

. When

|\det(A)| > 1

volumes expand; when

0 < |\det(A)| < 1

they compress; when

|\det(A)| = 1

they are preserved (as with rotations, reflections, and shears); when

\det(A) = 0

the image collapses to a lower-dimensional subspace.

Triangle Area via Determinant

\text{Area} = \frac{1}{2} \left| \det\begin{pmatrix} x_2 - x_1 & x_3 - x_1 \\ y_2 - y_1 & y_3 - y_1 \end{pmatrix} \right|

Area and Volume Formulas from Determinants

The area of a triangle with vertices

(x_1, y_1)

,

(x_2, y_2)

, and

(x_3, y_3)

is half the absolute value of the determinant whose columns are the edge vectors from vertex

1

to the other two. The triangle occupies exactly half the parallelogram spanned by these edges.

Tetrahedron Volume via Determinant

V = \frac{1}{6} \left| \det\begin{pmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \end{pmatrix} \right|

Area and Volume Formulas from Determinants

The volume of a tetrahedron in

\mathbb{R}^3

equals one-sixth the absolute value of the determinant whose columns are the three edge vectors emanating from any single chosen vertex. The factor

\frac{1}{6}

arises because a tetrahedron occupies one-sixth of the parallelepiped spanned by its three edges.
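
Both the triangle and tetrahedron formulas translate directly into code. The sketch below assumes integer vertex coordinates so that exact fractions can be returned; the function names are illustrative.

```python
from fractions import Fraction

def triangle_area(p1, p2, p3):
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # 2x2 determinant whose columns are the edge vectors from p1.
    d = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    return Fraction(abs(d), 2)

def tetrahedron_volume(p0, p1, p2, p3):
    # Edge vectors from the chosen vertex p0.
    a, b, c = ([q[k] - p0[k] for k in range(3)] for q in (p1, p2, p3))
    # 3x3 determinant with columns a, b, c, expanded along the first row.
    d = (a[0] * (b[1] * c[2] - c[1] * b[2])
         - b[0] * (a[1] * c[2] - c[1] * a[2])
         + c[0] * (a[1] * b[2] - b[1] * a[2]))
    return Fraction(abs(d), 6)
```

A zero result signals degenerate input: collinear vertices for the triangle, coplanar vertices for the tetrahedron.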

Eigenvalue Connection

(1 formula)

Determinant Product of Eigenvalues

\det(A) = \lambda_1 \, \lambda_2 \cdots \lambda_n

The Characteristic Polynomial

The determinant equals the product of all eigenvalues, counted with algebraic multiplicity. Combined with the trace identity

\operatorname{tr}(A) = \sum \lambda_i

, this links the determinant and trace to the eigenvalue spectrum and gives an immediate invertibility criterion:

A

is invertible if and only if no eigenvalue is zero.

Standard Forms

(4 formulas)

Linear Equation Standard Form

a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b

What a Linear System Is

A single linear equation in

n

unknowns. Each unknown

x_i

appears to the first power and is multiplied by a scalar coefficient

a_i

. The right-hand side

b

is a constant. No products of unknowns, no powers above one, no transcendental functions.

Linear System Matrix Form

A\mathbf{x} = \mathbf{b}

Writing a System in Matrix Form

Compresses an entire system of

m

equations in

n

unknowns into a single matrix equation. Each row of

A

encodes one equation; each column corresponds to one unknown. Asking whether the system has a solution becomes asking whether

\mathbf{b}

lies in the column space of

A

Vector Equation Form

x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n = \mathbf{b}

Writing a System in Matrix Form

Recasts the system as a linear combination of the columns of

A

, weighted by the entries of

\mathbf{x}

. Each

\mathbf{a}_j

is the

j

-th column of

A

. The system has a solution exactly when

\mathbf{b}

can be assembled from the columns — that is, when

\mathbf{b}

lies in their span.

Augmented Matrix Construction

[A \mid \mathbf{b}] = \left(\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{array}\right)

The Augmented Matrix

Packages the coefficient matrix

A

and the right-hand side

\mathbf{b}

into a single

m \times (n+1)

matrix. The vertical bar is purely notational. Row operations performed on

[A \mid \mathbf{b}]

correspond directly to legal manipulations of the underlying equations.

Echelon Forms

(4 formulas)

Row Echelon Form Definition

\text{REF: } \begin{pmatrix} \boxed{\ast} & \bullet & \bullet & \bullet & \bullet \\ 0 & \boxed{\ast} & \bullet & \bullet & \bullet \\ 0 & 0 & 0 & \boxed{\ast} & \bullet \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}

Row Echelon Form

A matrix is in row echelon form if (1) every zero row sits at the bottom, (2) the leading nonzero entry of each row (the pivot,

\boxed{\ast}

) is strictly to the right of the pivot in the row above, and (3) every entry below a pivot is zero. Bullets

\bullet

denote arbitrary entries.

Reduced Row Echelon Form Definition

\text{RREF: } \begin{pmatrix} \boxed{1} & 0 & \bullet & 0 & \bullet \\ 0 & \boxed{1} & \bullet & 0 & \bullet \\ 0 & 0 & 0 & \boxed{1} & \bullet \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}

Reduced Row Echelon Form

RREF satisfies all REF conditions plus two more: every pivot equals

1

, and every pivot is the only nonzero entry in its column (zeros above and below). Each pivot column becomes a unit vector. Free columns (bullets) can contain anything.

RREF Uniqueness

\text{RREF}(A) \text{ is unique}

Uniqueness of RREF

Every matrix has exactly one reduced row echelon form. No matter which sequence of row operations is used to reach RREF, the result is identical. This makes pivot positions, rank, and free-variable structure intrinsic properties of the matrix.

Pivot Definition

\text{pivot} = \text{leading nonzero entry of a row in echelon form}

Row Echelon Form

The pivot of a nonzero row in echelon form is its leftmost nonzero entry. The column containing a pivot is a pivot column; all other columns are free columns. The number of pivots equals the rank of the matrix.
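
The echelon-form definitions above can be made concrete with a small reduction routine. This is a sketch under the assumption of exact rational arithmetic and a nonempty list-of-rows matrix; `rref` is an illustrative name, and it returns both the reduced matrix and the 0-based pivot column indices.

```python
from fractions import Fraction

def rref(A):
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivot_cols = []
    r = 0  # index of the next pivot row
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot here: c is a free column
        M[r], M[pivot] = M[pivot], M[r]              # swap into position
        p = M[r][c]
        M[r] = [x / p for x in M[r]]                 # scale the pivot to 1
        for i in range(rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                # Clear every other entry in the pivot column.
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivot_cols.append(c)
        r += 1
    return M, pivot_cols

# Two row-equivalent inputs (rows reordered) reduce to the same RREF.
R1, piv1 = rref([[1, 2, 3], [2, 4, 6], [1, 0, 1]])
R2, piv2 = rref([[1, 0, 1], [2, 4, 6], [1, 2, 3]])
```

Here `len(piv1)` is the rank, and the columns not listed in `piv1` are the free columns.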

Elementary Row Operations

(2 formulas)

Elementary Row Operations

\begin{aligned} R_i &\leftrightarrow R_j \quad \text{(swap)} \\ kR_i &\to R_i, \quad k \neq 0 \quad \text{(scaling)} \\ R_i + cR_j &\to R_i \quad \text{(addition)} \end{aligned}

The Three Elementary Row Operations

The three operations that transform an augmented matrix without altering the solution set. Row swap reorders equations. Row scaling rescales an equation by a nonzero factor. Row addition replaces a row with itself plus a multiple of another row — this is the operation that performs elimination.

Row Equivalence Preserves Solutions

[A \mid \mathbf{b}] \sim [A' \mid \mathbf{b}'] \;\Longrightarrow\; \text{Sol}(A\mathbf{x} = \mathbf{b}) = \text{Sol}(A'\mathbf{x} = \mathbf{b}')

Elementary Row Operations

If two augmented matrices are row-equivalent (one is reachable from the other by a finite sequence of elementary row operations), their associated linear systems have identical solution sets. This is what justifies Gaussian elimination as a solution method — every step preserves the answer.

Solvability

(3 formulas)

Free Variables Count

\text{(number of free variables)} = n - \text{rank}(A)

Pivot Columns and Free Columns

In the echelon form of an

m \times n

coefficient matrix, the number of pivot columns equals

\text{rank}(A)

. The remaining

n - \text{rank}(A)

columns are free, and each contributes one free parameter to the solution. When this count is zero, the solution (if it exists) is unique; when positive, the solution set is infinite.

Solvability Rank Criterion

A\mathbf{x} = \mathbf{b} \text{ is consistent} \iff \text{rank}(A) = \text{rank}([A \mid \mathbf{b}])

The Rouché-Capelli Theorem

A linear system has at least one solution if and only if appending

\mathbf{b}

as a column to the coefficient matrix does not increase the rank. Equivalently,

\mathbf{b}

must lie in the column space of

A

. Also known as the Rouché-Capelli theorem (or Kronecker-Capelli theorem).
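
The rank criterion, together with the free-variable count, can be sketched as follows. The `rank` helper uses forward elimination with exact fractions; both function names are assumptions made for illustration.

```python
from fractions import Fraction

def rank(A):
    M = [[Fraction(x) for x in row] for row in A]
    r = 0  # number of pivots found so far
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def is_consistent(A, b):
    # Consistent iff appending b as a column does not raise the rank.
    augmented = [row + [bi] for row, bi in zip(A, b)]
    return rank(A) == rank(augmented)

A = [[1, 2], [2, 4]]
free_variables = len(A[0]) - rank(A)   # n - rank(A)
```

With this `A`, the right-hand side `[3, 6]` lies in the column space while `[3, 7]` does not.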

Solution Structure Decomposition

\mathbf{x} = \mathbf{x}_p + \mathbf{x}_h, \quad \mathbf{x}_h \in \text{Null}(A)

Structure of the Solution Set

Every solution to a consistent non-homogeneous system

A\mathbf{x} = \mathbf{b}

decomposes into a particular solution

\mathbf{x}_p

plus a homogeneous component

\mathbf{x}_h

from the null space of

A

. The particular solution accounts for

\mathbf{b}

; the null-space component accounts for the freedom. The full solution set is an affine subspace — the null space translated by

\mathbf{x}_p
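
A one-equation illustration (chosen here as an example, with one equation in two unknowns) makes the decomposition explicit: the particular solution fixes a point on the line, and the null-space direction sweeps out the rest of it.

```latex
x_1 + 2x_2 = 3
\quad\Longrightarrow\quad
\mathbf{x} = \underbrace{\begin{pmatrix} 3 \\ 0 \end{pmatrix}}_{\mathbf{x}_p}
+ \; t \underbrace{\begin{pmatrix} -2 \\ 1 \end{pmatrix}}_{\text{spans } \text{Null}(A)},
\qquad t \in \mathbb{R}
```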

Homogeneous Systems

(2 formulas)

Homogeneous Solution Space Dimension

\dim \text{Null}(A) = n - \text{rank}(A)

The Solution Set Is the Null Space

The solution set of a homogeneous system

A\mathbf{x} = \mathbf{0}

equals the null space of

A

— a subspace of

\mathbb{R}^n

. Its dimension (the nullity of

A

) is

n

minus the rank. When the nullity is zero, only the trivial solution exists; when positive, the solution set is a flat through the origin of that dimension.

Underdetermined Homogeneous Has Nontrivial

n > m \;\Longrightarrow\; A\mathbf{x} = \mathbf{0} \text{ has a nontrivial solution}

When Do Nontrivial Solutions Exist?

A homogeneous system with more unknowns than equations always has a nonzero solution. The rank of an

m \times n

matrix cannot exceed

m

, so when

n > m

the rank is strictly less than

n

, leaving at least

n - m

free variables.

Definition & Properties

(3 formulas)

Linear Transformation Definition

T(c\mathbf{u} + d\mathbf{v}) = cT(\mathbf{u}) + dT(\mathbf{v})

What a Linear Transformation Is

A function

T: V \to W

between vector spaces is linear when it preserves both vector addition and scalar multiplication. The single combined condition above packages both — setting

c = d = 1

recovers additivity

T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})

, setting

d = 0

recovers homogeneity

T(c\mathbf{v}) = cT(\mathbf{v})

. Linearity extends to arbitrary linear combinations:

T(\sum c_i \mathbf{v}_i) = \sum c_i T(\mathbf{v}_i)

Zero Vector Preservation

T(\mathbf{0}_V) = \mathbf{0}_W

Consequences of Linearity

Every linear transformation sends the zero vector of the domain to the zero vector of the codomain. This is the fastest necessary check for linearity: if

T(\mathbf{0}) \neq \mathbf{0}

, the function cannot be linear. Translations

T(\mathbf{v}) = \mathbf{v} + \mathbf{b}

with

\mathbf{b} \neq \mathbf{0}

fail this test immediately.

Composition Is Linear

(S \circ T)(\mathbf{u}) = S(T(\mathbf{u})), \qquad [S \circ T] = [S]\,[T]

Composition

The composition of two linear transformations is itself linear. If

T: U \to V

has matrix

A

and

S: V \to W

has matrix

B

, then

S \circ T: U \to W

has matrix

BA

. The matrix-multiplication order matches the composition order:

S

acts after

T

, so

B

multiplies on the left. This is the structural reason matrix multiplication is defined as it is — it encodes function composition exactly.
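
The ordering can be checked on a concrete pair of maps. In this sketch the matrices `A` and `B` and the vector `u` are arbitrary illustrative choices, and `apply` and `matmul` are hypothetical helper names.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def apply(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2], [0, 1], [1, 0]]   # matrix of T: R^2 -> R^3 (3x2)
B = [[1, 0, 2], [0, 1, 1]]     # matrix of S: R^3 -> R^2 (2x3)
u = [3, -1]

step_by_step = apply(B, apply(A, u))   # S(T(u)): apply T first, then S
composed = matmul(B, A)                # matrix of S o T is BA
one_shot = apply(composed, u)
```

Note that `matmul(A, B)` is also defined here but is a 3 × 3 matrix representing a different map, which illustrates why the order matters.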

Matrix Representation

(3 formulas)

Standard Matrix

A = \bigl[\,T(\mathbf{e}_1) \;\; T(\mathbf{e}_2) \;\; \cdots \;\; T(\mathbf{e}_n)\,\bigr]

Constructing the Standard Matrix

For a linear transformation

T: \mathbb{R}^n \to \mathbb{R}^m

, the standard matrix has the images of the standard basis vectors as its columns. Column

j

is

T(\mathbf{e}_j)

— the image of the

j

-th standard basis vector. Together with Linear Map as Matrix Multiplication, this gives a one-to-one correspondence between linear maps

\mathbb{R}^n \to \mathbb{R}^m

and

m \times n

matrices.
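
The column-by-column construction can be sketched in a few lines. The map `T` below is a hypothetical example of a linear map from R^2 to R^3, and `standard_matrix` is an illustrative name.

```python
def standard_matrix(T, n):
    """Build the matrix whose columns are T(e_1), ..., T(e_n), as a list of rows."""
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]  # j-th standard basis vector
        columns.append(T(e_j))
    return [list(row) for row in zip(*columns)]       # transpose columns into rows

def apply(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def T(x):
    # Hypothetical linear map R^2 -> R^3, used only for illustration.
    return [x[0] + 2 * x[1], 3 * x[1], x[0] - x[1]]

A = standard_matrix(T, 2)
```

Once `A` is built, `apply(A, x)` reproduces `T(x)` for any input, which is the content of the next formula.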

Linear Map as Matrix Multiplication

T(\mathbf{x}) = A\mathbf{x} \quad \text{for every } \mathbf{x} \in \mathbb{R}^n

Every Linear Map from Rⁿ to Rᵐ Is Matrix Multiplication

Every linear transformation

T: \mathbb{R}^n \to \mathbb{R}^m

is matrix multiplication by a unique

m \times n

matrix

A

. This is not an optional representation — it is forced by linearity. Conversely, every

m \times n

matrix defines a linear transformation. Linear maps and matrices are the same objects viewed from two perspectives.

Matrix Representation Abstract Bases

[T]_{\mathcal{C} \leftarrow \mathcal{B}} = \bigl[\,[T(\mathbf{v}_1)]_{\mathcal{C}} \;\; \cdots \;\; [T(\mathbf{v}_n)]_{\mathcal{C}}\,\bigr]

Matrices for Abstract Vector Spaces

For a linear transformation

T: V \to W

between abstract vector spaces, the matrix depends on a choice of basis

\mathcal{B} = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}

for

V

and basis

\mathcal{C}

for

W

. Column

j

is the

\mathcal{C}

-coordinate vector of

T(\mathbf{v}_j)

— the scalars that express

T(\mathbf{v}_j)

as a linear combination of

\mathcal{C}

-basis vectors. The standard matrix is the special case where both bases are standard.

Image & Kernel

(5 formulas)

Image Definition

\text{Im}(T) = \{T(\mathbf{v}) : \mathbf{v} \in V\} \subseteq W

The Image

The image (or range) of a linear transformation is the set of all output vectors. It is a subspace of the codomain

W

. For matrix transformations

T(\mathbf{x}) = A\mathbf{x}

, the image is the column space of

A

, and its dimension is

\text{rank}(A)

. The image answers the reachability question:

\mathbf{w} \in \text{Im}(T)

iff

A\mathbf{x} = \mathbf{w}

has a solution.

Kernel Definition

\ker(T) = \{\mathbf{v} \in V : T(\mathbf{v}) = \mathbf{0}\} \subseteq V

The Kernel

The kernel (or null space) of a linear transformation is the set of inputs that map to zero. It is a subspace of the domain

V

. For matrix transformations

T(\mathbf{x}) = A\mathbf{x}

, the kernel is the null space of

A

— all solutions to the homogeneous system

A\mathbf{x} = \mathbf{0}

. The kernel measures the information lost by

T

Injectivity Kernel Criterion

T \text{ injective} \iff \ker(T) = \{\mathbf{0}\}

Injectivity

For linear transformations, injectivity reduces to a single check on the kernel. Equivalently for matrix transformations:

T(\mathbf{x}) = A\mathbf{x}

is injective iff

A

has full column rank, iff every column is a pivot column, iff the columns are linearly independent.

Rank-Nullity for Maps

\dim\text{Im}(T) + \dim\ker(T) = \dim V

The Rank-Nullity Theorem for Maps

For a linear transformation

T: V \to W

with

V

finite-dimensional, the dimensions of the image and kernel sum to the dimension of the domain. The

\dim V

degrees of freedom split between what survives the map (image) and what is annihilated (kernel). This is the abstract version of the matrix rank-nullity theorem.

Bijectivity Equal Dim Case

\dim V = \dim W \Rightarrow \bigl(T \text{ injective} \iff T \text{ surjective} \iff T \text{ bijective}\bigr)

Bijectivity and Isomorphisms

When the domain and codomain have the same finite dimension, the three conditions collapse — verifying any one establishes the others. A bijective linear transformation is an isomorphism; the two spaces are structurally identical as vector spaces. For square matrices, bijectivity corresponds exactly to invertibility.

Similarity & Basis Change

(3 formulas)

Similarity Relation

A' = P^{-1} A P

The Similarity Relation

If a linear operator

T: V \to V

has matrix

A

in basis

\mathcal{B}

and matrix

A'

in basis

\mathcal{C}

, then

A

and

A'

are related by similarity via the change-of-basis matrix

P = P_{\mathcal{B} \leftarrow \mathcal{C}}

, whose columns are the

\mathcal{C}

-basis vectors written in

\mathcal{B}

-coordinates. Two matrices satisfying this relation for some invertible

P

are called similar — they represent the same linear transformation in different coordinate systems.

Similarity Invariants

A' = P^{-1}AP \Rightarrow \begin{cases} \det(A') = \det(A) \\ \text{tr}(A') = \text{tr}(A) \\ \text{rank}(A') = \text{rank}(A) \\ \text{eigenvalues}(A') = \text{eigenvalues}(A) \end{cases}

Properties Preserved by Similarity

Similar matrices share every property intrinsic to the underlying linear transformation, but not properties tied to a specific coordinate representation. Determinant, trace, rank, eigenvalues (with multiplicities), and the characteristic polynomial are all preserved. Individual matrix entries, symmetry, and sparsity are generally not preserved — a symmetric

A

can become non-symmetric under

P^{-1}AP

unless

P

is orthogonal.
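
A small numerical check illustrates both halves of the statement: trace and determinant survive the similarity, while symmetry does not. The matrices are illustrative choices, and the 2 × 2 inverse helper is an assumption made to keep the example self-contained.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(P):
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def trace(A):
    return A[0][0] + A[1][1]

def det_2x2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = [[2, 1], [1, 3]]   # symmetric
P = [[1, 1], [0, 1]]   # invertible but not orthogonal (a shear)

A_prime = matmul(matmul(inverse_2x2(P), A), P)   # P^{-1} A P
```

Here `A_prime` works out to `[[1, -1], [1, 4]]`: same trace and determinant as `A`, but no longer symmetric.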

Diagonalization Formula

A = P D P^{-1}, \quad D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n), \quad P = [\mathbf{v}_1 \;\cdots\; \mathbf{v}_n]

Diagonalization as a Change of Basis

When a linear operator has

n

linearly independent eigenvectors, using them as a basis makes the operator's matrix diagonal:

T(\mathbf{v}_i) = \lambda_i \mathbf{v}_i

. The matrix

P

has the eigenvectors as columns,

D

is the diagonal matrix of corresponding eigenvalues, and similarity gives

A = PDP^{-1}

. Diagonalization reduces matrix powers and exponentials to trivial diagonal operations.
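
The remark about matrix powers can be sketched as follows, using the symmetric matrix with rows (2, 1) and (1, 2) as an illustrative example. Its eigenpairs (eigenvalue 3 with eigenvector (1, 1), eigenvalue 1 with eigenvector (1, -1)) are supplied by hand rather than computed, and the function names are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def power_via_diagonalization(P, eigenvalues, P_inv, k):
    n = len(eigenvalues)
    # D^k is diagonal with the k-th powers of the eigenvalues.
    D_k = [[eigenvalues[i] ** k if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(matmul(P, D_k), P_inv)

def power_by_repeated_multiplication(A, k):
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result

A = [[2, 1], [1, 2]]
P = [[1, 1], [1, -1]]                       # eigenvectors as columns
P_inv = [[Fraction(1, 2), Fraction(1, 2)],
         [Fraction(1, 2), Fraction(-1, 2)]]

A5 = power_via_diagonalization(P, [3, 1], P_inv, 5)
```

Only the scalars 3 and 1 are raised to the fifth power; the two matrix products with `P` and `P_inv` are the same for every `k`.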

Geometric Transformations

(7 formulas)

Rotation Matrix 2D

R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}

Rotations in R²

Counterclockwise rotation by angle

\theta

about the origin in

\mathbb{R}^2

. The first column

(\cos\theta, \sin\theta)

is the image of

\mathbf{e}_1

— the point on the unit circle at angle

\theta

. The second column is the image of

\mathbf{e}_2

— the point at angle

\theta + 90°

. The determinant

\cos^2\theta + \sin^2\theta = 1

confirms that the rotation preserves both orientation and area.
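
A short floating-point sketch of the rotation matrix, checking the column interpretation for a quarter turn. The function names are illustrative, and comparisons use a tolerance because `math.cos(math.pi / 2)` is not exactly zero in floating point.

```python
import math

def rotation_matrix(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def det_2x2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

R = rotation_matrix(math.pi / 2)   # counterclockwise quarter turn
image_e1 = apply(R, [1, 0])        # first column of R, close to (0, 1)
image_e2 = apply(R, [0, 1])        # second column of R, close to (-1, 0)
```

The images of the two basis vectors are exactly the columns of `R`, as the standard-matrix construction predicts.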

Rotation Matrices 3D

R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}, \;\; R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}, \;\; R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}

Rotations in R³

Rotations in

\mathbb{R}^3

about each coordinate axis. The fixed axis appears as a

1

on the diagonal; the other two coordinates form a

2 \times 2

rotation block. The axis of rotation is the eigenvector with eigenvalue

1

— the direction left unchanged. Every

3 \times 3

rotation is orthogonal with determinant

+1

Reflection Across Line 2D

H_\alpha = \begin{pmatrix} \cos 2\alpha & \sin 2\alpha \\ \sin 2\alpha & -\cos 2\alpha \end{pmatrix}

Reflections in R²

Reflection across the line through the origin at angle

\alpha

from the positive

x

-axis. Determinant

-\cos^2 2\alpha - \sin^2 2\alpha = -1

(orientation-reversing). The matrix is orthogonal and involutory:

H_\alpha^2 = I

— reflecting twice returns every vector. Eigenvalues are

+1

(vectors along the mirror line) and

-1

(vectors perpendicular to it).

Householder Reflection

H = I - 2\,\mathbf{n}\mathbf{n}^T

Reflections in R³

Reflection across the hyperplane through the origin with unit normal

\mathbf{n}

. The matrix subtracts twice the component of each vector in the direction of

\mathbf{n}

, effectively mirroring across the perpendicular plane. Householder reflections work in any dimension and are the building blocks of QR decomposition.

Projection onto Line

P = \frac{\mathbf{u}\mathbf{u}^T}{\mathbf{u}^T\mathbf{u}}

Projections

Orthogonal projection onto the line through the origin in direction

\mathbf{u}

. The matrix has rank

1

(image is the line spanned by

\mathbf{u}

). The outer product

\mathbf{u}\mathbf{u}^T

is divided by

\mathbf{u}^T\mathbf{u} = \|\mathbf{u}\|^2

to normalize. When

\mathbf{u}

is a unit vector, the formula simplifies to

P = \mathbf{u}\mathbf{u}^T

Projection onto Plane

P = I - \frac{\mathbf{n}\mathbf{n}^T}{\mathbf{n}^T\mathbf{n}}

Projections

Orthogonal projection onto the plane through the origin with normal vector

\mathbf{n}

. The formula subtracts the component along

\mathbf{n}

from each input, leaving only the component orthogonal to the normal, which lies in the plane. This is closely related to the Householder reflection: projection subtracts the

\mathbf{n}

-component once; reflection subtracts it twice.
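The "once versus twice" relationship can be checked numerically. The following Python sketch builds both matrices from the same normal; the helper names outer_over_norm, identity_minus, and apply are made up for this illustration.

```python
def outer_over_norm(n):
    """Return n n^T / (n^T n), the projection onto the line through n."""
    norm_sq = sum(x * x for x in n)
    return [[a * b / norm_sq for b in n] for a in n]


def identity_minus(matrix, scale):
    """Return I - scale * matrix for a square matrix stored as rows."""
    size = len(matrix)
    return [
        [(1.0 if i == j else 0.0) - scale * matrix[i][j] for j in range(size)]
        for i in range(size)
    ]


def apply(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


n = [0.0, 0.0, 2.0]                            # normal need not be unit length
line = outer_over_norm(n)
plane_projection = identity_minus(line, 1.0)   # subtract the n-component once
reflection = identity_minus(line, 2.0)         # subtract it twice

v = [1.0, 2.0, 3.0]
projected = apply(plane_projection, v)   # lands in the plane z = 0
mirrored = apply(reflection, v)          # lands on the far side of that plane
```

Here the plane is z = 0, so the projection zeroes the third coordinate and the reflection flips its sign.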

Shear Matrix

\text{Shear}_x = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}, \qquad \text{Shear}_y = \begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}

Shears

A shear slides each point parallel to a fixed line by an amount proportional to its signed distance from that line. The horizontal shear

\text{Shear}_x

shifts the

x

-coordinate by

k

times the

y

-coordinate; the vertical shear

\text{Shear}_y

shifts the

y

-coordinate by

k

times the

x

-coordinate. Both are triangular with determinant

1

— area-preserving and orientation-preserving — but not orthogonal: angles are distorted.

Foundation

(3 formulas)

Eigenvalue Definition

A\mathbf{v} = \lambda\mathbf{v}, \quad \mathbf{v} \neq \mathbf{0}

Definition

For a square matrix

A

, a nonzero vector

\mathbf{v}

is an eigenvector if

A\mathbf{v}

is a scalar multiple of

\mathbf{v}

. The scalar

\lambda

is the corresponding eigenvalue. Geometrically, eigenvectors are the directions that the linear transformation

\mathbf{x} \mapsto A\mathbf{x}

preserves — it stretches, compresses, or reverses along these directions without deflecting them.

Characteristic Equation

\det(A - \lambda I) = 0

From Eigenvectors to the Determinant Condition

The eigenvalue equation

A\mathbf{v} = \lambda\mathbf{v}

rearranges to

(A - \lambda I)\mathbf{v} = \mathbf{0}

, a homogeneous system. Nontrivial solutions exist iff

A - \lambda I

is singular — iff its determinant vanishes. This converts the geometric eigenvalue question ("which directions are preserved?") into the algebraic one ("which

\lambda

make this determinant zero?"). The eigenvalues are exactly the roots.

Eigenspace

E_\lambda = \text{Null}(A - \lambda I) = \{\mathbf{v} : A\mathbf{v} = \lambda\mathbf{v}\}

Eigenspaces

The eigenspace of eigenvalue

\lambda

is the set of all vectors

\mathbf{v}

satisfying

A\mathbf{v} = \lambda\mathbf{v}

, including the zero vector. It equals the null space of

A - \lambda I

and is a subspace of

\mathbb{R}^n

. Any linear combination of eigenvectors sharing the same eigenvalue is again an eigenvector for that eigenvalue (or zero). The dimension of

E_\lambda

is the geometric multiplicity

m_g(\lambda)

Characteristic Polynomial

(3 formulas)

Characteristic Polynomial

p(\lambda) = \det(A - \lambda I)

The Characteristic Polynomial

The characteristic polynomial of an

n \times n

matrix is a polynomial of degree

n

in

\lambda

. Its roots are exactly the eigenvalues. The polynomial packages the entire eigenvalue spectrum into a single algebraic expression — its leading term is

(-1)^n \lambda^n

, its constant term is

p(0) = \det(A)

, and the coefficient of

\lambda^{n-1}

is

(-1)^{n-1}\text{tr}(A)

Characteristic Polynomial 2x2

p(\lambda) = \lambda^2 - \text{tr}(A)\,\lambda + \det(A)

Computing the Characteristic Polynomial: 2×2

For a

2 \times 2

matrix, the characteristic polynomial collapses to a quadratic in trace and determinant. Eigenvalues follow from the quadratic formula:

\lambda = \bigl(\text{tr}(A) \pm \sqrt{\text{tr}(A)^2 - 4\det(A)}\bigr)/2

. The discriminant classifies the eigenvalue type — see Discriminant Classification 2x2.
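A minimal Python sketch of the quadratic-formula route, assuming the matrix is passed as its four entries. The function name eigenvalues_2x2 is hypothetical; cmath.sqrt is used so that a negative discriminant yields a complex pair instead of an error.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from trace and determinant."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det      # its sign classifies the spectrum
    root = cmath.sqrt(disc)             # complex square root handles disc < 0
    return (trace + root) / 2, (trace - root) / 2


real_pair = eigenvalues_2x2(2, 1, 1, 2)       # disc = 4 > 0
complex_pair = eigenvalues_2x2(0, -1, 1, 0)   # disc = -4 < 0, a quarter-turn rotation
```

The first matrix has eigenvalues 3 and 1; the second, a rotation by 90 degrees, has the conjugate pair i and -i.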

Cayley-Hamilton

p(A) = O

The Cayley-Hamilton Theorem

Every square matrix satisfies its own characteristic polynomial. Substituting

A

for

\lambda

in

p(\lambda)

(with constants multiplied by

I

) yields the zero matrix. The theorem provides a recurrence reducing any power

A^k

with

k \geq n

to a polynomial in

A

of degree at most

n-1

, and expresses

A^{-1}

(when it exists) as a polynomial in

A
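For a 2×2 matrix the theorem reads A² − tr(A)·A + det(A)·I = O, and rearranging gives the inverse as (tr(A)·I − A)/det(A). The Python sketch below checks both on one example matrix chosen for illustration; matmul is a hypothetical helper.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


A = [[2, 1], [1, 3]]
I2 = [[1, 0], [0, 1]]
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# p(A) = A^2 - tr(A) A + det(A) I should be the zero matrix.
A2 = matmul(A, A)
p_of_A = [
    [A2[i][j] - trace * A[i][j] + det * I2[i][j] for j in range(2)]
    for i in range(2)
]

# Rearranging p(A) = O gives the inverse as a degree-one polynomial in A.
A_inv = [[(trace * I2[i][j] - A[i][j]) / det for j in range(2)] for i in range(2)]
check = matmul(A, A_inv)    # approximately the identity
```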

Multiplicities

(2 formulas)

Multiplicity Inequality

1 \leq m_g(\lambda) \leq m_a(\lambda)

Algebraic and Geometric Multiplicity

Every eigenvalue has two multiplicities: algebraic (

m_a

, the root multiplicity in the characteristic polynomial) and geometric (

m_g

, the dimension of the eigenspace). The geometric multiplicity is at least

1

(eigenspaces contain a nonzero eigenvector by definition) and at most the algebraic multiplicity. Equality across all eigenvalues is the diagonalizability condition.

Independence of Distinct Eigenvectors

\lambda_1, \ldots, \lambda_k \text{ distinct} \Rightarrow \{\mathbf{v}_1, \ldots, \mathbf{v}_k\} \text{ linearly independent}

Independence of Eigenvectors

Eigenvectors corresponding to distinct eigenvalues are always linearly independent. This is the key structural fact making distinct-eigenvalue matrices automatically diagonalizable. The proof is by induction: from

\sum c_i \mathbf{v}_i = \mathbf{0}

, multiply by

A

and subtract

\lambda_k

times the original to eliminate

\mathbf{v}_k

, then apply the induction hypothesis.

Eigenvalue Algebra

(4 formulas)

Eigenvalue of Power

A\mathbf{v} = \lambda\mathbf{v} \Rightarrow A^k\mathbf{v} = \lambda^k\mathbf{v}

Eigenvalues of Powers and Polynomials

Raising

A

to a power raises every eigenvalue to that power, while preserving the eigenvectors. The eigenvector basis is invariant under power operations — only the scaling factors change. Combined with diagonalization, this gives the cheap-power formula

A^k = PD^kP^{-1}

Eigenvalue of Inverse

A\mathbf{v} = \lambda\mathbf{v} \Rightarrow A^{-1}\mathbf{v} = \frac{1}{\lambda}\mathbf{v}

Eigenvalues of the Inverse

The eigenvalues of

A^{-1}

are the reciprocals of the eigenvalues of

A

, with the same eigenvectors. Invertibility of

A

guarantees

\lambda \neq 0

, so reciprocation is always defined.

Eigenvalue of Polynomial

q(A)\mathbf{v} = q(\lambda)\mathbf{v}

Eigenvalues of Powers and Polynomials

For any polynomial

q(t) = c_0 + c_1 t + \cdots + c_m t^m

, the matrix polynomial

q(A) = c_0 I + c_1 A + \cdots + c_m A^m

has eigenvalues

q(\lambda_i)

with the same eigenvectors. This generalizes the power and shift identities into a single law: eigenvalues transform by

q

while eigenvectors stay fixed.

Eigenvalue Shift

A\mathbf{v} = \lambda\mathbf{v} \Rightarrow (A + cI)\mathbf{v} = (\lambda + c)\mathbf{v}

Eigenvalue Shifting

Adding

cI

to

A

shifts every eigenvalue by

c

while leaving eigenvectors unchanged. Useful for making a matrix positive definite (shifting all eigenvalues positive) or for shifting a known eigenvalue to zero (the eigenvalue equation

(A - \lambda_0 I)\mathbf{v} = \mathbf{0}

is exactly this construction).

Diagonalizability

(2 formulas)

Diagonalizability Condition

A \text{ diagonalizable} \iff m_g(\lambda) = m_a(\lambda) \text{ for every eigenvalue } \lambda

When Is a Matrix Diagonalizable?

A matrix is diagonalizable iff its eigenvectors span

\mathbb{R}^n

, or equivalently iff the characteristic polynomial splits into linear factors (automatic over the complex numbers; over the reals it requires every eigenvalue to be real) and each eigenvalue's geometric multiplicity matches its algebraic multiplicity. When the geometric multiplicity falls short for any eigenvalue, the matrix is defective: there are not enough eigenvectors to form a basis, and the best achievable form is the Jordan normal form rather than a diagonal matrix.

Distinct Eigenvalues Imply Diagonalizable

A \text{ has } n \text{ distinct eigenvalues} \Rightarrow A \text{ is diagonalizable}

When Is a Matrix Diagonalizable?

An

n \times n

matrix with

n

distinct eigenvalues is automatically diagonalizable. By Independence of Distinct Eigenvectors, the

n

eigenvectors are linearly independent, providing exactly enough vectors for an eigenvector basis. The converse fails: diagonalizable matrices can have repeated eigenvalues (e.g.,

cI

).

Special Spectra

(1 formula)

Special Matrix Eigenvalue Restrictions

\begin{aligned} \text{symmetric (real)}: \;& \lambda \in \mathbb{R} \\ \text{skew-symmetric (real)}: \;& \lambda = 0 \text{ or } \lambda \in i\mathbb{R} \\ \text{orthogonal}: \;& |\lambda| = 1 \\ \text{idempotent}: \;& \lambda \in \{0, 1\} \\ \text{nilpotent}: \;& \lambda = 0 \\ \text{involutory}: \;& \lambda \in \{-1, +1\} \\ \text{positive definite}: \;& \lambda > 0 \end{aligned}

Eigenvalues of Special Matrix Types

Structural properties of a matrix restrict its eigenvalue spectrum. Many follow directly from the defining equation: if

A^2 = A

then

\lambda^2 = \lambda

, forcing

\lambda \in \{0, 1\}

; if

A^k = O

then

\lambda^k = 0

, forcing

\lambda = 0

; if

A^2 = I

then

\lambda^2 = 1

, forcing

\lambda = \pm 1

. For symmetric matrices, the real-eigenvalue property is deeper (Spectral Theorem). For orthogonal matrices,

\|Q\mathbf{v}\| = \|\mathbf{v}\|

forces

|\lambda| = 1

Spectral

(2 formulas)

Spectral Theorem

A = A^T \Rightarrow A = Q D Q^T, \quad Q^T Q = I, \quad D = \text{diag}(\lambda_1, \ldots, \lambda_n) \in \mathbb{R}^{n \times n}

The Spectral Theorem for Symmetric Matrices

Every real symmetric matrix is orthogonally diagonalizable: the diagonalizing matrix

Q

can be chosen with orthonormal columns of eigenvectors, and the diagonal

D

contains real eigenvalues. This is stronger than ordinary diagonalizability: it guarantees real eigenvalues, mutually orthogonal eigenvectors, and a numerically stable diagonalization (since

Q^{-1} = Q^T

).

Spectral Decomposition

A = \sum_{i=1}^{n} \lambda_i \, \mathbf{q}_i \mathbf{q}_i^T

The Spectral Theorem for Symmetric Matrices

The spectral theorem

A = QDQ^T

expands into a sum of rank-one projections: each

\mathbf{q}_i\mathbf{q}_i^T

is the orthogonal projection onto the eigenspace direction

\mathbf{q}_i

, weighted by eigenvalue

\lambda_i

. This decomposition makes the matrix's action transparent — every input is projected onto each eigendirection, scaled by the corresponding eigenvalue, and summed.

Applications

(1 formula)

Matrix Exponential

e^{At} = P\, e^{Dt}\, P^{-1} = P \,\text{diag}(e^{\lambda_1 t}, \ldots, e^{\lambda_n t})\, P^{-1}

Matrix Exponential

For a diagonalizable matrix, the matrix exponential reduces to scalar exponentials of the eigenvalues. The solution to

\mathbf{x}' = A\mathbf{x}

with

\mathbf{x}(0) = \mathbf{x}_0

is

\mathbf{x}(t) = e^{At}\mathbf{x}_0

, generalizing the scalar

x(t) = e^{at}x_0

. The general defining series

e^{At} = \sum_{k=0}^\infty (At)^k/k!

converges for any square matrix, but the explicit formula above only applies in the diagonalizable case.
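To see the two descriptions agree, the Python sketch below compares the eigenvalue formula with a truncated version of the defining series on one diagonalizable example. The matrix, its eigenvector matrix P, and P_inv were worked out by hand for this illustration, and the helper names and the 40-term cutoff are assumptions rather than a recommended method.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def expm_series(A, t, terms=40):
    """Truncated defining series: sum of (At)^k / k! for k < terms."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (At)^k / k! by multiplying the previous term by At / k.
        term = matmul(term, [[a * t / k for a in row] for row in A])
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


def expm_diagonalized(P, eigenvalues, P_inv, t):
    """P diag(e^{lambda_i t}) P^{-1} for a matrix already diagonalized by hand."""
    n = len(eigenvalues)
    E = [
        [math.exp(eigenvalues[i] * t) if i == j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    return matmul(matmul(P, E), P_inv)


A = [[0.0, 1.0], [-2.0, -3.0]]    # eigenvalues -1 and -2
P = [[1.0, 1.0], [-1.0, -2.0]]    # columns are the eigenvectors (1, -1) and (1, -2)
P_inv = [[2.0, 1.0], [-1.0, -1.0]]

via_eigen = expm_diagonalized(P, [-1.0, -2.0], P_inv, t=1.0)
via_series = expm_series(A, t=1.0)
```

The two results should match to rounding error; for matrices with large norm a plain truncated series is a poor numerical choice, so this is only a consistency check.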

Complex

(3 formulas)

Complex Conjugate Pairs

A \in \mathbb{R}^{n \times n}, \;\; A\mathbf{v} = \lambda\mathbf{v} \Rightarrow A\bar{\mathbf{v}} = \bar{\lambda}\bar{\mathbf{v}}

Conjugate Pairs

Complex eigenvalues of a real matrix always come in conjugate pairs, with conjugate eigenvectors. The proof leverages that the characteristic polynomial has real coefficients: complex roots of a real polynomial pair with their conjugates. One consequence: every odd-dimensional real matrix has at least one real eigenvalue.

Discriminant Classification 2x2

\Delta = \text{tr}(A)^2 - 4\det(A) \quad \begin{cases} \Delta > 0: & \text{two distinct real eigenvalues} \\ \Delta = 0: & \text{one repeated real eigenvalue} \\ \Delta < 0: & \text{complex conjugate pair} \end{cases}

When Complex Eigenvalues Appear

For a

2 \times 2

real matrix, the discriminant of the characteristic polynomial

\lambda^2 - \text{tr}(A)\lambda + \det(A)

classifies the eigenvalue structure. Negative discriminant means no real direction is preserved — the transformation rotates rather than purely stretching.

Real Canonical Form 2x2

\lambda = a \pm bi \Rightarrow P^{-1}AP = \begin{pmatrix} a & -b \\ b & a \end{pmatrix} = r\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}

Real Canonical Form

A

2 \times 2

real matrix with complex eigenvalues

a \pm bi

is similar (over

\mathbb{R}

) to a rotation-scaling matrix. Taking an eigenvector

\mathbf{v} = \mathbf{u} + i\mathbf{w}

for the eigenvalue

a - bi

, its real and imaginary parts form the columns of

P = [\mathbf{u} \;\; \mathbf{w}]

. The transformation rotates by

\theta = \operatorname{atan2}(b, a)

(the argument of

a + bi

, which equals

\arctan(b/a)

when

a > 0

) and scales by

r = \sqrt{a^2 + b^2}
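The similarity can be verified on a small integer example. In the Python sketch below the matrix and its eigenvector were computed by hand for illustration, and the eigenvector is taken for the eigenvalue a − bi, which is the choice that produces the block with −b in the upper right; matmul is a hypothetical helper.

```python
import math


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


A = [[1, -2], [1, 3]]       # trace 4, det 5, eigenvalues 2 +/- i
a, b = 2, 1

# An eigenvector for 2 - i is v = (-1 - i, 1); split it into real and imaginary parts.
u, w = [-1, 1], [-1, 0]
P = [[u[0], w[0]], [u[1], w[1]]]    # columns u and w
C = [[a, -b], [b, a]]

# P^{-1} A P = C is equivalent to A P = P C, which avoids inverting P.
left, right = matmul(A, P), matmul(P, C)

r = math.hypot(a, b)         # scaling factor sqrt(a^2 + b^2)
theta = math.atan2(b, a)     # rotation angle, valid in every quadrant
```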

Inner Product

(5 formulas)

Cauchy-Schwarz Inequality (General)

|\mathbf{u} \cdot \mathbf{v}| \leq \|\mathbf{u}\| \, \|\mathbf{v}\|

The Cauchy-Schwarz Inequality

The absolute value of the dot product never exceeds the product of the norms. Equality holds iff the vectors are parallel (one is a scalar multiple of the other). The inequality is what makes the angle formula

\cos\theta = (\mathbf{u}\cdot\mathbf{v})/(\|\mathbf{u}\|\|\mathbf{v}\|)

legitimate — it guarantees the right-hand side stays in

[-1, 1]

Triangle Inequality (Inner Product)

\|\mathbf{u} + \mathbf{v}\| \leq \|\mathbf{u}\| + \|\mathbf{v}\|

The Triangle Inequality

The length of a sum is bounded by the sum of lengths — one side of a triangle never exceeds the sum of the other two. Equality holds iff

\mathbf{u}

and

\mathbf{v}

point in the same direction (one is a non-negative scalar multiple of the other). The inequality is what makes the distance function a valid metric.

Pythagorean Theorem

\mathbf{u} \cdot \mathbf{v} = 0 \Rightarrow \|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2

The Pythagorean Theorem

When two vectors are orthogonal, the squared length of their sum decomposes into the sum of squared lengths. This generalizes the elementary plane-geometry theorem to

\mathbb{R}^n

and to any inner product space. It is the underlying reason orthogonal decompositions are computationally clean: lengths split additively across perpendicular components.

Inner Product Axioms

\begin{aligned} \text{Symmetry:} \;& \langle\mathbf{u},\mathbf{v}\rangle = \langle\mathbf{v},\mathbf{u}\rangle \\ \text{Linearity:} \;& \langle c\mathbf{u}+d\mathbf{w},\mathbf{v}\rangle = c\langle\mathbf{u},\mathbf{v}\rangle + d\langle\mathbf{w},\mathbf{v}\rangle \\ \text{Positive definite:} \;& \langle\mathbf{v},\mathbf{v}\rangle > 0 \text{ for } \mathbf{v} \neq \mathbf{0} \end{aligned}

General Inner Products

An inner product on a vector space

V

is any function

\langle\cdot,\cdot\rangle: V \times V \to \mathbb{R}

satisfying these three axioms. The standard dot product is one example; weighted inner products

\langle\mathbf{u},\mathbf{v}\rangle = \mathbf{u}^T W \mathbf{v}

(with

W

symmetric positive definite),

L^2

function integrals, and the Frobenius matrix inner product are others. Every inner product induces a norm, distance, and notion of orthogonality.

Distance Formula (Inner Product)

d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\| = \sqrt{\sum_{i=1}^{n} (u_i - v_i)^2}

Distance

The Euclidean distance between two vectors is the length of their difference. The function

d

satisfies the metric axioms: non-negativity with equality iff

\mathbf{u} = \mathbf{v}

, symmetry

d(\mathbf{u},\mathbf{v}) = d(\mathbf{v},\mathbf{u})

, and the triangle inequality

d(\mathbf{u},\mathbf{w}) \leq d(\mathbf{u},\mathbf{v}) + d(\mathbf{v},\mathbf{w})

Orthogonal Complement

(3 formulas)

Orthogonal Complement Definition

W^\perp = \{\mathbf{v} \in \mathbb{R}^n : \mathbf{v} \cdot \mathbf{w} = 0 \text{ for all } \mathbf{w} \in W\}

Orthogonal Complements

The orthogonal complement of a subspace

W \subseteq \mathbb{R}^n

is the set of all vectors perpendicular to every vector in

W

. It is itself a subspace of

\mathbb{R}^n

. Taking the complement twice returns the original:

(W^\perp)^\perp = W

. The complement structure underlies projection, least squares, and the four fundamental subspaces.

Complement Dimension Sum

\dim(W) + \dim(W^\perp) = n

Orthogonal Complements

For any subspace

W

of

\mathbb{R}^n

, the dimensions of

W

and its orthogonal complement add to the ambient dimension. Together they span all of

\mathbb{R}^n

as a direct sum

\mathbb{R}^n = W \oplus W^\perp

— every vector decomposes uniquely into a

W

-component and a

W^\perp

-component.

Orthogonal Decomposition (Subspace)

\mathbf{v} = \hat{\mathbf{v}} + \mathbf{z}, \quad \hat{\mathbf{v}} \in W, \quad \mathbf{z} \in W^\perp

The Orthogonal Decomposition

Every vector decomposes uniquely into a component in a chosen subspace

W

and a component in its orthogonal complement. The

W

-component is the orthogonal projection

\hat{\mathbf{v}} = \text{proj}_W\mathbf{v}

, the closest point in

W

to

\mathbf{v}

. The residual

\mathbf{z}

is perpendicular to all of

W

and equals

\mathbf{v}

minus its projection.

Orthogonal Sets

(4 formulas)

Orthogonal Set Independence

\mathbf{v}_i \cdot \mathbf{v}_j = 0 \;\; (i \neq j), \;\; \mathbf{v}_i \neq \mathbf{0} \;\Rightarrow\; \{\mathbf{v}_1, \ldots, \mathbf{v}_k\} \text{ linearly independent}

Orthogonal Sets Are Independent

Any orthogonal set of nonzero vectors is automatically linearly independent — no separate independence check is needed. This is one of the central reasons orthogonality simplifies linear algebra: orthogonal sets come pre-equipped with the independence property that general sets require effort to verify.

Orthonormal Set

\mathbf{q}_i \cdot \mathbf{q}_j = \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}

Orthonormal Sets

An orthonormal set is an orthogonal set of unit vectors. The Kronecker delta

\delta_{ij}

packages both conditions: pairwise orthogonality (

i \neq j

entries vanish) and unit length (

i = j

entries equal

1

). Any orthogonal set normalizes to orthonormal by dividing each vector by its length.

Coordinates via Orthonormal Basis

\mathbf{x} = \sum_{i=1}^{n} (\mathbf{x} \cdot \mathbf{q}_i)\,\mathbf{q}_i, \qquad c_i = \mathbf{x} \cdot \mathbf{q}_i

Coordinates via Dot Products

For an orthonormal basis, the coordinate of

\mathbf{x}

along each basis vector is a single dot product. No system to solve, no matrix to invert — each

c_i

is computed independently. This is the defining computational advantage of orthonormal bases.

Parseval Identity

\|\mathbf{x}\|^2 = \sum_{i=1}^{n} (\mathbf{x} \cdot \mathbf{q}_i)^2 = \sum_{i=1}^{n} c_i^2

Parseval

In an orthonormal basis, the squared length of a vector equals the sum of squared coordinates. This is the Pythagorean theorem applied to the orthonormal expansion

\mathbf{x} = \sum c_i \mathbf{q}_i

— orthogonal components contribute additively to squared length, with no cross-terms.
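A small Python check of both the dot-product coordinates and the Parseval identity, using an orthonormal basis of the plane chosen for this illustration:

```python
def dot(u, v):
    """Standard dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


q1, q2 = [0.6, 0.8], [-0.8, 0.6]    # an orthonormal basis of R^2
x = [1.0, 2.0]

# Each coordinate is one dot product; nothing is solved or inverted.
c1, c2 = dot(x, q1), dot(x, q2)

# Rebuild x from its coordinates and compare squared lengths.
rebuilt = [c1 * a + c2 * b for a, b in zip(q1, q2)]
length_sq = dot(x, x)
coord_sq = c1 * c1 + c2 * c2
```

Up to rounding, rebuilt equals x and coord_sq equals length_sq, which is 5 here.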

Projection

(5 formulas)

Projection onto Subspace

\hat{\mathbf{b}} = A(A^TA)^{-1}A^T\,\mathbf{b}, \qquad P = A(A^TA)^{-1}A^T

Projection with an Arbitrary Basis

When the columns of

A

form a basis for a subspace

W

, the orthogonal projection of

\mathbf{b}

onto

W

is computed by the formula above. The matrix

P = A(A^TA)^{-1}A^T

is the projection matrix — it maps any vector to its projection. This is the general formula that works regardless of whether the basis is orthogonal.

Projection onto Orthonormal Basis

\text{proj}_W\,\mathbf{b} = \sum_{i=1}^{k} (\mathbf{q}_i \cdot \mathbf{b})\,\mathbf{q}_i

Projection with an Orthogonal Basis

When the basis

\{\mathbf{q}_1, \ldots, \mathbf{q}_k\}

for

W

is orthonormal, projection decomposes into

k

independent single-vector projections. Orthogonality eliminates cross-talk: each component is computed by one dot product, with no interference between basis vectors. This is the cleanest projection formula in linear algebra.

Orthonormal Columns Projection

P = QQ^T, \qquad Q^TQ = I_k

The Projection Matrix

When the columns of

Q

form an orthonormal basis for

W

, the projection matrix onto

W

collapses to the product

QQ^T

. No inversion is required —

A^TA

becomes

I_k

and disappears from the general formula. This is the most numerically stable form of projection.

Projection Matrix Properties

P^2 = P, \qquad P^T = P

Properties of Orthogonal Projections

An orthogonal projection matrix is idempotent and symmetric. Idempotence (

P^2 = P

) reflects that projecting twice gives the same result as projecting once — vectors already in

W

are fixed by

P

. Symmetry (

P^T = P

) is what makes the projection orthogonal rather than oblique: it forces the residual perpendicular to

W

, not merely transverse to it. Any matrix satisfying both conditions is an orthogonal projection onto some subspace.

Complementary Projection

P_{W^\perp} = I - P_W, \qquad P_W\mathbf{b} + (I - P_W)\mathbf{b} = \mathbf{b}

The Projection Matrix

If

P

projects onto

W

, then

I - P

projects onto

W^\perp

. The two projections together decompose every vector into its

W

-component and its perpendicular residual. The complementary projection inherits both defining properties:

(I - P)^2 = I - P

and

(I - P)^T = I - P

Gram-Schmidt

(1 formula)

Gram-Schmidt Process

\mathbf{u}_1 = \mathbf{v}_1, \qquad \mathbf{u}_j = \mathbf{v}_j - \sum_{i=1}^{j-1} \frac{\mathbf{u}_i \cdot \mathbf{v}_j}{\mathbf{u}_i \cdot \mathbf{u}_i}\,\mathbf{u}_i, \qquad \mathbf{q}_j = \frac{\mathbf{u}_j}{\|\mathbf{u}_j\|}

The Algorithm: General Case

The Gram-Schmidt process converts any linearly independent set

\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}

into an orthogonal set

\{\mathbf{u}_1, \ldots, \mathbf{u}_k\}

spanning the same subspace. At each step

j

, the vector

\mathbf{v}_j

has its projections onto all previously computed orthogonal vectors subtracted, leaving only the component perpendicular to

\text{Span}\{\mathbf{u}_1, \ldots, \mathbf{u}_{j-1}\}

. Optionally each

\mathbf{u}_j

is normalized to produce an orthonormal set.
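The recurrence translates almost line for line into code. Below is a minimal Python sketch of the classical process, assuming the input vectors are linearly independent (there is no guard against a zero vector appearing); gram_schmidt and dot are hypothetical helper names.

```python
import math


def dot(u, v):
    """Standard dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Classical Gram-Schmidt.

    Returns (us, qs): the orthogonal vectors u_j and their normalized
    versions q_j. Assumes the inputs are linearly independent.
    """
    us = []
    for v in vectors:
        u = list(v)
        for prev in us:
            # Subtract the projection of v onto each earlier orthogonal vector.
            coeff = dot(prev, v) / dot(prev, prev)
            u = [ui - coeff * pi for ui, pi in zip(u, prev)]
        us.append(u)
    qs = [[x / math.sqrt(dot(u, u)) for x in u] for u in us]
    return us, qs


us, qs = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
```

For this input the second orthogonal vector is (0.5, −0.5, 1). In floating point, the modified variant (projecting the running remainder rather than the original vector) is usually preferred for stability.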

Least Squares

(3 formulas)

Normal Equations

A^TA\,\hat{\mathbf{x}} = A^T\mathbf{b}

The Normal Equations

When

A\mathbf{x} = \mathbf{b}

has no exact solution, the least-squares solution

\hat{\mathbf{x}}

— the minimizer of

\|A\mathbf{x} - \mathbf{b}\|^2

— satisfies this square

n \times n

system regardless of

A

's shape. The equations express the orthogonality condition: the residual

\mathbf{b} - A\hat{\mathbf{x}}

is perpendicular to every column of

A

Least-Squares Solution

\hat{\mathbf{x}} = (A^TA)^{-1} A^T \mathbf{b}, \qquad A^+ = (A^TA)^{-1}A^T

The Normal Equations

When

A

has full column rank, the normal equations have the unique closed-form solution above. The matrix

A^+ = (A^TA)^{-1}A^T

is the left pseudoinverse of

A

— it satisfies

A^+A = I_n

. The projection of

\mathbf{b}

onto the column space is

A\hat{\mathbf{x}} = AA^+\mathbf{b} = P\mathbf{b}
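As a worked instance, the Python sketch below fits a line b ≈ c + d·t to three points by forming and solving the 2×2 normal equations directly. The function name least_squares_line and the data are assumptions made for illustration; the design matrix has rows (1, t_i).

```python
def least_squares_line(ts, bs):
    """Fit b ~ c + d*t by solving the 2x2 normal equations A^T A x = A^T b.

    The design matrix A has rows (1, t_i); it has full column rank as long
    as at least two of the t_i differ.
    """
    m = len(ts)
    # Entries of A^T A and A^T b, written out for the two-column case.
    s_t = sum(ts)
    s_tt = sum(t * t for t in ts)
    s_b = sum(bs)
    s_tb = sum(t * b for t, b in zip(ts, bs))
    det = m * s_tt - s_t * s_t
    c = (s_b * s_tt - s_t * s_tb) / det    # Cramer's rule on the 2x2 system
    d = (m * s_tb - s_t * s_b) / det
    return c, d


ts, bs = [0.0, 1.0, 2.0], [6.0, 0.0, 0.0]
c, d = least_squares_line(ts, bs)
residual = [b - (c + d * t) for t, b in zip(ts, bs)]

# The residual should be orthogonal to both columns of A.
dot_with_ones = sum(residual)
dot_with_ts = sum(r * t for r, t in zip(residual, ts))
```

The fitted line is 5 − 3t, and the residual (1, −2, 1) is perpendicular to both columns of the design matrix.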

Least-Squares via QR

A = QR \;\Rightarrow\; R\,\hat{\mathbf{x}} = Q^T\mathbf{b}

Least Squares via QR

Using the QR factorization

A = QR

, the normal equations reduce to an upper triangular system, solved by back substitution. This avoids forming

A^TA

explicitly, which is numerically critical: the condition number of

A^TA

is the square of the condition number of

A

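, so direct normal-equations approaches amplify rounding errors. QR-based least squares is a standard approach in numerical software: LAPACK provides a QR-based driver and MATLAB's backslash operator uses QR for rectangular systems, while some routines, such as NumPy's lstsq, rely on an SVD-based LAPACK driver instead.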
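A sketch of the QR route in plain Python, assuming a tall matrix with independent columns supplied column by column. The factorization here uses classical Gram-Schmidt purely for brevity (library routines typically use Householder reflections), and thin_qr and lstsq_qr are hypothetical names.

```python
import math


def dot(u, v):
    """Standard dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def thin_qr(columns):
    """Gram-Schmidt QR of a matrix given as a list of independent columns.

    Returns (qs, R): the orthonormal columns and the upper triangular factor.
    """
    n = len(columns)
    qs, R = [], [[0.0] * n for _ in range(n)]
    for j, a in enumerate(columns):
        u = list(a)
        for i, q in enumerate(qs):
            R[i][j] = dot(q, a)
            u = [uk - R[i][j] * qk for uk, qk in zip(u, q)]
        R[j][j] = math.sqrt(dot(u, u))
        qs.append([uk / R[j][j] for uk in u])
    return qs, R


def lstsq_qr(columns, b):
    """Solve R x = Q^T b by back substitution."""
    qs, R = thin_qr(columns)
    rhs = [dot(q, b) for q in qs]    # the vector Q^T b
    n = len(rhs)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (rhs[i] - sum(R[i][j] * x[j] for j in range(i + 1, n))) / R[i][i]
    return x


# Columns of A for fitting b ~ c + d*t at t = 0, 1, 2.
x_hat = lstsq_qr([[1.0, 1.0, 1.0], [0.0, 1.0, 2.0]], [6.0, 0.0, 0.0])
```

For this data the solution is approximately (5, −3), the same line 5 − 3t that the normal equations give, obtained without forming the product of A with its transpose.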

LU

(4 formulas)

LU Decomposition

A = LU

What LU Decomposition Is

Factors a square matrix

A

into a unit lower triangular

L

(ones on the diagonal, multipliers below) and an upper triangular

U

(the row echelon form).

U

records the result of Gaussian elimination;

L

records the multipliers used to produce it. The factorization captures the elimination process in reusable form: once computed, any system

A\mathbf{x} = \mathbf{b}

reduces to two triangular solves.

PA LU Partial Pivoting

PA = LU

Partial Pivoting: PA = LU

When zero or near-zero pivots appear during elimination, row swaps are needed. Partial pivoting selects the largest absolute value in the current pivot column as the pivot at each step. The permutation matrix

P

records all swaps. This factorization exists for every invertible matrix and is the numerically stable default in software.

Determinant via LU

\det(A) = (-1)^s \prod_{i=1}^{n} u_{ii}

LU and the Determinant

The determinant is a free byproduct of LU factorization: multiply the diagonal entries of

U

(the pivots) and account for the sign of the row permutation. Since

\det(L) = 1

(unit diagonal) and

\det(U) = \prod u_{ii}

(triangular),

\det(PA) = \det(L)\det(U) = \prod u_{ii}

, so

\det(A) = (-1)^s \prod u_{ii}

where

s

counts row swaps.

LU Solve Steps

A\mathbf{x} = \mathbf{b} \;\Longleftrightarrow\; \begin{cases} L\mathbf{y} = P\mathbf{b} & (\text{forward sub}) \\ U\mathbf{x} = \mathbf{y} & (\text{back sub}) \end{cases}

Solving Systems with LU

Given

PA = LU

, solving

A\mathbf{x} = \mathbf{b}

reduces to two triangular solves: forward substitution down through

L

, then back substitution up through

U

. Each solve costs

O(n^2)

. Factor once at

\frac{2}{3}n^3

, then amortize:

k

systems with the same coefficient matrix cost

\frac{2}{3}n^3 + 2kn^2
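The factor-once, solve-many pattern can be sketched in plain Python as follows. This is an illustrative implementation assuming an invertible input (there is no guard against a zero pivot); plu and lu_solve are hypothetical names, and the permutation is stored as a list of row indices rather than as a matrix.

```python
def plu(A):
    """Partial-pivoting LU: returns (perm, L, U, swaps) with P A = L U.

    perm[i] is the index of the row of A that ends up in position i.
    """
    n = len(A)
    U = [[float(x) for x in row] for row in A]
    L = [[0.0] * n for _ in range(n)]
    perm, swaps = list(range(n)), 0
    for k in range(n):
        # Choose the largest entry in column k, on or below the diagonal.
        pivot = max(range(k, n), key=lambda r: abs(U[r][k]))
        if pivot != k:
            U[k], U[pivot] = U[pivot], U[k]
            L[k], L[pivot] = L[pivot], L[k]    # carry earlier multipliers along
            perm[k], perm[pivot] = perm[pivot], perm[k]
            swaps += 1
        for r in range(k + 1, n):
            m = U[r][k] / U[k][k]
            L[r][k] = m
            for c in range(k, n):
                U[r][c] -= m * U[k][c]
    for i in range(n):
        L[i][i] = 1.0
    return perm, L, U, swaps


def lu_solve(perm, L, U, b):
    """Forward substitution on L y = P b, then back substitution on U x = y."""
    n = len(b)
    pb = [b[i] for i in perm]
    y = [0.0] * n
    for i in range(n):
        y[i] = pb[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x


perm, L, U, swaps = plu([[0, 2], [1, 1]])    # zero pivot forces a row swap
x = lu_solve(perm, L, U, [4.0, 3.0])

# Determinant as a byproduct: sign of the permutation times the pivots.
det = (-1.0) ** swaps
for i in range(len(U)):
    det *= U[i][i]
```

Once plu has run, each additional right-hand side only needs another call to lu_solve.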

Cholesky

(3 formulas)

Cholesky Decomposition

A = L L^T

What Cholesky Decomposition Is

Factors a symmetric positive definite matrix

A

into a lower triangular

L

with strictly positive diagonal entries, times its own transpose.

L

is the unique Cholesky factor — the matrix "square root" of

A

in the sense

A = LL^T

. Exploits symmetry to halve the cost of LU and requires no pivoting since positive definiteness guarantees positive pivots throughout.

Cholesky Diagonal Formula

l_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1} l_{jk}^2}

The Algorithm

The diagonal entries of the Cholesky factor are computed left-to-right. The

j

-th diagonal involves subtracting the squared entries already placed in row

j

from the diagonal entry of

A

, then taking the positive square root. Positive definiteness guarantees the argument under the root is strictly positive at every step.

Cholesky Off-Diagonal Formula

l_{ij} = \frac{1}{l_{jj}}\left( a_{ij} - \sum_{k=1}^{j-1} l_{ik} l_{jk} \right), \qquad i > j

The Algorithm

After computing the diagonal

l_{jj}

, the entries below it in column

j

are computed by subtracting cross-terms involving previously computed factors and dividing by

l_{jj}

. This fills column

j

before moving to column

j+1

. Together with the diagonal formula, this completes the Cholesky algorithm.
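The diagonal and off-diagonal formulas combine into a short column-by-column loop. The Python sketch below is an illustration that assumes a symmetric positive definite input and does not check symmetry; the function name cholesky is a local helper, not a library call.

```python
import math


def cholesky(A):
    """Return the lower triangular L with A = L L^T, filled column by column.

    Assumes A is symmetric positive definite. If it is not, math.sqrt
    raises ValueError on a negative argument or a zero pivot causes
    ZeroDivisionError.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal formula: subtract the squares already placed in row j.
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            # Off-diagonal formula: subtract cross-terms, divide by the pivot.
            cross = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (A[i][j] - cross) / L[j][j]
    return L


A = [[4.0, 12.0, -16.0], [12.0, 37.0, -43.0], [-16.0, -43.0, 98.0]]
L = cholesky(A)
```

For this matrix the factor has rows (2, 0, 0), (6, 1, 0), and (−8, 5, 3).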

QR

(3 formulas)

QR Decomposition

A = Q R

What QR Decomposition Is

Factors an

m \times n

matrix

A

(with

m \geq n

and linearly independent columns) into

Q

with orthonormal columns and

R

upper triangular with positive diagonal. The columns of

Q

are an orthonormal basis for

\text{Col}(A)

;

R

records the coefficients expressing each column of

A

in that basis. Produced by Gram-Schmidt, Householder reflections, or Givens rotations.

QR Gram-Schmidt R Entries

R_{ij} = \mathbf{q}_i \cdot \mathbf{a}_j \;\; (i \leq j), \qquad R_{ij} = 0 \;\; (i > j), \qquad R_{jj} = \|\mathbf{u}_j\|

QR via Gram-Schmidt

When Gram-Schmidt is applied to the columns of

A

, the entries of

R

are the dot products computed along the way. The upper triangular structure reflects the sequential nature of orthogonalization:

\mathbf{a}_j

's projection onto

\mathbf{q}_i

is zero for

i > j

because

\mathbf{a}_j

lies in the span of

\mathbf{q}_1, \ldots, \mathbf{q}_j

, to which every later

\mathbf{q}_i

is orthogonal. The diagonal entry

R_{jj} = \|\mathbf{u}_j\|

is the norm of the unnormalized Gram-Schmidt vector, strictly positive for linearly independent columns; requiring a positive diagonal makes

R

unique.
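The construction of R during Gram-Schmidt can be sketched as follows. This is a minimal classical Gram-Schmidt illustration using NumPy, assuming linearly independent columns; the function name `gram_schmidt_qr` and the example matrix are hypothetical. In floating point, Householder-based routines such as `np.linalg.qr` are generally preferred for stability.

```python
import numpy as np

def gram_schmidt_qr(A):
    """Classical Gram-Schmidt QR; assumes A has linearly independent columns."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        u = A[:, j].copy()
        for i in range(j):
            # R_ij = q_i . a_j for i < j
            R[i, j] = Q[:, i] @ A[:, j]
            u -= R[i, j] * Q[:, i]
        # R_jj = ||u_j||, the norm of the unnormalized vector
        R[j, j] = np.linalg.norm(u)
        Q[:, j] = u / R[j, j]
    return Q, R

# Illustrative 3x2 example with independent columns.
A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q, R = gram_schmidt_qr(A)
```

Entries of R below the diagonal are never written, so they stay at zero, which mirrors the upper triangular structure described above.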

QR Algorithm for Eigenvalues

A_k = Q_k R_k, \qquad A_{k+1} = R_k Q_k

The QR Algorithm for Eigenvalues

The standard algorithm for computing eigenvalues of general matrices. Starting from

A_0 = A

, each iteration factors

A_k = Q_k R_k

and forms

A_{k+1} = R_k Q_k

. Since

A_{k+1} = Q_k^T A_k Q_k

, each step is a similarity transformation preserving eigenvalues. Under mild conditions,

A_k

converges to an upper triangular matrix with eigenvalues on the diagonal — without ever forming the characteristic polynomial.
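The iteration can be sketched in a few lines. This is the plain unshifted version, shown only to illustrate the recurrence; production eigenvalue solvers add a Hessenberg reduction and shifts. The function name `qr_algorithm`, the iteration count, and the symmetric example matrix are assumptions made for illustration.

```python
import numpy as np

def qr_algorithm(A, iterations=200):
    """Unshifted QR iteration: factor A_k = Q_k R_k, then form A_{k+1} = R_k Q_k."""
    Ak = np.array(A, dtype=float)
    for _ in range(iterations):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q  # equals Q^T A_k Q, a similarity transformation
    return Ak

# Illustrative symmetric matrix with eigenvalues (5 +/- sqrt(5)) / 2.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
Ak = qr_algorithm(A)
eigs = np.sort(np.diag(Ak))
```

For this example the two eigenvalues have distinct absolute values, which is the kind of condition under which the unshifted iteration converges.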

SVD

(10 formulas)

SVD

A = U \Sigma V^T

What the SVD Is

The most general matrix factorization. Every

m \times n

matrix — any shape, any rank — factors as the product of an

m \times m

orthogonal

U

, an

m \times n

diagonal

\Sigma

with non-negative entries, and an

n \times n

orthogonal

V

(transposed). Geometrically: every linear transformation is a rotation or reflection, followed by axis-aligned scaling, followed by another rotation or reflection.

Singular Values

\sigma_i = \sqrt{\lambda_i(A^TA)} = \sqrt{\lambda_i(AA^T)}

Singular Values

The singular values of

A

are the square roots of the eigenvalues of

A^TA

(equivalently

AA^T

, which has the same nonzero eigenvalues). Since

A^TA

is symmetric positive semi-definite, its eigenvalues are non-negative, making the singular values real and non-negative. They measure the stretching factors of the linear transformation along its principal axes:

\sigma_1 = \max_{\|\mathbf{x}\|=1}\|A\mathbf{x}\|

is the maximum stretching.
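The relation between singular values and the eigenvalues of A^T A can be checked numerically. A small sketch with NumPy, using an illustrative 2x2 matrix; forming A^T A explicitly is done here only for demonstration, since it squares the condition number and is usually avoided in practice.

```python
import numpy as np

# Illustrative matrix; A^T A = [[25, 20], [20, 25]] has eigenvalues 45 and 5.
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
eigvals = np.linalg.eigvalsh(A.T @ A)
sigma_from_eig = np.sqrt(eigvals[::-1])

# Singular values computed directly, in descending order.
sigma = np.linalg.svd(A, compute_uv=False)
```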

SVD Rank

\text{rank}(A) = \#\{i : \sigma_i > 0\}

Singular Values

The rank of

A

equals the number of nonzero singular values. This is generally the most numerically reliable way to determine rank: row reduction can miscount rank when small numerical errors push true zeros to small nonzeros, but SVD with a tolerance gives a robust effective rank. Standard practice: count singular values above a tolerance

\epsilon \sigma_1

, where

\epsilon

is a small relative threshold.
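A tolerance-based rank count might look like the sketch below. The function name `numerical_rank` is hypothetical, and the default threshold (largest dimension times machine epsilon) is one common choice rather than the only one; it matches the default used by `np.linalg.matrix_rank`.

```python
import numpy as np

def numerical_rank(A, eps=None):
    """Count singular values above eps * sigma_1."""
    A = np.asarray(A, dtype=float)
    s = np.linalg.svd(A, compute_uv=False)
    if s.size == 0 or s[0] == 0.0:
        return 0
    if eps is None:
        # Assumed default: largest dimension times machine epsilon.
        eps = max(A.shape) * np.finfo(float).eps
    return int(np.sum(s > eps * s[0]))

# Illustrative matrix: the second row is twice the first, so the rank is 2.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
rank = numerical_rank(A)
```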

SVD Outer Product Form

A = \sum_{i=1}^{r} \sigma_i \, \mathbf{u}_i \mathbf{v}_i^T

The Outer Product Form

The SVD expands as a sum of

r = \text{rank}(A)

rank-one matrices, each weighted by a singular value. Terms are naturally ordered by importance — the largest

\sigma_i

contributes most. Truncating at

k

terms gives the best rank-

k

approximation (Eckart-Young). This form underlies image compression, noise reduction, latent semantic analysis, and most matrix approximation methods.

Moore-Penrose Pseudoinverse

A^+ = V \Sigma^+ U^T

The Pseudoinverse

The Moore-Penrose pseudoinverse generalizes matrix inversion to any matrix, including rectangular and rank-deficient ones.

\Sigma^+

is formed by reciprocating each nonzero singular value and transposing the shape. The pseudoinverse satisfies four defining (Penrose) conditions:

AA^+A = A

,

A^+AA^+ = A^+

,

(AA^+)^T = AA^+

, and

(A^+A)^T = A^+A

. These uniquely determine

A^+

.
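The construction of the pseudoinverse from the SVD, and the four Penrose conditions, can be verified on a rank-deficient example. A minimal sketch with NumPy; the function name `pinv_via_svd` and the relative cutoff `rtol` for deciding which singular values count as nonzero are assumptions for illustration.

```python
import numpy as np

def pinv_via_svd(A, rtol=1e-12):
    """Build A^+ = V Sigma^+ U^T from the reduced SVD."""
    A = np.asarray(A, dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_plus = np.zeros_like(s)
    # Reciprocate only the singular values treated as nonzero.
    keep = s > rtol * s.max()
    s_plus[keep] = 1.0 / s[keep]
    return Vt.T @ np.diag(s_plus) @ U.T

# Illustrative rank-1, 3x2 matrix: no ordinary inverse exists.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
A_plus = pinv_via_svd(A)

penrose = [
    np.allclose(A @ A_plus @ A, A),
    np.allclose(A_plus @ A @ A_plus, A_plus),
    np.allclose((A @ A_plus).T, A @ A_plus),
    np.allclose((A_plus @ A).T, A_plus @ A),
]
```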

Eckart-Young Low-Rank Approximation

A_k = \sum_{i=1}^{k} \sigma_i \mathbf{u}_i \mathbf{v}_i^T, \qquad \|A - A_k\|_2 = \sigma_{k+1}, \quad \|A - A_k\|_F = \sqrt{\sum_{i=k+1}^{r}\sigma_i^2}

Low-Rank Approximation

The Eckart-Young-Mirsky theorem: among all matrices of rank at most

k

, the truncated SVD

A_k

is closest to

A

in both the operator norm and the Frobenius norm. The approximation error equals the first discarded singular value (operator norm) or the root-sum-of-squares of discarded singular values (Frobenius norm). This is the mathematical foundation of dimensionality reduction.
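Both error formulas can be checked numerically by truncating an SVD. A short sketch with NumPy; the function name `truncated_svd`, the random test matrix, and the choice k = 2 are illustrative assumptions.

```python
import numpy as np

def truncated_svd(A, k):
    """Return the rank-k truncation A_k and the full list of singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first k columns of U by sigma_1..sigma_k, then multiply by V_k^T.
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return A_k, s

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))  # illustrative random matrix
k = 2
A_k, s = truncated_svd(A, k)

err_2 = np.linalg.norm(A - A_k, 2)      # operator-norm error
err_F = np.linalg.norm(A - A_k, "fro")  # Frobenius-norm error
```

With zero-based indexing, `s[k]` is the first discarded singular value, so `err_2` should agree with `s[k]` and `err_F` with the root-sum-of-squares of `s[k:]`.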

Condition Number

\kappa(A) = \frac{\sigma_1}{\sigma_r}

SVD and Norms

The condition number measures how sensitive the linear system

A\mathbf{x} = \mathbf{b}

is to perturbations in

\mathbf{b}

. A matrix with

\kappa(A) = 10^k

can lose roughly

k

digits of accuracy in floating-point arithmetic.

\kappa = 1

characterizes scalar multiples of orthogonal matrices (perfectly conditioned);

\kappa = \infty

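is the convention for a singular square matrix, whose smallest singular value is zero. For rank-deficient or rectangular matrices, the ratio of largest to smallest nonzero singular value is used instead, and it quantifies the geometric distortion of the transformation.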
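The ratio can be computed directly from the singular values. A small sketch with NumPy on two illustrative matrices: a nearly singular one, and a rotation scaled by 3 to show the perfectly conditioned case. The variable names are arbitrary.

```python
import numpy as np

# Nearly singular 2x2 matrix: the two rows are almost parallel.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
s = np.linalg.svd(A, compute_uv=False)
kappa = s[0] / s[-1]  # largest over smallest singular value

# A scaled rotation has all singular values equal, so its condition number is 1.
theta = 0.3
Q = 3.0 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
s_Q = np.linalg.svd(Q, compute_uv=False)
kappa_Q = s_Q[0] / s_Q[-1]
```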

Operator Norm

\|A\|_2 = \sigma_1 = \max_{\|\mathbf{x}\|=1} \|A\mathbf{x}\|

SVD and Norms

The operator (spectral,

\ell_2

) norm of a matrix is its largest singular value. Geometrically, it is the maximum stretching factor: the largest length

A

can produce from a unit input. Equivalently, it is the square root of the largest eigenvalue of

A^TA

.

Frobenius Norm via Singular Values

\|A\|_F = \sqrt{\sum_{i=1}^{r} \sigma_i^2}

SVD and Norms

The Frobenius norm — equal to

\sqrt{\sum_{i,j}|a_{ij}|^2} = \sqrt{\text{tr}(A^TA)}

— has a clean SVD characterization as the root-sum-of-squares of singular values. This connects the entrywise "total energy" of a matrix to its spectral content.
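The SVD characterizations of the operator and Frobenius norms can be compared against the entrywise and trace definitions. A brief sketch with NumPy on an illustrative 3x2 matrix whose squared entries sum to 91.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
s = np.linalg.svd(A, compute_uv=False)

op_norm = s[0]                          # largest singular value
fro_from_svd = np.sqrt(np.sum(s ** 2))  # root-sum-of-squares of singular values
fro_entrywise = np.sqrt(np.sum(A ** 2)) # sqrt(1 + 4 + 9 + 16 + 25 + 36)
fro_trace = np.sqrt(np.trace(A.T @ A))
```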

SVD Four Fundamental Subspaces

\begin{aligned} \text{Col}(A) &= \text{Span}\{\mathbf{u}_1, \ldots, \mathbf{u}_r\} \\ \text{Null}(A^T) &= \text{Span}\{\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\} \\ \text{Row}(A) &= \text{Span}\{\mathbf{v}_1, \ldots, \mathbf{v}_r\} \\ \text{Null}(A) &= \text{Span}\{\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\} \end{aligned}

SVD and the Four Fundamental Subspaces

The SVD simultaneously delivers orthonormal bases for all four fundamental subspaces of any matrix. The first

r

left singular vectors (columns of

U

) span the column space; the remaining

m-r

span its orthogonal complement (left null space). The first

r

right singular vectors (columns of

V

) span the row space; the remaining

n-r

span the null space. No other factorization provides all four bases at once, and all four are guaranteed orthonormal.
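Reading the four bases off a full SVD can be sketched as below. The function name `fundamental_subspaces`, the dictionary keys, and the relative cutoff `rtol` used to decide the rank are assumptions for illustration.

```python
import numpy as np

def fundamental_subspaces(A, rtol=1e-12):
    """Split the columns of U and V at the numerical rank r."""
    A = np.asarray(A, dtype=float)
    U, s, Vt = np.linalg.svd(A)  # full_matrices=True: U is m x m, Vt is n x n
    r = int(np.sum(s > rtol * s[0])) if s.size and s[0] > 0 else 0
    V = Vt.T
    return {
        "col": U[:, :r],        # basis of Col(A)
        "left_null": U[:, r:],  # basis of Null(A^T)
        "row": V[:, :r],        # basis of Row(A)
        "null": V[:, r:],       # basis of Null(A)
    }

# Illustrative 2x3 matrix of rank 1.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
bases = fundamental_subspaces(A)
```

For this example the dimensions are 1, 1, 1, and 2 respectively, matching r, m - r, r, and n - r with m = 2, n = 3, r = 1.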

Cross-Decomposition

(1 formula)

Quadratic Form Diagonalization

\mathbf{x}^T A \mathbf{x} = \mathbf{y}^T D \mathbf{y} = \sum_{i=1}^{n} \lambda_i y_i^2, \qquad \mathbf{x} = Q\mathbf{y}

Quadratic Forms

For symmetric

A

with spectral decomposition

A = QDQ^T

, the change of variables

\mathbf{x} = Q\mathbf{y}

diagonalizes the quadratic form

\mathbf{x}^TA\mathbf{x}

into a sum of independent squared terms weighted by eigenvalues. The eigenvectors define the principal axes of the quadratic surface; eigenvalue signs classify the form: all