Not all measures of central tendency are based on averaging. Some describe position rather than magnitude.
This page introduces the median as the value that divides a distribution into two equally weighted halves. It focuses on how the median is defined through ordering and cumulative probability, and why it remains stable even when extreme values distort other measures of center.
Definition and Concept
The median is the value that splits a probability distribution exactly in half. It's the point where half the probability lies below and half lies above.
What is the Median?
For any probability distribution, the median is the 50th percentile—the value m that divides the total probability into two equal parts:
• Discrete distributions: The value m where cumulative probability reaches or exceeds 0.5 from both directions
• Continuous distributions: The unique value m where P(X≤m)=0.5
If you draw values from the distribution repeatedly, about half will fall below the median and about half above it. In the discrete case a noticeable share of draws can also land exactly on the median.
Median as a Measure of Central Tendency
The median is one of three primary measures describing where a distribution centers, alongside the mean and mode.
Unlike the mean, which balances all values through weighted averaging, the median simply identifies the middle position. It cares only about ranking values, not their magnitudes—making it resistant to extreme observations that would distort the mean.
Why the Median Matters
The median provides crucial insight into distribution structure:
• Where is the center? The median marks the probability midpoint
• How symmetric is the distribution? Comparing median to mean reveals skewness
• Are there outliers? The median stays stable when extreme values appear
For income data, housing prices, or any measurement prone to outliers, the median often represents the "typical" value better than the mean does.
Median vs Other Measures
The median behaves distinctly from mean and mode:
• Robustness: Making tail values more extreme has zero influence, as long as they stay on the same side of the median
• Unique for continuous distributions with strictly increasing CDFs: Unlike the mode, which can be multiple or absent
• May differ from mean: Skewed distributions separate median and mean substantially
• Interpretability: Represents an achievable middle value, not an abstract balance point
The median complements mean and mode by revealing the distribution's center through probability division rather than probability weighting or probability concentration.
Median for Discrete Distributions
For discrete distributions, the median is the smallest value where cumulative probability reaches or crosses the halfway point.
Definition
The median of a discrete random variable X is the smallest value m satisfying both conditions:
P(X≤m) ≥ 0.5 and P(X≥m) ≥ 0.5
This ensures that at least half the probability lies below (or at) m, and at least half lies above (or at) m.
Why Two Conditions?
The discrete nature creates jumps in the cumulative distribution function. A single value might satisfy one condition but not the other.
The two-condition definition guarantees the median genuinely divides probability, even when the CDF jumps past 0.5 without landing exactly on it.
How to Find the Median
Unlike expected value, there's no universal formula. Find the median through cumulative probability:
1. Calculate the CDF: F(k)=P(X≤k) for each value in the support
2. Identify where F(k) first reaches or exceeds 0.5
3. Verify both conditions hold at that value
4. If multiple consecutive values satisfy both conditions, any of them qualifies as a median
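The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `discrete_medians` and the dictionary input format are assumptions made for the example, and the probabilities are assumed to sum to 1. Exact fractions are used so that a CDF value of exactly 0.5 is detected reliably.

```python
from fractions import Fraction

def discrete_medians(pmf):
    """Return every support value m with P(X <= m) >= 1/2 and P(X >= m) >= 1/2.

    `pmf` maps each support value to its probability (hypothetical input format).
    """
    half = Fraction(1, 2)
    medians = []
    below = Fraction(0)              # running P(X < k)
    for k in sorted(pmf):
        cdf = below + Fraction(pmf[k])   # P(X <= k)
        upper = 1 - below                # P(X >= k)
        if cdf >= half and upper >= half:
            medians.append(k)
        below = cdf
    return medians

# A fair four-sided die: the CDF hits exactly 0.5 at k = 2, so two values qualify.
die = {k: Fraction(1, 4) for k in (1, 2, 3, 4)}
print(discrete_medians(die))         # [2, 3]

# A lopsided PMF: the CDF jumps past 0.5 at k = 1, so the median is unique.
skewed = {0: Fraction(1, 10), 1: Fraction(6, 10), 2: Fraction(3, 10)}
print(discrete_medians(skewed))      # [1]
```

Taking the first element of the returned list gives the "smallest value" convention used in the definition.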
Median (Poisson distribution with rate λ): Approximation: median ≈ λ + 1/3 − 0.02/λ for large λ
For small λ, numerical evaluation of the CDF is necessary.
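Assuming λ here is the rate of a Poisson distribution, the numerical evaluation can be sketched as follows. The function names are hypothetical. The exact median is found by summing the PMF until the CDF reaches 0.5, and the result is printed next to the large-λ approximation λ + 1/3 − 0.02/λ for comparison.

```python
import math

def poisson_median(lam):
    """Smallest k with P(X <= k) >= 0.5 for X ~ Poisson(lam), by summing the PMF."""
    k = 0
    term = math.exp(-lam)    # P(X = 0)
    cdf = term
    while cdf < 0.5:
        k += 1
        term *= lam / k      # P(X = k) computed from P(X = k - 1)
        cdf += term
    return k

def poisson_median_approx(lam):
    """Large-lambda approximation; returns a real number, not an integer."""
    return lam + 1 / 3 - 0.02 / lam

for lam in (0.5, 3.0, 10.0):
    print(lam, poisson_median(lam), round(poisson_median_approx(lam), 3))
```

The approximation is a continuous value while the exact median is an integer, so the two agree only roughly, and the agreement is poor for small λ.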
Non-Uniqueness
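Discrete distributions can have multiple medians when the CDF lands on exactly 0.5 at some support value. That value and the next support value then both satisfy the two median conditions. If the CDF instead jumps from below 0.5 to above 0.5 in a single step, the median is unique.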
Example: Discrete Uniform on {1,2,3,4} has CDF values 0.25,0.5,0.75,1.0. Both 2 and 3 satisfy the median conditions. Either value is a valid median.
Visual Identification
On a cumulative probability plot, the median is where the step function first touches or crosses the horizontal line at 0.5. The discrete jumps make this crossing potentially ambiguous.
Key Properties
• The median always lies within the support
• Finding it requires evaluating the CDF, not the PMF directly
• Multiple values may qualify as medians in certain cases
• Unlike the mode, the median depends on cumulative structure, not individual probabilities
• Changing tail probabilities can shift the median, even if the most probable value stays fixed
Cauchy Distribution with location parameter x0
The median equals the location parameter x0. The mean does not exist for this distribution.
Lognormal Distribution with parameters μ,σ
Median: m = e^μ
The median of the lognormal equals the exponential of the underlying normal's mean.
Triangular Distribution on [a,b] with mode c
Median depends on the position of mode c:
If c ≥ (a+b)/2: median = a + √((b−a)(c−a)/2)
If c < (a+b)/2: median = b − √((b−a)(b−c)/2)
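The two-branch triangular formula, median = a + √((b−a)(c−a)/2) when c ≥ (a+b)/2 and median = b − √((b−a)(b−c)/2) otherwise, can be checked against the triangular CDF. The sketch below is illustrative only; the function names are assumptions, and it requires a < b with a ≤ c ≤ b.

```python
import math

def triangular_median(a, b, c):
    """Median of the triangular distribution on [a, b] with mode c."""
    if c >= (a + b) / 2:
        return a + math.sqrt((b - a) * (c - a) / 2)
    return b - math.sqrt((b - a) * (b - c) / 2)

def triangular_cdf(x, a, b, c):
    """CDF of the same distribution, used only to verify the median."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        return (x - a) ** 2 / ((b - a) * (c - a))
    return 1 - (b - x) ** 2 / ((b - a) * (b - c))

for a, b, c in [(0, 2, 1), (0, 10, 2), (0, 10, 8)]:
    m = triangular_median(a, b, c)
    print((a, b, c), m, triangular_cdf(m, a, b, c))   # CDF at the median is 0.5
```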
Uniqueness
Continuous distributions with strictly increasing CDFs have exactly one median. The smoothness of the CDF eliminates the ambiguity present in discrete cases.
If the CDF has flat regions (constant over an interval), any value in that interval satisfies the median definition, creating non-uniqueness.
Relationship to Density
The median need not coincide with the peak of the PDF. Skewed distributions separate the median from the mode.
The median divides probability mass equally, while the mode identifies maximum density. These are distinct concepts that coincide in symmetric unimodal distributions and generally differ otherwise.
Visual Identification
On a density curve, the median is the vertical line that splits the area under the curve into two equal parts. For symmetric unimodal distributions, this line passes through the peak. For skewed distributions, the median typically sits between the mode and mean.
Key Properties
• Continuous distributions typically have a unique median
• The median can be found by solving F(m)=0.5 or using F⁻¹(0.5)
• Unlike discrete cases, ties are impossible when the CDF is strictly increasing
• The median is always a finite value inside the support, even when the support is unbounded
• For bounded support, the median may approach but typically doesn't reach the boundaries
Discrete vs Continuous Median
The median behaves differently for discrete and continuous distributions due to fundamental differences in how probability distributes.
Discrete distributions: Multiple values may satisfy the median conditions when the CDF equals exactly 0.5 at some support value.
Example: Discrete uniform on {1,2,3,4} has both 2 and 3 as valid medians.
Definition Complexity
Continuous case: Simple condition F(m)=0.5
Discrete case: Requires both P(X≤m)≥0.5 and P(X≥m)≥0.5 to handle probability jumps properly
The two-condition requirement for discrete distributions ensures the median genuinely divides probability despite the CDF's discontinuities.
Computational Methods
Continuous: Solve F(m)=0.5 analytically or use inverse CDF F⁻¹(0.5)
Discrete: Evaluate the CDF at each support value, find where cumulative probability first reaches 0.5, verify both conditions
Continuous distributions often have closed-form median expressions. Discrete distributions typically require numerical evaluation or lookup tables.
Median as an Interval
Discrete: When multiple values satisfy the conditions, the median is technically an interval rather than a single point
Continuous: Always a single point (except for pathological cases with flat CDF regions)
In practice, discrete medians are reported as the smallest qualifying value, even when an interval exists.
Visual Differences
Continuous: The PDF curve shows the median as the vertical line splitting area equally
Discrete: The PMF bar chart requires examining cumulative probabilities to locate the median
Identifying the median visually is more intuitive for continuous distributions than discrete ones.
Impact on Calculations
Continuous: Integration yields exact median through ∫_{−∞}^{m} f(x) dx = 0.5
Discrete: Summation requires careful handling of boundary cases: ∑_{k≤m} P(X=k) ≥ 0.5
The discrete case demands attention to whether inequalities are strict or non-strict.
Relationship to Other Measures
Both types: Median can differ from mean and mode in skewed distributions
Continuous: The separation is smooth and predictable based on distribution shape
Discrete: The separation can be irregular due to probability concentration at specific points
Why Continuous is Simpler
The smoothness of continuous CDFs eliminates ambiguity. Every strictly increasing CDF crosses 0.5 at exactly one location, making median identification straightforward.
Discrete distributions introduce complexity through jumps, ties, and the possibility of no value landing precisely at the 50% cumulative probability mark.
Median, Mean, and Mode Compared
Three measures describe where distributions center: median, mean, and mode. Each reveals different structural features.
Quick Definitions
Median: The 50th percentile that divides total probability equally
Mean: Weighted balance point calculated from all values
Mode: Peak location—where probability or density reaches maximum
Symmetric Distributions
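When a unimodal distribution mirrors itself around a center point, all three measures collapse to the same value, provided the mean exists.
In left-skewed distributions the ordering seen in right-skewed ones (mode, then median, then mean) typically reverses: the mean is pulled left by the tail while the mode stays anchored at the right peak.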
Comparison Table
Comparing Median, Mean, and Mode

| Feature | Median | Mean | Mode |
| --- | --- | --- | --- |
| Definition | Value at 50th percentile | Probability-weighted average | Value with highest probability |
| Calculation | Find cumulative 0.5 point | Weight all values | Find maximum |
| Uniqueness | Single value (continuous) | Single value | Can have multiple |
| Outlier impact | None | Strong | None |
| Data type | Numerical or ordinal | Numerical only | Any (including categorical) |
| Interpretation | Middle value | Average outcome | Most likely value |
| Skewness indication | Between mode and mean | Pulled by tail | At peak |
Robustness Differences
Mean: Vulnerable. One extreme observation can shift it substantially.
Median: Resistant. Values beyond the 50% threshold have no influence.
Mode: Immune. Tail behavior irrelevant unless it creates a new peak.
Income data illustrates this: billionaires inflate the mean drastically while leaving median and mode nearly unchanged.
Selection Criteria
Choose median for:
• Skewed distributions
• Data contaminated by outliers
• Representing a "central" value that's actually achievable
• Reporting typical values when extremes distort the mean
Choose mean for:
• Symmetric data without extreme values
• Leveraging mathematical properties (additivity, scaling)
• Incorporating all observations equally
Choose mode for:
• Categorical outcomes (colors, brands, types)
• Identifying the most frequent occurrence
• Detecting multiple concentration points
Spatial Relationships
Symmetric case: All three occupy the same point at distribution center.
Right-skewed case: Mode sits at the left peak, median slightly right, mean furthest right chasing the tail.
Left-skewed case: Reversed ordering with mean leftmost, mode rightmost.
In unimodal skewed distributions the median usually falls between the mode and mean, acting as a compromise measure that balances probability division against probability concentration. This ordering is a common pattern rather than a guarantee.
Properties of the Median
The median exhibits specific mathematical behaviors that distinguish it from other central tendency measures.
Minimizes Absolute Deviation
The median minimizes the expected value of absolute deviation:
median(X) = argmin_m E[|X − m|]
Among all possible values m, the median produces the smallest average absolute distance from X. This contrasts with the mean μ, which minimizes the expected squared deviation E[(X − c)²] over all constants c.
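A small numerical check makes the contrast concrete. The sketch below uses a made-up five-point sample and a brute-force grid search (both are illustrative assumptions, not an efficient method) to locate the value that minimizes the average absolute deviation and the value that minimizes the average squared deviation.

```python
import statistics

def mean_abs_dev(data, m):
    """Average absolute distance of the data from m."""
    return sum(abs(x - m) for x in data) / len(data)

def mean_sq_dev(data, m):
    """Average squared distance of the data from m."""
    return sum((x - m) ** 2 for x in data) / len(data)

data = [1, 2, 3, 4, 100]                        # hypothetical sample with one outlier
candidates = [i / 10 for i in range(0, 1001)]   # grid 0.0, 0.1, ..., 100.0

best_abs = min(candidates, key=lambda m: mean_abs_dev(data, m))
best_sq = min(candidates, key=lambda m: mean_sq_dev(data, m))

print(best_abs, statistics.median(data))   # absolute deviation is minimized at the median
print(best_sq, statistics.mean(data))      # squared deviation is minimized at the mean
```

The outlier at 100 drags the squared-deviation minimizer far to the right, while the absolute-deviation minimizer stays at the middle observation.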
Transformation Under Linear Operations
For a random variable X with median m, consider the transformation Y=aX+b where a≠0.
The median of Y is:
median(Y)=a⋅median(X)+b
Linear transformations shift and scale the median predictably. Multiply by a, add b—the middle point moves accordingly.
Example: If X has median 4, then Y=3X−2 has median 3(4)−2=10.
Invariance Under Monotonic Transformations
For strictly monotonic functions g (always increasing or always decreasing):
median(g(X))=g(median(X))
If g is strictly increasing, it preserves the ordering of values, so the 50th percentile stays at the same relative position.
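Example: If X has median 5, then Y=X² has median 25 (assuming X>0).
This property holds for the median, and for the mode of a discrete distribution, but generally fails for the mean: E[g(X)] differs from g(E[X]) when g is nonlinear. It can also fail for the mode of a continuous distribution, because a nonlinear g reshapes the density.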
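The transformation rules can be illustrated on a small sample. The data below are invented for the example, and an odd sample size is used on purpose: with an even sample size the usual "average the two middle values" convention can break exact equality for nonlinear g.

```python
import statistics

data = [1, 2, 5, 7, 9]                  # hypothetical positive sample, median 5

linear = [3 * x - 2 for x in data]      # Y = 3X - 2
squared = [x ** 2 for x in data]        # Y = X^2, increasing because X > 0
negated = [-x for x in data]            # Y = -X, strictly decreasing

m = statistics.median(data)
print(statistics.median(linear), 3 * m - 2)    # both 13
print(statistics.median(squared), m ** 2)      # both 25
print(statistics.median(negated), -m)          # both -5

# The mean does not commute with the nonlinear map.
print(statistics.mean(squared), statistics.mean(data) ** 2)
```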
Robustness to Outliers
The median completely ignores values beyond the 50th percentile threshold. Extreme observations in either tail have zero impact on the median as long as they don't cross the middle rank.
Change every value above the median by any amount—the median stays fixed as long as the cumulative probability structure at the 50th percentile remains unchanged.
This makes the median ideal for data with contamination or measurement errors in the tails.
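As a quick illustration with invented numbers, the sketch below replaces the largest value of a sample with a wildly larger one. The middle rank is untouched, so the median stays put while the mean moves by orders of magnitude.

```python
import statistics

incomes = [28, 35, 41, 52, 60]            # hypothetical values, in thousands
contaminated = [28, 35, 41, 52, 60_000]   # same sample with one corrupted entry

print(statistics.median(incomes), statistics.median(contaminated))
print(statistics.mean(incomes), statistics.mean(contaminated))
```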
Uniqueness for Continuous Distributions
Continuous distributions with strictly increasing CDFs have exactly one median. The smoothness eliminates ambiguity.
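Example (the values below match a Binomial distribution with n=10 and p=0.5): Compute CDF values for k=0,1,2,…,10. F(4)=0.377 is still below 0.5, and the CDF first reaches 0.5 at F(5)=0.623≥0.5. The second condition also holds, since P(X≥5)=0.623≥0.5.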
Median = 5.
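The quoted values (support 0 to 10, F(5)=0.623) match a Binomial distribution with n=10 and p=0.5. Assuming that is the intended example, the calculation can be reproduced with a short sketch; `binomial_cdf` is a helper written for this illustration.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation of the PMF."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

n, p = 10, 0.5   # assumed parameters, chosen to reproduce F(5) = 0.623

# Smallest k whose cumulative probability reaches 0.5.
median = next(k for k in range(n + 1) if binomial_cdf(k, n, p) >= 0.5)
upper_tail = 1 - binomial_cdf(median - 1, n, p)   # P(X >= median)

print(median, round(binomial_cdf(median, n, p), 3), round(upper_tail, 3))
```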
Shortcut for known distributions: Many standard distributions have analytical approximations or tables:
• Geometric (number of trials, support {1,2,3,…}): median = ⌈−ln(2)/ln(1−p)⌉
• Poisson: median ≈ λ for large λ
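The geometric shortcut can be cross-checked against a direct CDF search. The sketch below assumes the "number of trials" convention with support {1, 2, 3, …}, where F(k) = 1 − (1−p)^k and the formula reads ⌈−ln(2)/ln(1−p)⌉. Both function names are hypothetical.

```python
import math

def geometric_median_formula(p):
    """Closed-form median for the geometric distribution on {1, 2, 3, ...}."""
    return math.ceil(-math.log(2) / math.log(1 - p))

def geometric_median_cdf(p):
    """Smallest k with F(k) = 1 - (1 - p)**k >= 0.5, found by stepping through k."""
    k = 1
    while 1 - (1 - p) ** k < 0.5:
        k += 1
    return k

for p in (0.1, 0.2, 0.3, 0.7):
    print(p, geometric_median_formula(p), geometric_median_cdf(p))
```

If the geometric distribution is instead defined as the number of failures before the first success (support starting at 0), the result shifts down by one.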
Rearrange the equation to isolate m and solve directly.
Step 4: Apply numerical methods if needed
When no closed form exists, use root-finding algorithms:
• Bisection method
• Newton-Raphson method
• Built-in quantile functions in statistical software
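A minimal bisection sketch is shown below. It assumes the CDF is continuous and nondecreasing and that the caller supplies a bracket [lo, hi] containing the median. The exponential distribution is used as the test case only because its median, ln(2)/rate, is known in closed form and can confirm the result.

```python
import math

def median_by_bisection(cdf, lo, hi, tol=1e-10):
    """Find m in [lo, hi] with cdf(m) = 0.5 by repeatedly halving the bracket."""
    if not (cdf(lo) <= 0.5 <= cdf(hi)):
        raise ValueError("bracket [lo, hi] must contain the median")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid      # median lies to the right of mid
        else:
            hi = mid      # median lies at or to the left of mid
    return (lo + hi) / 2

rate = 2.0   # hypothetical rate parameter

def exp_cdf(x):
    """CDF of the exponential distribution with the rate above."""
    return 1 - math.exp(-rate * x) if x > 0 else 0.0

m = median_by_bisection(exp_cdf, 0.0, 100.0)
print(m, math.log(2) / rate)   # numerical result next to the closed form
```

Bisection is slower than Newton-Raphson but needs only CDF evaluations, not the density, and it cannot overshoot the bracket.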
No calculation needed for symmetric distributions.
Using Statistical Software
Modern software provides built-in median functions.
These functions implement the inverse CDF method automatically.
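As one illustration, assuming Python as the software, the standard library's `statistics.NormalDist` exposes both a `median` property and an inverse CDF method `inv_cdf`. The parameter values below are arbitrary.

```python
from statistics import NormalDist

dist = NormalDist(mu=3.0, sigma=2.0)   # arbitrary example parameters

print(dist.median)         # median of the normal distribution, equal to mu
print(dist.inv_cdf(0.5))   # inverse CDF evaluated at 0.5 gives the same value
```

Other environments offer comparable quantile functions; the exact names differ by package.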
From Sample Data
When working with observed data rather than theoretical distributions:
Step 1: Sort the data
Arrange values in ascending order: x_1 ≤ x_2 ≤ ⋯ ≤ x_n.
Step 2: Find the middle
For odd sample size n: median = x_((n+1)/2)
For even sample size n: median = (x_(n/2) + x_(n/2+1)) / 2
Example: Data {3,7,1,9,5}
Sorted: {1,3,5,7,9}
Median = 5 (middle value).
Example: Data {3,7,1,9}
Sorted: {1,3,7,9}
Median = (3+7)/2 = 5.
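The sort-then-pick-the-middle procedure translates directly into code. The helper `sample_median` below is written for illustration; Python's built-in `statistics.median` follows the same odd/even rule and is shown alongside for comparison.

```python
import statistics

def sample_median(values):
    """Median of a non-empty sample: middle value, or mean of the two middle values."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]                    # odd n: the single middle value
    return (xs[mid - 1] + xs[mid]) / 2    # even n: average of the two middle values

print(sample_median([3, 7, 1, 9, 5]), statistics.median([3, 7, 1, 9, 5]))
print(sample_median([3, 7, 1, 9]), statistics.median([3, 7, 1, 9]))
```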
Visual Methods
For quick estimation:
Discrete: Examine the cumulative probability plot, find where steps cross 0.5
Continuous: Locate the area split point on the PDF curve where left area equals right area
Visual methods provide intuition but lack precision for formal analysis.
Median as the 50th Percentile
The median is the 50th percentile of a distribution—the value below which 50% of the probability lies.
Percentiles and Quantiles
A percentile divides a distribution at a specific cumulative probability. The p-th percentile is the value x_p where:
P(X ≤ x_p) = p/100
The median is the special case where p=50:
P(X≤m)=0.5
Quantiles use fractional notation instead of percentages. The 0.5-quantile equals the 50th percentile equals the median.
Relationship to Quartiles
Quartiles divide the distribution into four equal parts:
• First quartile Q1 (25th percentile): P(X≤Q1)=0.25
• Second quartile Q2 (50th percentile): P(X≤Q2)=0.5, which is the median
• Third quartile Q3 (75th percentile): P(X≤Q3)=0.75
The median sits at the center of the quartile structure, with equal probability mass on each side.
Five-Number Summary
The median anchors the five-number summary used in descriptive statistics:
Minimum, Q1, Median, Q3, Maximum
This summary captures distribution shape through five key positions. Box plots visualize this summary, displaying the median as the central line inside the box.
Interquartile Range
The interquartile range (IQR) measures spread using quartiles:
IQR=Q3−Q1
This range contains the middle 50% of the distribution, with 25% of the probability on each side of the median. The median need not sit at the midpoint of the interval. The IQR provides a robust measure of spread that ignores extreme values, complementing the median's role as a robust measure of center.
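For sample data, the quartiles, IQR, and five-number summary can be sketched with Python's `statistics.quantiles`. The data are invented, and sample quartile conventions vary between packages; this example uses the function's default "exclusive" method, so other tools may report slightly different Q1 and Q3 for the same data.

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13]   # hypothetical sample

# n=4 asks for the three cut points that split the data into four groups.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
five_number_summary = (min(data), q1, q2, q3, max(data))

print(five_number_summary)
print(iqr)
```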
General Percentile Formula
For continuous distributions, the p-th percentile, with p expressed as a probability between 0 and 1, solves:
F(x_p) = p
For discrete distributions, the p-th percentile is the smallest value x where:
P(X≤x)≥p
Extension to Other Percentiles
Beyond quartiles, distributions can be divided into:
• Deciles: 10th, 20th, ..., 90th percentiles (divide into 10 parts)
• Percentiles: 1st through 99th percentiles (divide into 100 parts)
• Any quantile: solve F(x)=q for any 0<q<1
The median generalizes to arbitrary probability splits. Just as the median splits at 0.5, other percentiles split at different thresholds.
Why the 50th Percentile Matters
The 50th percentile is unique among percentiles:
• Divides probability exactly in half
• Minimizes absolute deviation E[|X−m|]
• Robust to extreme values in either tail
• Natural midpoint for symmetric distributions
Other percentiles describe spread or tail behavior, but the median describes central location through probability balance.
Computing Other Percentiles
The same methods used to find the median apply to any percentile:
Continuous: Solve F(x_p)=p or use x_p=F⁻¹(p)
Discrete: Find smallest x where P(X≤x)≥p
Statistical software provides quantile functions: qnorm(p) in R, and the ppf(p) method of distributions in Python's scipy.stats (for example scipy.stats.norm.ppf(p)).
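For a discrete distribution given as a table of probabilities, the "smallest x where P(X≤x)≥p" rule can be sketched without any external library. The function name and dictionary input are assumptions for this example, and exact fractions avoid floating-point trouble when the CDF lands exactly on p.

```python
from fractions import Fraction

def discrete_quantile(pmf, p):
    """Smallest support value x with P(X <= x) >= p, for 0 < p <= 1."""
    p = Fraction(p)
    cdf = Fraction(0)
    for x in sorted(pmf):
        cdf += Fraction(pmf[x])
        if cdf >= p:
            return x
    raise ValueError("probabilities sum to less than p")

# Equal probabilities on a support with a gap between 2 and 5.
pmf = {x: Fraction(1, 4) for x in (1, 2, 5, 6)}

print(discrete_quantile(pmf, Fraction(1, 4)))   # first quartile
print(discrete_quantile(pmf, Fraction(1, 2)))   # median
print(discrete_quantile(pmf, Fraction(3, 4)))   # third quartile
```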
Median vs Mean in Percentile Context
The mean is not a percentile—it's a weighted average, not a cumulative probability threshold.
The median answers "what value splits the distribution?" The mean answers "what is the balance point?" These are fundamentally different questions that happen to coincide for symmetric distributions.
Special Cases and Edge Cases
Certain distributions exhibit unusual median behavior that deviates from standard patterns.
Distributions Where Median Equals Mean
Symmetric distributions have medians at their center of symmetry, which coincides with the mean whenever the mean exists.
Example: Uniform on {1,2,3,4} has medians at both 2 and 3 since the CDF equals exactly 0.5 at 2.
Binomial can have two medians when the CDF equals exactly 0.5 at some value, for example when p=0.5 and n is odd.
Distributions with No Mean but Well-Defined Median
Cauchy Distribution: The median equals the location parameter x0, but the mean does not exist due to heavy tails.
The median provides a measure of central tendency when the mean fails.
Student's t-distribution with ν=1 (Cauchy): median = 0, mean undefined.
Median at Distribution Boundaries
Some distributions can have medians at the edge of their support.
Degenerate distribution at c: Every observation equals c, so median = c trivially.
Beta distributions with extreme skew can push the median very close to boundaries 0 or 1, though typically not exactly at them unless the distribution degenerates.
Multiple Medians in Continuous Distributions
Continuous distributions with flat CDF regions (constant over an interval) have infinitely many medians.
If F(x)=0.5 for all x∈[a,b], then every value in [a,b] qualifies as a median.
This occurs in pathological constructed distributions, not in standard families.
Median When Support Changes
Truncated distributions alter the median by restricting the support.
Truncating the normal distribution to [0,∞) shifts the median above μ, since the probability mass on negative values is removed and the remaining mass is renormalized.
The median of the truncated distribution must be recalculated using the new CDF on the restricted support.
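For the normal distribution truncated to [lower, ∞), the recalculation has a compact form: if F is the original CDF, the truncated median m satisfies F(m) = F(lower) + 0.5·(1 − F(lower)). The sketch below applies this with the standard library's `NormalDist`; the function name is hypothetical.

```python
from statistics import NormalDist

def truncated_normal_median(mu, sigma, lower=0.0):
    """Median of Normal(mu, sigma) restricted to [lower, infinity)."""
    base = NormalDist(mu, sigma)
    removed = base.cdf(lower)                # probability mass cut away
    target = removed + 0.5 * (1 - removed)   # halfway through the remaining mass
    return base.inv_cdf(target)

print(truncated_normal_median(0.0, 1.0))   # standard normal truncated at 0
print(truncated_normal_median(1.0, 1.0))   # lies above the untruncated median 1.0
```

When very little mass lies below the truncation point, the shift is tiny and the truncated median is almost equal to μ.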
Mixture Distributions
Combining distributions can create medians that don't match either component.
Mixture: 0.5⋅N(0,1)+0.5⋅N(10,1)
The median falls near 5 (the midpoint between the two modes), despite neither component having a median near 5.
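The mixture median can be located numerically from the mixture CDF, 0.5·Φ(x) + 0.5·Φ(x−10). The sketch below uses a simple bisection with a hand-picked bracket [−10, 20]; both the bracket and the tolerance are illustrative choices.

```python
from statistics import NormalDist

left, right = NormalDist(0, 1), NormalDist(10, 1)

def mixture_cdf(x):
    """CDF of the equal-weight mixture of N(0,1) and N(10,1)."""
    return 0.5 * left.cdf(x) + 0.5 * right.cdf(x)

lo, hi = -10.0, 20.0          # bracket assumed to contain the median
while hi - lo > 1e-6:
    mid = (lo + hi) / 2
    if mixture_cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid
mixture_median = (lo + hi) / 2

print(mixture_median)
print(left.median, right.median)   # component medians for comparison
```

The mixture CDF is nearly flat between the two components, so values well away from 5 come very close to satisfying F(x)=0.5 even though the exact solution is unique.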
Median for Discrete Distributions with Gaps
Distributions with gaps in the support (missing values) require careful CDF evaluation.
Example: Support {1,2,5,6} with equal probabilities. The CDF jumps from 0.5 at k=2 to 0.75 at k=5. The median is 2 since it's the smallest value where F(k)≥0.5.
When Median Analysis Fails
The median requires outcomes that can be ordered. For nominal categories with no natural order, a CDF cannot be defined and the median concept becomes meaningless.
Empirical data with heavy discretization or rounding may show artificial median values that don't reflect the underlying continuous distribution.
Notation
The median has several standard notations used across probability and statistics literature.
Common Notations
The most widely used notation is:
median(X)
This explicitly labels the measure being computed.
Alternative notations include:
Med(X)
A compact abbreviation.
x̃ or μ̃
The tilde symbol over a variable, commonly used to distinguish the median from the mean μ or the sample mean x̄.
Q2
The median as the second quartile.
m
A simple variable for the median value.
Relationship to Quantile Notation
The median is mathematically expressed as the 0.5-quantile or 50th percentile: median(X) = F⁻¹(0.5), whenever the inverse CDF is well defined.
The median observed in finite data may not reflect the true population median, especially with small samples.
Sampling variability affects median estimation. Confidence intervals help quantify uncertainty.
Large samples and clear distribution structure increase reliability.
Ignoring Median for Ordinal or Ranked Data
When data is naturally ranked but not numerically meaningful (like satisfaction ratings), the median often provides more interpretable results than the mean.
Calculating a mean of {poor, fair, good, excellent} coded as {1,2,3,4} assumes equal spacing that may not exist. The median respects the ordering without assuming equal intervals.
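A median for ordinal data needs only the ordering of the categories. The sketch below uses invented survey responses and `statistics.median_low`, which always returns an actual data point and so never has to average two labels. The integer ranks serve purely as sort keys.

```python
import statistics

levels = ["poor", "fair", "good", "excellent"]          # ordered categories
rank = {label: i for i, label in enumerate(levels)}     # position in the ordering

# Hypothetical survey responses.
responses = ["good", "poor", "excellent", "good", "fair", "good", "fair"]

median_rank = statistics.median_low(rank[r] for r in responses)
print(levels[median_rank])   # the middle response by rank
```

With an even number of responses, `median_low` reports the lower of the two middle categories; `statistics.median_high` gives the upper one.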
Related Concepts
The median connects to numerous other probability and statistics concepts.
Other Measures of Central Tendency
Mean (Expected Value): The probability-weighted average of all values. Balances the distribution.
Mode: The value with maximum probability or density. Identifies the peak.
These three measures work together to characterize where distributions center and how they're shaped.
Measures of Dispersion
Variance: Quantifies spread around the mean. High variance means values scatter widely.
Interquartile Range (IQR): Measures spread using quartiles: Q3−Q1. Robust to outliers like the median.
The median reveals the center while dispersion measures reveal spread. Both are needed for complete distribution description.
Percentiles and Quantiles
The median is the 50th percentile or 0.5-quantile. It generalizes to other probability thresholds:
Quartiles: Q1 (25th), Q2 (median), Q3 (75th)
Deciles: 10th, 20th, ..., 90th percentiles
Any quantile q: solves F(x)=q
Percentiles divide probability by cumulative area; the median is the special case at 0.5.