Median






The Central Value by Position


Not all measures of central tendency are based on averaging. Some describe position rather than magnitude.

This page introduces the median as the value that divides a distribution into two equally weighted halves. It focuses on how the median is defined through ordering and cumulative probability, and why it remains stable even when extreme values distort other measures of center.



Definition and Concept


The median is the value that splits a probability distribution exactly in half. It's the point where half the probability lies below and half lies above.

What is the Median?


For any probability distribution, the median is the 50th percentile: the value $m$ that divides the total probability into two equal parts:

Discrete distributions: The value $m$ where cumulative probability reaches or exceeds 0.5 from both directions
Continuous distributions: The value $m$ where $P(X \leq m) = 0.5$, unique when the CDF is strictly increasing

If you draw values from the distribution repeatedly, about half will fall at or below the median and about half at or above it in the long run.

Median as a Measure of Central Tendency


The median is one of three primary measures describing where a distribution centers, alongside the mean and mode.

Unlike the mean, which balances all values through weighted averaging, the median simply identifies the middle position. It cares only about ranking values, not their magnitudes—making it resistant to extreme observations that would distort the mean.

Why the Median Matters


The median provides crucial insight into distribution structure:

Where is the center? The median marks the probability midpoint
How symmetric is the distribution? Comparing median to mean reveals skewness
Are there outliers? The median stays stable when extreme values appear

For income data, housing prices, or any measurement prone to outliers, the median often represents the "typical" value better than the mean does.

Median vs Other Measures


The median behaves distinctly from mean and mode:

Robustness: Moving a value that already lies above (or below) the median farther into the tail does not change the median
Unique for continuous distributions with strictly increasing CDFs: Unlike the mode, which can be multiple or absent
May differ from mean: Skewed distributions separate median and mean substantially
Interpretability: Represents an achievable middle value, not an abstract balance point

The median complements mean and mode by revealing the distribution's center through probability division rather than probability weighting or probability concentration.

Median for Discrete Distributions


For discrete distributions, the median is the smallest value where cumulative probability reaches or crosses the halfway point.

Definition


The median of a discrete random variable $X$ is the smallest value $m$ satisfying both conditions:

$$P(X \leq m) \geq 0.5 \quad \text{and} \quad P(X \geq m) \geq 0.5$$


This ensures that at least half the probability lies below (or at) $m$, and at least half lies above (or at) $m$.

Why Two Conditions?


The discrete nature creates jumps in the cumulative distribution function. A single value might satisfy one condition but not the other.

The two-condition definition guarantees the median genuinely divides probability, even when the CDF jumps past 0.5 without landing exactly on it.

How to Find the Median


Unlike expected value, there's no universal formula. Find the median through cumulative probability:

1. Calculate the CDF: $F(k) = P(X \leq k)$ for each value in the support
2. Identify where $F(k)$ first reaches or exceeds 0.5
3. Verify both conditions hold at that value
4. If multiple consecutive values satisfy both conditions, any of them qualifies as a median
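
The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `discrete_medians` and the dictionary representation of the PMF are assumptions made for the example, and exact fractions are used so that comparisons with 0.5 are not affected by floating-point rounding.

```python
from fractions import Fraction

def discrete_medians(pmf):
    """Return every support value m with P(X <= m) >= 1/2 and P(X >= m) >= 1/2.

    `pmf` maps each support value to its probability (hypothetical input format).
    """
    half = Fraction(1, 2)
    medians = []
    below = Fraction(0)              # running P(X < value)
    for value in sorted(pmf):
        cdf = below + Fraction(pmf[value])   # P(X <= value)
        upper = 1 - below                    # P(X >= value)
        if cdf >= half and upper >= half:
            medians.append(value)
        below = cdf
    return medians

uniform4 = {k: Fraction(1, 4) for k in (1, 2, 3, 4)}
uniform5 = {k: Fraction(1, 5) for k in (1, 2, 3, 4, 5)}
print(discrete_medians(uniform4))  # [2, 3]
print(discrete_medians(uniform5))  # [3]
```

The first element of the returned list is the smallest qualifying value, which matches the convention of reporting the smallest median.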

Medians for Common Discrete Distributions


Discrete Uniform on $\{a, a+1, \ldots, b\}$

Median: $\frac{a+b}{2}$ when $a+b$ is even

When $a+b$ is odd, both $\frac{a+b-1}{2}$ and $\frac{a+b+1}{2}$ satisfy the median conditions.

Example: Uniform on $\{1,2,3,4,5\}$ has median $3$

Binomial with parameters $n$ and $p$

Median: No simple closed form. Approximation: median $\approx np$ for large $n$

Requires numerical computation or tables for exact values.

Example: For $n=10, p=0.5$, numerical evaluation gives median $= 5$

Geometric with parameter $p$ (number of trials, support $\{1, 2, \ldots\}$)

Median: $m = \left\lceil \frac{-\ln(2)}{\ln(1-p)} \right\rceil$

Example: For $p = 0.3$, median $= \lceil 1.94 \rceil = 2$
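
As a quick check of the closed form, the sketch below compares it with a direct scan of the CDF. It assumes the geometric distribution counts trials up to and including the first success, so the support starts at 1 and $F(k) = 1 - (1-p)^k$. Both function names are made up for the illustration.

```python
import math

def geometric_median(p):
    """Closed-form median for the geometric distribution on {1, 2, ...}."""
    return math.ceil(-math.log(2) / math.log(1 - p))

def geometric_median_by_cdf(p):
    """Smallest k with F(k) = 1 - (1 - p)**k >= 0.5."""
    k = 1
    while 1 - (1 - p) ** k < 0.5:
        k += 1
    return k

print(-math.log(2) / math.log(1 - 0.3))  # about 1.943
print(geometric_median(0.3))             # 2
print(geometric_median_by_cdf(0.3))      # 2
```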

Negative Binomial with parameters $r$ and $p$

Median: No closed form. Must evaluate CDF numerically.

Hypergeometric with parameters $N, K, n$

Median: No closed form. Numerical evaluation required.

Poisson with parameter $\lambda$

Median: Approximation: median $\approx \lambda + \frac{1}{3} - \frac{0.02}{\lambda}$ for large $\lambda$

For small $\lambda$, numerical evaluation of the CDF is necessary.
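
To see how the approximation relates to the exact value, the following sketch sums the Poisson PMF until the CDF reaches 0.5 and prints the approximation alongside it. The function names are illustrative. The approximation returns a real number while the true median is an integer, so the two are compared only loosely.

```python
import math

def poisson_median(lam):
    """Smallest k with P(X <= k) >= 0.5, by summing the PMF term by term."""
    term = math.exp(-lam)    # P(X = 0); underflows to 0 for very large lam
    cdf = term
    k = 0
    while cdf < 0.5:
        k += 1
        term *= lam / k      # P(X = k) obtained from P(X = k - 1)
        cdf += term
    return k

def poisson_median_approx(lam):
    """Large-lambda approximation quoted in the text."""
    return lam + 1 / 3 - 0.02 / lam

for lam in (0.5, 1, 10):
    print(lam, poisson_median(lam), round(poisson_median_approx(lam), 3))
```

For $\lambda = 0.5$ the exact median is $0$ while the approximation is about $0.79$, which illustrates why small $\lambda$ calls for direct CDF evaluation.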

Non-Uniqueness


Discrete distributions can have multiple medians when the CDF equals exactly 0.5 at a support value. That value and the next support value then both satisfy the two conditions. If the CDF instead jumps from below 0.5 to above 0.5 in a single step, the median is unique.

Example: Discrete Uniform on $\{1, 2, 3, 4\}$ has CDF values $0.25, 0.5, 0.75, 1.0$. Both $2$ and $3$ satisfy the median conditions. Either value is a valid median.

Visual Identification


On a cumulative probability plot, the median is where the step function first touches or crosses the horizontal line at 0.5. The discrete jumps make this crossing potentially ambiguous.

Key Properties


• The median always lies within the support
• Finding it requires evaluating the CDF, not the PMF directly
• Multiple values may qualify as medians in certain cases
• Unlike the mode, the median depends on cumulative structure, not individual probabilities
• Changing tail probabilities can shift the median, even if the most probable value stays fixed

Median for Continuous Distributions


For continuous distributions, the median is the unique value where the cumulative distribution function equals exactly 0.5.

Definition


The median of a continuous random variable $X$ is the value $m$ satisfying:

$$F(m) = P(X \leq m) = 0.5$$


Equivalently, using the probability density function:

$$\int_{-\infty}^{m} f(x)\,dx = 0.5$$


This is the point that divides the area under the density curve into two equal halves.

How to Find the Median


The median can often be found analytically by solving the CDF equation:

1. Write the CDF: $F(x) = P(X \leq x)$
2. Set $F(m) = 0.5$
3. Solve for $m$ algebraically
4. Use the inverse CDF when it exists: $m = F^{-1}(0.5)$

When no closed-form inverse exists, numerical methods locate the median through root-finding algorithms.
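
A minimal bisection sketch of that numerical route is shown below. It assumes the CDF is continuous and increasing on the bracket and that the bracket contains the median. The helper name `median_by_bisection` and the example CDF (a Gamma distribution with shape 2 and rate 1, whose CDF is $1 - e^{-x}(1 + x)$) are choices made for illustration.

```python
import math

def median_by_bisection(cdf, lo, hi, tol=1e-12):
    """Find m in [lo, hi] with cdf(m) = 0.5 for a continuous, increasing cdf."""
    if not (cdf(lo) <= 0.5 <= cdf(hi)):
        raise ValueError("bracket must satisfy cdf(lo) <= 0.5 <= cdf(hi)")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid     # median lies to the right of mid
        else:
            hi = mid     # median lies at or to the left of mid
    return (lo + hi) / 2

def gamma2_cdf(x):
    """CDF of a Gamma distribution with shape 2 and rate 1."""
    return 1 - math.exp(-x) * (1 + x)

m = median_by_bisection(gamma2_cdf, 0.0, 50.0)
print(round(m, 4))  # about 1.6783
```

Bisection is slower than Newton-Raphson but only needs CDF evaluations and a valid bracket, which makes it a safe default when the density is awkward to evaluate.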

Medians for Common Continuous Distributions


Continuous Uniform on $[a, b]$

Median: $m = \frac{a+b}{2}$

The midpoint of the interval.

Normal with parameters $\mu, \sigma$

Median: $m = \mu$

Symmetry ensures median equals mean.

Exponential with rate parameter $\lambda$

Median: $m = \frac{\ln(2)}{\lambda} \approx \frac{0.693}{\lambda}$

Derivation: The CDF is $F(x) = 1 - e^{-\lambda x}$. Setting $F(m) = 0.5$ gives:

$$1 - e^{-\lambda m} = 0.5$$


$$e^{-\lambda m} = 0.5$$


$$m = \frac{\ln(2)}{\lambda}$$


Beta Distribution with parameters $\alpha, \beta$

Median: No general closed form except for special cases

When $\alpha = \beta$, median $= 0.5$ by symmetry

Numerical methods required for general $\alpha, \beta$

Gamma Distribution with shape $k$ and rate $\theta$

Median: No closed form

For $k=1$, reduces to exponential with median $\frac{\ln(2)}{\theta}$

Numerical evaluation required for $k \neq 1$

Weibull Distribution with shape $k$ and scale $\lambda$

Median: $m = \lambda (\ln 2)^{1/k}$

Derived from the CDF $F(x) = 1 - e^{-(x/\lambda)^k}$
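
As a worked check, setting the Weibull CDF equal to 0.5 and solving for $m$ follows the same pattern as the exponential case:

$$1 - e^{-(m/\lambda)^k} = 0.5$$

$$e^{-(m/\lambda)^k} = 0.5$$

$$(m/\lambda)^k = \ln 2$$

$$m = \lambda (\ln 2)^{1/k}$$

For $k = 1$ this reduces to $\lambda \ln 2$, the median of an exponential distribution with scale $\lambda$ (that is, rate $1/\lambda$).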

Cauchy Distribution with location $x_0$ and scale $\gamma$

Median: $m = x_0$

The median equals the location parameter. The mean does not exist for this distribution.

Lognormal Distribution with parameters $\mu, \sigma$

Median: $m = e^\mu$

The median of the lognormal equals the exponential of the underlying normal's mean.

Triangular Distribution on $[a, b]$ with mode $c$

Median depends on the position of mode $c$:

If $c \geq \frac{a+b}{2}$: median $= a + \sqrt{\frac{(b-a)(c-a)}{2}}$

If $c < \frac{a+b}{2}$: median $= b - \sqrt{\frac{(b-a)(b-c)}{2}}$
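
The piecewise formula can be checked against the triangular CDF directly. The sketch below is illustrative only. The function names and the argument order `(a, c, b)` are assumptions, and the CDF used is the standard piecewise-quadratic form for a triangular distribution with $a < b$ and $a \leq c \leq b$.

```python
import math

def triangular_median(a, c, b):
    """Median of the triangular distribution on [a, b] with mode c."""
    if c >= (a + b) / 2:
        return a + math.sqrt((b - a) * (c - a) / 2)
    return b - math.sqrt((b - a) * (b - c) / 2)

def triangular_cdf(x, a, c, b):
    """CDF of the triangular distribution on [a, b] with mode c."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        return (x - a) ** 2 / ((b - a) * (c - a))
    return 1 - (b - x) ** 2 / ((b - a) * (b - c))

for a, c, b in [(0, 1, 4), (0, 3, 4), (0, 2, 4)]:
    m = triangular_median(a, c, b)
    print((a, c, b), round(m, 4), round(triangular_cdf(m, a, c, b), 6))
```

In each case the CDF evaluated at the computed median comes out at 0.5, and the symmetric case $(0, 2, 4)$ gives the midpoint $2$.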

Uniqueness


Continuous distributions with strictly increasing CDFs have exactly one median. The smoothness of the CDF eliminates the ambiguity present in discrete cases.

If the CDF has flat regions (constant over an interval), any value in that interval satisfies the median definition, creating non-uniqueness.

Relationship to Density


The median need not coincide with the peak of the PDF. Skewed distributions separate the median from the mode.

The median divides probability mass equally, while the mode identifies maximum density. These are distinct concepts that coincide in symmetric unimodal distributions and generally separate otherwise.

Visual Identification


On a density curve, the median is the vertical line that splits the area under the curve into two equal parts. For symmetric unimodal distributions, this line passes through the peak. For skewed unimodal distributions, the median typically sits between the mode and mean.

Key Properties


• Continuous distributions typically have a unique median
• The median can be found by solving $F(m) = 0.5$ or using $F^{-1}(0.5)$
• Unlike discrete cases, ties are impossible due to the smooth nature of the CDF
• The median is always in the interior of the support for distributions with unbounded support
• For bounded support, the median may approach but typically doesn't reach the boundaries

Discrete vs Continuous Median


The median behaves differently for discrete and continuous distributions due to fundamental differences in how probability distributes.

Uniqueness


Continuous distributions: The median is unique whenever the CDF is strictly increasing, because the CDF then crosses 0.5 at exactly one point.

Discrete distributions: Multiple values may satisfy the median conditions when the CDF equals exactly 0.5 at a support value.

Example: Discrete uniform on $\{1, 2, 3, 4\}$ has both 2 and 3 as valid medians.

Definition Complexity


Continuous case: Simple condition $F(m) = 0.5$

Discrete case: Requires both $P(X \leq m) \geq 0.5$ and $P(X \geq m) \geq 0.5$ to handle probability jumps properly

The two-condition requirement for discrete distributions ensures the median genuinely divides probability despite the CDF's discontinuities.

Computational Methods


Continuous: Solve $F(m) = 0.5$ analytically or use inverse CDF $F^{-1}(0.5)$

Discrete: Evaluate the CDF at each support value, find where cumulative probability first reaches 0.5, verify both conditions

Continuous distributions often have closed-form median expressions. Discrete distributions typically require numerical evaluation or lookup tables.

Median as an Interval


Discrete: When multiple values satisfy the conditions, the median is technically an interval rather than a single point

Continuous: Always a single point (except for pathological cases with flat CDF regions)

In practice, discrete medians are reported as the smallest qualifying value, even when an interval exists.

Visual Differences


Continuous: The PDF curve shows the median as the vertical line splitting area equally

Discrete: The PMF bar chart requires examining cumulative probabilities to locate the median

Identifying the median visually is more intuitive for continuous distributions than discrete ones.

Impact on Calculations


Continuous: Integration yields exact median through $\int_{-\infty}^{m} f(x)\,dx = 0.5$

Discrete: Summation requires careful handling of boundary cases: $\sum_{k \leq m} P(X = k) \geq 0.5$

The discrete case demands attention to whether inequalities are strict or non-strict.

Relationship to Other Measures


Both types: Median can differ from mean and mode in skewed distributions

Continuous: The separation is smooth and predictable based on distribution shape

Discrete: The separation can be irregular due to probability concentration at specific points

Why Continuous is Simpler


The smoothness of continuous CDFs eliminates ambiguity. Every strictly increasing CDF crosses 0.5 at exactly one location, making median identification straightforward.

Discrete distributions introduce complexity through jumps, ties, and the possibility of no value landing precisely at the 50% cumulative probability mark.

Median, Mean, and Mode Compared


Three measures describe where distributions center: median, mean, and mode. Each reveals different structural features.

Quick Definitions


Median: The 50th percentile that divides total probability equally

Mean: Weighted balance point calculated from all values

Mode: Peak location—where probability or density reaches maximum

Symmetric Distributions


When distributions mirror themselves around a center point, all three measures collapse to the same value.

Normal distribution: median = mean = mode = $\mu$

Discrete Uniform on $\{1,2,3,4,5\}$: median and mean both equal $3$ (every value is equally likely, so there is no single mode)

Perfect symmetry forces the probability split, the balance point, and the peak to occupy identical positions.

Skewed Distributions


Asymmetry separates the three measures in consistent patterns.

Right skew (tail stretches toward larger values):

$$\text{mode} < \text{median} < \text{mean}$$


Extreme large values drag the mean rightward. The median holds closer to the probability bulk. The mode stays fixed at the density peak.

Exponential distribution: mode = $0$, median = $\frac{\ln(2)}{\lambda}$, mean = $\frac{1}{\lambda}$

Left skew (tail stretches toward smaller values):

$$\text{mean} < \text{median} < \text{mode}$$


The ordering reverses—mean pulled left, mode anchored at the right peak.

Comparison Table

Comparing Median, Mean, and Mode

| Feature | Median | Mean | Mode |
| --- | --- | --- | --- |
| Definition | Value at 50th percentile | Probability-weighted average | Value with highest probability |
| Calculation | Find cumulative 0.5 point | Weight all values | Find maximum |
| Uniqueness | Single value (continuous) | Single value | Can have multiple |
| Outlier impact | None | Strong | None |
| Data type | Numerical only | Numerical only | Any (including categorical) |
| Interpretation | Middle value | Average outcome | Most likely value |
| Skewness indication | Between mode and mean | Pulled by tail | At peak |


Robustness Differences


Mean: Vulnerable. One extreme observation can shift it substantially.

Median: Resistant. Values beyond the 50% threshold have no influence.

Mode: Immune. Tail behavior irrelevant unless it creates a new peak.

Income data illustrates this: billionaires inflate the mean drastically while leaving median and mode nearly unchanged.

Selection Criteria


Choose median for:
• Skewed distributions
• Data contaminated by outliers
• Representing a "central" value that's actually achievable
• Reporting typical values when extremes distort the mean

Choose mean for:
• Symmetric data without extreme values
• Leveraging mathematical properties (additivity, scaling)
• Incorporating all observations equally

Choose mode for:
• Categorical outcomes (colors, brands, types)
• Identifying the most frequent occurrence
• Detecting multiple concentration points

Spatial Relationships


Symmetric case: All three occupy the same point at distribution center.

Right-skewed case: Mode sits at the left peak, median slightly right, mean furthest right chasing the tail.

Left-skewed case: Reversed ordering with mean leftmost, mode rightmost.

In unimodal skewed distributions the median typically falls between the mode and the mean, acting as a compromise measure that balances probability division against probability concentration. This ordering is a common pattern rather than a guarantee.

Properties of the Median


The median exhibits specific mathematical behaviors that distinguish it from other central tendency measures.

Minimizes Absolute Deviation


The median minimizes the expected value of absolute deviation:

$$\text{median}(X) = \arg\min_m E[|X - m|]$$


Among all possible values $m$, the median produces the smallest average absolute distance from $X$. This contrasts with the mean, which minimizes squared deviations: $E[(X - \mu)^2]$.
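
A small numerical illustration, using a made-up sample as an empirical distribution, shows the median beating the mean on average absolute distance. The helper name `mean_abs_deviation` and the data are hypothetical.

```python
import statistics

def mean_abs_deviation(data, m):
    """Average absolute distance between the data and a candidate center m."""
    return sum(abs(x - m) for x in data) / len(data)

data = [1, 2, 3, 4, 100]           # made-up sample with one extreme value
med = statistics.median(data)      # 3
avg = statistics.mean(data)        # 22

print(mean_abs_deviation(data, med))  # 20.2
print(mean_abs_deviation(data, avg))  # 31.2

# Scanning integer candidates confirms that the minimum sits at the median.
best = min(range(0, 101), key=lambda c: mean_abs_deviation(data, c))
print(best)  # 3
```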

Transformation Under Linear Operations


For a random variable $X$ with median $m$, consider the transformation $Y = aX + b$ where $a \neq 0$.

The median of $Y$ is:

$$\text{median}(Y) = a \cdot \text{median}(X) + b$$


Linear transformations shift and scale the median predictably. Multiply by $a$, add $b$, and the middle point moves accordingly.

Example: If $X$ has median $4$, then $Y = 3X - 2$ has median $3(4) - 2 = 10$.

Invariance Under Monotonic Transformations


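For strictly monotonic functions $g$ (always increasing or always decreasing):

$$\text{median}(g(X)) = g(\text{median}(X))$$


If $g$ is strictly increasing, it preserves the ordering of values, so the 50th percentile stays at the same relative position. If $g$ is strictly decreasing, the ordering reverses, but the middle position still maps to the middle position.

Example: If $X$ has median $5$, then $Y = X^2$ has median $25$ (assuming $X > 0$).

This property holds for the median but generally fails for the mean. The mode of a discrete distribution is also preserved, while the mode of a continuous density generally is not, because the transformation reshapes the density (the lognormal mode is $e^{\mu - \sigma^2}$, not $e^\mu$).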
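
The following sketch checks the property on a small made-up sample, treated as an empirical distribution. An odd sample size is used so that the median is one of the observed values and the equalities hold exactly. The data and variable names are illustrative.

```python
import math
import statistics

data = [0.5, 1.0, 2.0, 4.0, 50.0]    # made-up positive sample, odd size
med = statistics.median(data)         # 2.0

# Increasing transformation: log(x)
log_median = statistics.median(math.log(x) for x in data)
# Decreasing transformation: 1 / x
inv_median = statistics.median(1 / x for x in data)

print(log_median == math.log(med))    # True
print(inv_median == 1 / med)          # True

# The mean does not commute with a nonlinear transformation.
mean_of_logs = statistics.mean(math.log(x) for x in data)
log_of_mean = math.log(statistics.mean(data))
print(round(mean_of_logs, 3), round(log_of_mean, 3))  # about 1.06 versus 2.442
```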

Robustness to Outliers


The median completely ignores values beyond the 50th percentile threshold. Extreme observations in either tail have zero impact on the median as long as they don't cross the middle rank.

Change every value above the median by any amount—the median stays fixed as long as the cumulative probability structure at the 50th percentile remains unchanged.

This makes the median ideal for data with contamination or measurement errors in the tails.

Uniqueness for Continuous Distributions


Continuous distributions with strictly increasing CDFs have exactly one median. The smoothness eliminates ambiguity.

Discrete distributions may have multiple medians when the CDF equals exactly 0.5 at a support value.

Relationship to Symmetry


For symmetric distributions centered at $c$ (with a finite mean):

$$\text{median}(X) = c = \text{mean}(X)$$


Symmetry forces the 50th percentile and the balance point to coincide at the center of symmetry.

Skewed distributions separate median from mean, with the median typically positioned between the mode and mean.

Independence from Distribution Spread


The median reveals nothing about variance or spread. Two distributions can share the same median while having vastly different variability.

Example: Normal distributions $N(0, 1)$ and $N(0, 100)$ both have median $0$. Reading the second parameter as the variance, the variances differ by a factor of 100 and the standard deviations by a factor of 10.

The median tracks the probability midpoint and carries no information about dispersion.

No Additivity


Unlike the mean, medians don't add in general:

$$\text{median}(X + Y) \neq \text{median}(X) + \text{median}(Y)$$


Even for independent $X$ and $Y$, the 50th percentile of the sum doesn't generally equal the sum of individual 50th percentiles.

This limits the median's usefulness in probability calculations involving sums or combinations.
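
A small counterexample makes this concrete. The sketch assumes two independent copies of a variable taking the value 0 with probability 0.6 and 1 with probability 0.4, and reports the smallest median in each case. The function names and the choice of distribution are illustrative.

```python
from fractions import Fraction
from itertools import product

def smallest_median(pmf):
    """Smallest value whose cumulative probability reaches 1/2."""
    cdf = Fraction(0)
    for value in sorted(pmf):
        cdf += pmf[value]
        if cdf >= Fraction(1, 2):
            return value

def sum_pmf(pmf_x, pmf_y):
    """PMF of X + Y for independent X and Y."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, Fraction(0)) + px * py
    return out

bern = {0: Fraction(3, 5), 1: Fraction(2, 5)}
total = sum_pmf(bern, bern)       # probabilities 9/25, 12/25, 4/25 on 0, 1, 2

print(smallest_median(bern))      # 0
print(smallest_median(total))     # 1, not 0 + 0
```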

How to Find the Median


The method for finding the median depends on whether the distribution is discrete or continuous.

For Discrete Distributions


Step 1: Calculate the CDF

Compute cumulative probabilities $F(k) = P(X \leq k)$ for each value in the support using the probability mass function.

Step 2: Find where CDF reaches 0.5

Identify the smallest value $k$ where $F(k) \geq 0.5$.

Step 3: Verify both conditions

Check that both $P(X \leq k) \geq 0.5$ and $P(X \geq k) \geq 0.5$ hold.

Step 4: Handle non-uniqueness

If multiple consecutive values satisfy both conditions, any of them qualifies as a median. Report the smallest value or acknowledge the interval.

Example: Binomial with $n = 10, p = 0.5$

Compute CDF values for $k = 0, 1, 2, \ldots, 10$. The CDF first reaches 0.5 at $k = 5$: $F(4) \approx 0.377 < 0.5$ and $F(5) \approx 0.623 \geq 0.5$. The second condition also holds, since $P(X \geq 5) = 1 - F(4) \approx 0.623 \geq 0.5$.

Median = $5$.
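
The numbers in this example can be reproduced with a few lines of Python. This is a sketch that sums the PMF directly. The function names are made up for the illustration, and the approach is only practical for moderate $n$.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed directly from the PMF."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def binomial_median(n, p):
    """Smallest k with F(k) >= 0.5."""
    for k in range(n + 1):
        if binomial_cdf(k, n, p) >= 0.5:
            return k
    return n  # fallback; F(n) is 1 up to rounding, so this is not expected to run

n, p = 10, 0.5
print(round(binomial_cdf(4, n, p), 3))      # 0.377
print(round(binomial_cdf(5, n, p), 3))      # 0.623
print(round(1 - binomial_cdf(4, n, p), 3))  # 0.623, which is P(X >= 5)
print(binomial_median(n, p))                # 5
```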

Shortcut for known distributions: Some standard distributions have closed forms or analytical approximations:
Geometric: median = $\left\lceil \frac{-\ln(2)}{\ln(1-p)} \right\rceil$
Poisson: median $\approx \lambda + \frac{1}{3} - \frac{0.02}{\lambda}$ for large $\lambda$

For Continuous Distributions


Step 1: Write the CDF

Start with the cumulative distribution function $F(x) = P(X \leq x)$.

Step 2: Set equation to 0.5

Write the equation $F(m) = 0.5$.

Step 3: Solve algebraically

Rearrange the equation to isolate $m$ and solve directly.

When the inverse CDF exists in closed form, this amounts to evaluating $m = F^{-1}(0.5)$.

Step 4: Apply numerical methods if needed

When no closed form exists, use root-finding algorithms:
• Bisection method
• Newton-Raphson method
• Built-in quantile functions in statistical software

Example: Exponential distribution with rate $\lambda$

The CDF is $F(x) = 1 - e^{-\lambda x}$.

Setting $F(m) = 0.5$:

$$1 - e^{-\lambda m} = 0.5$$


$$e^{-\lambda m} = 0.5$$


$$-\lambda m = \ln(0.5)$$


$$m = \frac{-\ln(0.5)}{\lambda} = \frac{\ln(2)}{\lambda}$$


Median = $\frac{\ln(2)}{\lambda}$.

Example: Normal distribution with parameters $\mu, \sigma$

By symmetry, the median equals the mean:

Median = $\mu$.

No calculation needed for symmetric distributions.

Using Statistical Software


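Modern statistical software provides built-in median and quantile functions.

For theoretical distributions, these functions evaluate the inverse CDF at 0.5. For observed data, they sort the sample and take the middle value.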
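
As one illustration, Python's standard `statistics` module covers both cases: `NormalDist.inv_cdf` evaluates the quantile function of a normal distribution, and `median` works on data. The parameter values and the sample below are arbitrary choices for the sketch. Other environments offer equivalents, such as `qnorm(0.5)` in R.

```python
from statistics import NormalDist, median

# Theoretical median: evaluate the inverse CDF (quantile function) at 0.5.
dist = NormalDist(mu=3.0, sigma=2.0)    # illustrative parameters
theoretical_median = dist.inv_cdf(0.5)

# Sample median: computed from observed data.
observed_median = median([3, 7, 1, 9])

print(theoretical_median)  # approximately 3.0, the mean of the normal
print(observed_median)     # 5.0
```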

From Sample Data


When working with observed data rather than theoretical distributions:

Step 1: Sort the data

Arrange values in ascending order: $x_1 \leq x_2 \leq \cdots \leq x_n$.

Step 2: Find the middle

For odd sample size $n$: median = $x_{(n+1)/2}$

For even sample size $n$: median = $\frac{x_{n/2} + x_{n/2+1}}{2}$

Example: Data $\{3, 7, 1, 9, 5\}$

Sorted: $\{1, 3, 5, 7, 9\}$

Median = $5$ (middle value).

Example: Data $\{3, 7, 1, 9\}$

Sorted: $\{1, 3, 7, 9\}$

Median = $\frac{3 + 7}{2} = 5$.
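
The two-step procedure translates directly into code. The sketch below uses zero-based indexing, so the 1-based positions in the formulas shift down by one. The function name `sample_median` is an assumption for the example.

```python
def sample_median(values):
    """Median of a non-empty sample: sort, then take the middle."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # x_{(n+1)/2} in 1-based terms
    return (ordered[mid - 1] + ordered[mid]) / 2   # average of x_{n/2} and x_{n/2+1}

print(sample_median([3, 7, 1, 9, 5]))  # 5
print(sample_median([3, 7, 1, 9]))     # 5.0
```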

Visual Methods


For quick estimation:

Discrete: Examine the cumulative probability plot, find where steps cross 0.5

Continuous: Locate the area split point on the PDF curve where left area equals right area

Visual methods provide intuition but lack precision for formal analysis.

Median as the 50th Percentile


The median is the 50th percentile of a distribution—the value below which 50% of the probability lies.

Percentiles and Quantiles


A percentile divides a distribution at a specific cumulative probability. The $p$-th percentile is the value $x_p$ where:

$$P(X \leq x_p) = \frac{p}{100}$$


The median is the special case where $p = 50$:

$$P(X \leq m) = 0.5$$


Quantiles use fractional notation instead of percentages. The 0.5-quantile equals the 50th percentile equals the median.

Relationship to Quartiles


Quartiles divide the distribution into four equal parts:

• First quartile $Q_1$ (25th percentile): $P(X \leq Q_1) = 0.25$
• Second quartile $Q_2$ (50th percentile): $P(X \leq Q_2) = 0.5$, which is the median
• Third quartile $Q_3$ (75th percentile): $P(X \leq Q_3) = 0.75$

The median sits at the center of the quartile structure, with equal probability mass on each side.

Five-Number Summary


The median anchors the five-number summary used in descriptive statistics:

$$\text{Minimum, } Q_1, \text{ Median, } Q_3, \text{ Maximum}$$


This summary captures distribution shape through five key positions. Box plots visualize this summary, displaying the median as the central line inside the box.

Interquartile Range


The interquartile range (IQR) measures spread using quartiles:

$$\text{IQR} = Q_3 - Q_1$$


This range contains the middle 50% of the distribution and always includes the median, although the median sits at its exact center only for symmetric distributions. The IQR provides a robust measure of spread that ignores extreme values, complementing the median's role as a robust measure of center.
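
For a concrete case, the quartiles and IQR of a standard normal distribution can be read off its inverse CDF. The sketch uses `statistics.NormalDist` from the Python standard library. The choice of the standard normal is only an example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, chosen only as an example
q1, q2, q3 = (z.inv_cdf(p) for p in (0.25, 0.50, 0.75))
iqr = q3 - q1

print(round(q1, 4), round(q3, 4))  # -0.6745 0.6745
print(round(iqr, 3))               # 1.349
```

Because the normal distribution is symmetric, the median $Q_2 = 0$ lies exactly halfway between $Q_1$ and $Q_3$ here.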

General Percentile Formula


For continuous distributions, the quantile at level $p$ (with $0 < p < 1$, corresponding to the $100p$-th percentile) solves:

$$F(x_p) = p$$


Using the inverse CDF:

$$x_p = F^{-1}(p)$$


The median is $F^{-1}(0.5)$.

For discrete distributions, the quantile at level $p$ is the smallest value $x$ where:

$$P(X \leq x) \geq p$$


Extension to Other Percentiles


Beyond quartiles, distributions can be divided into:

• Deciles: 10th, 20th, ..., 90th percentiles (divide into 10 parts)
• Percentiles: 1st through 99th percentiles (divide into 100 parts)
• Any quantile: solve $F(x) = q$ for any $0 < q < 1$

The median generalizes to arbitrary probability splits. Just as the median splits at 0.5, other percentiles split at different thresholds.

Why the 50th Percentile Matters


The 50th percentile is unique among percentiles:

• Divides probability exactly in half
• Minimizes absolute deviation $E[|X - m|]$
• Robust to extreme values in either tail
• Natural midpoint for symmetric distributions

Other percentiles describe spread or tail behavior, but the median describes central location through probability balance.

Computing Other Percentiles


The same methods used to find the median apply to any percentile:

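Continuous: Solve $F(x_p) = p$ or use $x_p = F^{-1}(p)$, with $p$ expressed as a fraction between 0 and 1

Discrete: Find smallest $x$ where $P(X \leq x) \geq p$

Statistical software provides quantile functions: qnorm(p) in R for the normal distribution, and the ppf(p) method of distribution objects in Python's scipy.stats (for example, scipy.stats.norm.ppf(p)).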

Median vs Mean in Percentile Context


The mean is not a percentile—it's a weighted average, not a cumulative probability threshold.

The median answers "what value splits the distribution?" The mean answers "what is the balance point?" These are fundamentally different questions that happen to coincide for symmetric distributions.

Special Cases and Edge Cases


Certain distributions exhibit unusual median behavior that deviates from standard patterns.

Distributions Where Median Equals Mean


Symmetric distributions have medians at their center of symmetry, which coincides with the mean.

Normal: median = mean = $\mu$

Uniform distributions: median = mean = $\frac{a+b}{2}$

Any distribution with mirror symmetry around a point $c$ has median = $c$, and mean = $c$ as well whenever the mean exists.

Distributions Where Median Differs Substantially from Mean


Skewed distributions separate median and mean significantly.

Exponential: median = $\frac{\ln(2)}{\lambda} \approx 0.693/\lambda$, mean = $\frac{1}{\lambda}$

The gap is approximately $0.307/\lambda$. For $\lambda = 1$, median $\approx 0.693$ while mean $= 1$.

Lognormal: Highly right-skewed with median $<$ mean. The median is $e^\mu$ while the mean is $e^{\mu + \sigma^2/2}$, which can be orders of magnitude larger.

Non-Unique Medians in Discrete Distributions


A Discrete Uniform distribution on an even number of values has two medians.

Example: Uniform on $\{1,2,3,4\}$ has medians at both $2$ and $3$ since the CDF equals exactly $0.5$ at $2$.

A Binomial distribution can have two medians when its CDF equals exactly $0.5$ at some value $k$. For instance, with $n = 3$ and $p = 0.5$, $F(1) = 0.5$, so both $1$ and $2$ are medians.

Distributions with No Mean but Well-Defined Median


Cauchy Distribution: The median equals the location parameter $x_0$, but the mean does not exist due to heavy tails.

The median provides a measure of central tendency when the mean fails.

Student's t-distribution with $\nu = 1$ (Cauchy): median = $0$, mean undefined.

Median at Distribution Boundaries


Some distributions can have medians at the edge of their support.

Degenerate distribution at $c$: Every observation equals $c$, so median = $c$ trivially.

Beta distributions with extreme skew can push the median very close to boundaries $0$ or $1$, though typically not exactly at them unless the distribution degenerates.

Multiple Medians in Continuous Distributions


Continuous distributions with flat CDF regions (constant over an interval) have infinitely many medians.

If $F(x) = 0.5$ for all $x \in [a,b]$, then every value in $[a,b]$ qualifies as a median.

This occurs in pathological constructed distributions, not in standard families.

Median When Support Changes


Truncated distributions alter the median by restricting the support.

Truncating a normal distribution to $[0, \infty)$ moves the median above $\mu$, because the probability mass below $0$ is removed and the remaining mass is renormalized.

The median of the truncated distribution must be recalculated using the new CDF on the restricted support.
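
For the left-truncated normal this recalculation has a short closed form in terms of the original CDF $F$: the truncated CDF is $(F(x) - F(0)) / (1 - F(0))$ for $x \geq 0$, so the median solves $F(m) = F(0) + 0.5\,(1 - F(0))$. The sketch below implements that with the standard library. The function name and the default cut point are illustrative assumptions.

```python
from statistics import NormalDist

def truncated_normal_median(mu, sigma, lower=0.0):
    """Median of Normal(mu, sigma) restricted to [lower, infinity)."""
    base = NormalDist(mu, sigma)
    p_cut = base.cdf(lower)    # probability removed by the truncation
    # Setting the truncated CDF to 0.5 gives F(m) = p_cut + 0.5 * (1 - p_cut).
    return base.inv_cdf(p_cut + 0.5 * (1.0 - p_cut))

print(round(truncated_normal_median(0.0, 1.0), 4))  # 0.6745
```

With $\mu = 0$ and $\sigma = 1$, half the mass is removed, and the new median is the 75th percentile of the original distribution.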

Mixture Distributions


Combining distributions can create medians that don't match either component.

Mixture: $0.5 \cdot N(0,1) + 0.5 \cdot N(10,1)$

By symmetry the median is exactly $5$ (the midpoint between the two modes), even though the component medians are $0$ and $10$ and almost no probability mass lies near $5$.
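
The mixture median can be confirmed numerically by bisecting on the mixture CDF. This is a sketch using `statistics.NormalDist`. The bracket $[-10, 20]$ and the helper names are assumptions for the example.

```python
from statistics import NormalDist

left, right = NormalDist(0, 1), NormalDist(10, 1)

def mixture_cdf(x):
    """CDF of the equal-weight mixture of N(0, 1) and N(10, 1)."""
    return 0.5 * left.cdf(x) + 0.5 * right.cdf(x)

def bisect_median(cdf, lo, hi, tol=1e-9):
    """Bisection for cdf(m) = 0.5 on a bracket that contains the median."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

m = bisect_median(mixture_cdf, -10.0, 20.0)
print(round(m, 6))  # about 5.0
```

The CDF is nearly flat around $5$, so the result is sensitive to floating-point rounding at very small scales, which is why the output is rounded.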

Median for Discrete Distributions with Gaps


Distributions with gaps in the support (missing values) require careful CDF evaluation.

Example: Support $\{1, 2, 5, 6\}$ with equal probabilities. The CDF stays at $0.5$ from $k=2$ until it jumps to $0.75$ at $k=5$. The median is reported as $2$ since it's the smallest value where $F(k) \geq 0.5$, although $5$ also satisfies both median conditions.

When Median Analysis Fails


The median requires values that can be ordered. For purely categorical (nominal) outcomes there is no ordering and no cumulative distribution, so the median concept becomes meaningless.

Empirical data with heavy discretization or rounding may show artificial median values that don't reflect the underlying continuous distribution.

Notation


The median has several standard notations used across probability and statistics literature.

Common Notations


The most widely used notation is:

$$\text{median}(X)$$


This explicitly labels the measure being computed.

Alternative notations include:

$$\text{Med}(X)$$


A compact abbreviation.

$\tilde{x}$ or $\tilde{\mu}$


The tilde symbol over a variable, commonly used to distinguish median from mean $\mu$ or sample mean $\bar{x}$.

$Q_2$


The median as the second quartile.

$m$


A simple variable for the median value.

Relationship to Quantile Notation


The median is mathematically expressed as the 0.5-quantile or 50th percentile:

$$\text{median}(X) = F^{-1}(0.5)$$


where $F^{-1}$ denotes the inverse cumulative distribution function.

Alternative quantile notations:

$x_{0.5}$ or $Q_{0.5}$


Both indicate the value at cumulative probability 0.5.

In Statistical Context


Sample median (from data) vs population median (from distribution) may be distinguished:

$\tilde{x}$ or $\text{median}(\text{sample})$ for the observed median

$m$ or $\text{median}(X)$ for the theoretical population median

Context usually makes this distinction clear without special notation.

Comparison with Mean Notation


Mean: $\mu$, $E[X]$, or $\bar{x}$ (sample mean)

Median: $\tilde{x}$, $\text{Med}(X)$, or $Q_2$

The tilde distinguishes median from mean in formulas where both appear.

No Universal Standard


Unlike variance ($\sigma^2$ or $\text{Var}(X)$), the median lacks a single universally adopted symbol. Different sources use different conventions.

Always define your notation explicitly when writing technical work to avoid confusion.


Common Mistakes


Several recurring errors appear when working with the median.

Confusing Median with Mean


The two measures are fundamentally different:

Median = 50th percentile

Mean = weighted average

Using "median" when you mean "average" is incorrect. The median identifies the middle value by probability rank, not the balance point of all values.

Assuming Median Always Equals Mean


The median is guaranteed to equal the mean only for symmetric distributions with a finite mean.

Skewed distributions separate them substantially. Exponential distributions have median far below the mean.

Never assume equality without checking distribution symmetry.

Forgetting the Two-Condition Rule for Discrete Distributions


Discrete distributions require both $P(X \leq m) \geq 0.5$ and $P(X \geq m) \geq 0.5$.

Checking only one condition can give incorrect results when the CDF jumps past 0.5.

Always verify both conditions hold.

Assuming the Median is Always Unique


Discrete distributions can have multiple medians when the CDF lands exactly on 0.5.

Example: Discrete Uniform on $\{1,2,3,4\}$ has both $2$ and $3$ as valid medians.

Never assume exactly one median without checking the probability structure.

Computing Median from PMF Instead of CDF


The median requires cumulative probability, not individual probabilities.

Finding the median from the PMF directly without accumulation leads to errors.

Always work with the CDF when locating the median.
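
A short sketch of the accumulation step, using a made-up five-point distribution in which the largest PMF value does not sit at the median. The values and probabilities are assumptions chosen for illustration.

```python
from itertools import accumulate

values = [1, 2, 3, 4, 5]
pmf = [0.10, 0.10, 0.15, 0.25, 0.40]  # hypothetical probabilities

# Running sums of the PMF give the CDF at each support point.
cdf = list(accumulate(pmf))

# The smallest value whose cumulative probability reaches 0.5 is a median.
median = next(v for v, c in zip(values, cdf) if c >= 0.5)

# Reading off the largest individual probability finds the mode instead.
mode = values[pmf.index(max(pmf))]

print(median, mode)  # 4 5
```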

Thinking Median Must Be in the Dataset


For continuous distributions, the median is any value satisfying $F(m) = 0.5$, which may not appear in observed data.

For even-sized samples, the sample median is often the average of two middle values, creating a median not present in the original dataset.

The median is a calculated position, not necessarily an observed value.
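
For instance, Python's standard `statistics` module averages the two middle values of an even-sized sample, and also offers variants that always return an observed value. The sample below is made up for illustration.

```python
import statistics

sample = [3, 5, 8, 10]  # hypothetical even-sized sample

mid = statistics.median(sample)        # average of the two middle values
low = statistics.median_low(sample)    # lower of the two middle values
high = statistics.median_high(sample)  # upper of the two middle values

print(mid, low, high)  # 6.5 5 8
print(mid in sample)   # False
```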

Using Median When Mean is More Appropriate


The median is ideal for skewed data and outlier resistance, but the mean has superior mathematical properties for symmetric distributions.

The mean supports algebraic operations (additivity, scaling) that the median lacks.

Choose based on distribution shape and analysis goals, not by default.
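
The additivity gap can be seen on a tiny paired dataset. The numbers below are invented for illustration. The mean of the sums equals the sum of the means, while the median of the sums does not equal the sum of the medians.

```python
import statistics

x = [1, 2, 9]  # hypothetical paired observations
y = [9, 2, 1]
total = [a + b for a, b in zip(x, y)]  # [10, 4, 10]

# Additive for the mean: 8 on both sides.
print(statistics.mean(total), statistics.mean(x) + statistics.mean(y))

# Not additive for the median: 10 versus 4.
print(statistics.median(total), statistics.median(x) + statistics.median(y))
```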

Assuming Median is Always "Central"


The median marks the 50th percentile, which may not align with intuitive notions of "center."

For highly skewed distributions, the median can sit far from the mode, where probability is most concentrated.

The median represents probability division, not necessarily spatial centrality.

Confusing Median with Mode


Median = probability split point

Mode = peak probability

These are distinct concepts. The Exponential distribution has mode at $0$ but median at $\frac{\ln(2)}{\lambda}$.

Never use the terms interchangeably.

Treating Sample Median as Population Median


The median observed in finite data may not reflect the true population median, especially with small samples.

Sampling variability affects median estimation. Confidence intervals help quantify uncertainty.

Large samples and clear distribution structure increase reliability.
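
One common way to put an interval around a sample median is the percentile bootstrap. The sketch below is an assumption about how one might do it, not a method prescribed by this page. The function name, the default parameters, and the data are all hypothetical.

```python
import random
import statistics

def bootstrap_median_ci(data, n_resamples=2000, level=0.95, seed=0):
    """Percentile-bootstrap interval for the median (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(data)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    lower = medians[int(tail * n_resamples)]
    upper = medians[int((1 - tail) * n_resamples) - 1]
    return lower, upper

sample = [12, 15, 9, 22, 17, 14, 30, 11, 16, 13, 19, 45]  # made-up data
lower, upper = bootstrap_median_ci(sample)
print(statistics.median(sample), (lower, upper))
```

With only twelve observations the interval tends to be wide, which is the sampling variability described above.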

Ignoring Median for Ordinal or Ranked Data


When data is naturally ranked but not numerically meaningful (like satisfaction ratings), the median often provides more interpretable results than the mean.

Calculating a mean of {poor, fair, good, excellent} coded as $\{1,2,3,4\}$ assumes equal spacing that may not exist. The median respects the ordering without assuming equal intervals.
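
A minimal sketch of an ordinal median, assuming the four labels above in that order. The responses are invented. `statistics.median_low` is used so the result is always one of the observed categories, never a point halfway between two labels.

```python
import statistics

levels = ["poor", "fair", "good", "excellent"]  # ordered categories
rank = {name: i for i, name in enumerate(levels)}

# Hypothetical survey responses.
responses = ["good", "poor", "excellent", "good", "fair", "good", "poor"]

# Ranks encode order only; no spacing between categories is assumed.
codes = [rank[r] for r in responses]
median_label = levels[statistics.median_low(codes)]

print(median_label)  # good
```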

Related Concepts


The median connects to numerous other probability and statistics concepts.

Other Measures of Central Tendency


Mean (Expected Value): The probability-weighted average of all values. Balances the distribution.

Mode: The value with maximum probability or density. Identifies the peak.

These three measures work together to characterize where distributions center and how they're shaped.

Measures of Dispersion


Variance: Quantifies spread around the mean. High variance means values scatter widely.

Interquartile Range (IQR): Measures spread using quartiles: $Q_3 - Q_1$. Robust to outliers like the median.

The median reveals the center while dispersion measures reveal spread. Both are needed for complete distribution description.
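
To illustrate the robustness of the IQR, the sketch below replaces one value in a made-up sample with an extreme one and compares the IQR with the variance. It relies on `statistics.quantiles` with its default method. Other quartile conventions exist and can give slightly different cut points.

```python
import statistics

def iqr(data):
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _q2, q3 = statistics.quantiles(data, n=4)
    return q3 - q1

clean = [2, 4, 4, 5, 7, 8, 9, 11, 12]           # hypothetical data
with_outlier = [2, 4, 4, 5, 7, 8, 9, 11, 1200]  # largest value replaced

print(iqr(clean), iqr(with_outlier))  # 6.0 6.0
print(statistics.variance(clean), statistics.variance(with_outlier))
```

The IQR is unchanged because the replaced value lies beyond the third quartile, while the variance grows by several orders of magnitude.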

Percentiles and Quantiles


The median is the 50th percentile or 0.5-quantile. It generalizes to other probability thresholds:

Quartiles: $Q_1$ (25th), $Q_2$ (median), $Q_3$ (75th)

Deciles: 10th, 20th, ..., 90th percentiles

Any quantile $q$: solves $F(x) = q$

Percentiles divide probability by cumulative area; the median is the special case at 0.5.

Cumulative Distribution Function


CDF: The median solves $F(m) = 0.5$.

The inverse CDF $F^{-1}$ maps probabilities to values: $m = F^{-1}(0.5)$.

Understanding the CDF is essential for finding medians.
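
As a worked example of inverting the CDF, take the Exponential distribution with rate $\lambda$, whose CDF is $1 - e^{-\lambda x}$ for $x \geq 0$:

```latex
\begin{aligned}
F(x) &= 1 - e^{-\lambda x}, \quad x \geq 0 \\
F(m) = 0.5 \;&\Longrightarrow\; e^{-\lambda m} = 0.5 \\
m = F^{-1}(0.5) &= -\frac{\ln(0.5)}{\lambda} = \frac{\ln(2)}{\lambda}
\end{aligned}
```

This reproduces the Exponential median $\frac{\ln(2)}{\lambda}$ quoted elsewhere on this page.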

Probability Functions


Probability Mass Function (PMF): For discrete distributions, accumulating the PMF gives the CDF needed to find the median.

Probability Density Function (PDF): For continuous distributions, integrating the PDF from the lower end of the support until the accumulated area reaches 0.5 locates the median.

The median depends on cumulative structure, not pointwise probabilities or densities directly.

Skewness


Skewness measures asymmetry. For unimodal distributions, the relative positions of median, mean, and mode typically indicate the direction of skew:

Right skew: mode < median < mean

Left skew: mean < median < mode

Symmetric: mode = median = mean
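
The right-skew ordering can be checked on a small made-up sample with a long upper tail. This is only an illustration of the typical pattern. Real data can break the ordering.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, a few are large.
sample = [1, 2, 2, 2, 3, 3, 4, 5, 9, 20]

mode = statistics.mode(sample)
median = statistics.median(sample)
mean = statistics.mean(sample)

print(mode, median, mean)  # 2 3.0 5.1
```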

Specific Distribution Families


Discrete Distributions: Binomial, Geometric, Poisson, and others each have characteristic median behaviors.

Continuous Distributions: Normal, Exponential, Beta, and others show diverse median patterns.

Understanding distribution-specific medians helps identify which model fits observed data.

Robustness


The median is a robust statistic—resistant to outliers and extreme values.

Median Absolute Deviation (MAD): A robust measure of spread based on the median: $\text{MAD} = \text{median}(|X - \text{median}(X)|)$.

Robust methods use the median as a foundational concept for estimation in contaminated data.
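
The MAD formula translates directly into code. The sketch below uses a hypothetical helper `mad` and invented data, and contrasts the MAD with the standard deviation when one value is contaminated.

```python
import statistics

def mad(data):
    # Median of the absolute deviations from the sample median.
    center = statistics.median(data)
    return statistics.median(abs(x - center) for x in data)

clean = [10, 12, 13, 13, 14, 15, 16]          # hypothetical measurements
contaminated = [10, 12, 13, 13, 14, 15, 160]  # one corrupted reading

print(mad(clean), mad(contaminated))  # 1 1
print(statistics.stdev(clean), statistics.stdev(contaminated))
```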

Order Statistics


In a sample of size $n$, the median is the middle order statistic.

For odd $n$: median is the $\frac{n+1}{2}$-th smallest value

For even $n$: median is the average of the $\frac{n}{2}$-th and $\left(\frac{n}{2}+1\right)$-th smallest values

Order statistics formalize ranking and selection procedures.
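
The odd and even rules above can be sketched as follows. The function name is illustrative. Note the shift from the 1-indexed positions in the text to Python's 0-indexed lists.

```python
def median_by_order_statistics(sample):
    """Sample median computed from the sorted values (order statistics)."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the (n+1)/2-th smallest value is ordered[mid] (0-indexed).
        return ordered[mid]
    # Even n: average the n/2-th and (n/2 + 1)-th smallest values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_by_order_statistics([7, 1, 4]))     # 4
print(median_by_order_statistics([7, 1, 4, 2]))  # 3.0
```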

Optimization


The median minimizes absolute deviation: $E[|X - c|]$ is minimized over $c$ when $c$ is the median $m$.

The mean minimizes squared deviation: $E[(X - c)^2]$ is minimized over $c$ when $c$ is the mean $\mu$.

These optimization properties distinguish the two measures fundamentally.
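
A brute-force check of both minimization properties on a small invented sample. The candidate grid and the function names are assumptions made for this sketch. The search scans candidate centers in steps of 0.1.

```python
import statistics

data = [1, 2, 3, 4, 100]  # hypothetical sample with one extreme value

def total_abs_dev(c):
    # Sum of absolute deviations from the candidate center c.
    return sum(abs(x - c) for x in data)

def total_sq_dev(c):
    # Sum of squared deviations from the candidate center c.
    return sum((x - c) ** 2 for x in data)

candidates = [i / 10 for i in range(0, 1001)]  # 0.0, 0.1, ..., 100.0
best_abs = min(candidates, key=total_abs_dev)
best_sq = min(candidates, key=total_sq_dev)

print(best_abs, statistics.median(data))  # 3.0 3
print(best_sq, statistics.mean(data))     # 22.0 22
```

The absolute-deviation minimizer lands on the median and ignores how far away the extreme value is, while the squared-deviation minimizer is pulled to the mean.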