

Cumulative Distribution Function (CDF)






Accumulating Probability


When working with random variables, we are often not interested in the probability of an exact value, but in how much probability lies up to a certain point. Questions like “How likely is the value to be below this threshold?” or “What fraction of outcomes fall to the left of this number?” appear naturally.

The cumulative distribution function captures this idea directly. Instead of focusing on individual outcomes, it tracks how probability accumulates as values increase along the number line. This single object provides a unified way to describe distributions, whether they are discrete, continuous, or a mixture of both.

The sections that follow explain how this accumulation works, how it is defined formally, and why the cumulative distribution function plays a central role in probability theory.



From Events to Random Variables


Probability is first defined for events, but many questions involve numerical values rather than yes-or-no outcomes. A random variable connects these two views by assigning a number to each outcome of an experiment.

Once a random variable is defined, statements about its value naturally form events. Expressions like $X \le x$, $X > x$, or $a < X \le b$ describe collections of outcomes and therefore have probabilities attached to them.

The cumulative distribution function is built exactly on this connection. It takes events of the form $X \le x$ and assigns to each value $x$ the probability of that event, linking random variables back to the event-based foundation of probability.

CDF Defined (Core Meaning)


The cumulative distribution function describes how probability is distributed along the values of a random variable. For each possible value, it tells us how much probability has accumulated up to that point.

Instead of asking whether the random variable takes one specific value, the CDF answers a broader question: how likely it is that the value does not exceed a given level. As the value increases, the accumulated probability can only stay the same or grow.

This perspective is what makes the CDF so powerful. It focuses on accumulation rather than individual outcomes, providing a single, consistent way to describe the behavior of a random variable across its entire range.

Mathematical Definition


Formally, the cumulative distribution function of a random variable $X$ is defined by

$$F_X(x) = P(X \le x)$$

For each real number $x$, the CDF assigns the probability that the random variable takes a value less than or equal to $x$. The function $F_X$ maps numbers on the real line to values between 0 and 1.

The choice of “less than or equal to” is not arbitrary. It ensures that the CDF behaves consistently across discrete, continuous, and mixed random variables, allowing a single definition to cover all cases.
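To make the role of "less than or equal to" concrete, here is a minimal sketch that assumes a fair six-sided die (an illustrative choice, not taken from the text above). The helper names `prob_at_most` and `prob_below` are hypothetical. The two functions differ exactly by the probability sitting at the point itself.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, each face with probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def prob_at_most(x):
    """F_X(x) = P(X <= x): includes any mass located exactly at x."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

def prob_below(x):
    """P(X < x): excludes any mass located exactly at x."""
    return sum((p for v, p in pmf.items() if v < x), Fraction(0))

print(prob_at_most(3))                  # 1/2
print(prob_below(3))                    # 1/3
print(prob_at_most(3) - prob_below(3))  # 1/6, the mass at x = 3
```

For a continuous random variable the two quantities coincide, which is why the distinction only becomes visible when a point carries positive probability.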

Key Properties of the CDF


The cumulative distribution function obeys several fundamental properties that follow directly from the probability axioms.

  • Values between 0 and 1
    For every $x$, the CDF satisfies $0 \le F_X(x) \le 1$.

  • Non-decreasing
    As $x$ increases, $F_X(x)$ can only stay the same or increase. Accumulated probability never decreases.

  • Right-continuous
    Approaching a point from the right, the CDF converges to its value at that point: $\lim_{t \to x^+} F_X(t) = F_X(x)$. The value at a point therefore includes the probability at that point, which ensures consistent behavior at the jumps of discrete distributions.

  • Limits at infinity
    As $x \to -\infty$, $F_X(x) \to 0$.
    As $x \to +\infty$, $F_X(x) \to 1$.

These properties are what distinguish a valid CDF from an arbitrary function and guarantee that it represents a genuine probability distribution.
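The properties listed above can be spot-checked numerically. The sketch below is only a heuristic: it samples a candidate function on a finite grid, so it can reject obvious non-CDFs but cannot prove validity, and it does not test right-continuity. The function name `looks_like_cdf`, the grid bounds, and the tolerances are all assumptions chosen for illustration.

```python
import math

def looks_like_cdf(F, lo=-50.0, hi=50.0, steps=2001, tol=1e-9):
    """Spot-check range, monotonicity, and tail behavior of F on a grid."""
    xs = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    values = [F(x) for x in xs]
    # Every sampled value must lie in [0, 1].
    in_range = all(-tol <= v <= 1.0 + tol for v in values)
    # Consecutive samples must never decrease.
    non_decreasing = all(b >= a - tol for a, b in zip(values, values[1:]))
    # Far left should be near 0, far right near 1.
    tails_ok = values[0] <= 1e-6 and values[-1] >= 1.0 - 1e-6
    return in_range and non_decreasing and tails_ok

def logistic_cdf(x):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))

print(looks_like_cdf(logistic_cdf))  # True
print(looks_like_cdf(math.sin))      # False: negative values, not monotone
```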

Discrete Random Variables


For a discrete random variable, probability is concentrated at specific values. The CDF reflects this by increasing only at those values and remaining flat everywhere else.

In this case, the CDF is a step function. Each jump corresponds to a value the random variable can take, and the size of the jump equals the probability assigned to that value.

This makes the connection to the probability mass function clear: the PMF determines the jump sizes, while the CDF accumulates those jumps as the value increases. Reading the CDF from left to right shows how probability builds up one possible value at a time.
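The relationship between jumps and the PMF can be sketched in a few lines. The example assumes the number of heads in two fair coin flips (an illustrative distribution, not one from the text). The names `values`, `heights`, and `cdf` are hypothetical.

```python
from bisect import bisect_right
from fractions import Fraction
from itertools import accumulate

# Assumed example: number of heads in two fair coin flips.
values = [0, 1, 2]
pmf = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

# Running totals of the PMF give the CDF height after each jump.
heights = list(accumulate(pmf))

def cdf(x):
    """Step-function CDF: height after the last jump at or before x."""
    k = bisect_right(values, x)  # number of support points <= x
    return heights[k - 1] if k > 0 else Fraction(0)

# Jump at each support point: F(v) minus the height just to the left of v.
jumps = [cdf(v) - cdf(v - Fraction(1, 2)) for v in values]

print(cdf(1.5))      # 3/4 (flat between the jumps at 1 and 2)
print(jumps == pmf)  # True: the jump sizes recover the PMF
```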

Continuous Random Variables


For a continuous random variable, probability is spread smoothly over intervals rather than concentrated at individual points. As a result, the CDF increases continuously instead of jumping.

In this setting, the CDF describes how probability accumulates as the value grows along the real line. The slope of this accumulation is captured by the probability density function, which indicates how rapidly probability is building at each point.

A key consequence is that a continuous random variable assigns zero probability to any single exact value. Probabilities are obtained only by looking at intervals, and the CDF provides a direct way to compute them.
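Both points, the density as the slope of the CDF and the vanishing probability of a single value, can be checked numerically. The sketch assumes an exponential distribution with rate 1, chosen only because its CDF and PDF have simple closed forms.

```python
import math

def cdf(x):
    """CDF of an exponential distribution with rate 1 (assumed example)."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def pdf(x):
    """Matching density, the derivative of the CDF for x > 0."""
    return math.exp(-x) if x >= 0 else 0.0

x, h = 1.0, 1e-5

# A central difference of the CDF approximates the density at x.
slope = (cdf(x + h) - cdf(x - h)) / (2 * h)
print(abs(slope - pdf(x)) < 1e-6)  # True

# The probability of a shrinking interval around x tends to zero.
for width in (1e-1, 1e-3, 1e-6):
    print(width, cdf(x + width) - cdf(x - width))
```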

Mixed Distributions


Some random variables do not fit neatly into purely discrete or purely continuous categories. They may assign positive probability to certain specific values while also spreading probability continuously over intervals.

The CDF handles this naturally. It combines flat regions, smooth increases, and sudden jumps within a single function. Jumps represent discrete probability masses, while smooth segments represent continuously distributed probability.

This is one reason the CDF is more general than the PMF or PDF. Even when neither of those fully describes a distribution on its own, the CDF still provides a complete and consistent representation.
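As a hedged illustration, consider a made-up mixed variable: it equals 0 with probability 0.3 and otherwise follows an exponential distribution with rate 1. The constant `P_ZERO` and the function `mixed_cdf` are assumptions introduced for this sketch.

```python
import math

P_ZERO = 0.3  # assumed point mass at 0 (illustrative value)

def mixed_cdf(x):
    """CDF with a jump of size P_ZERO at 0 and a smooth exponential part after it."""
    if x < 0:
        return 0.0  # flat region: no probability below 0
    return P_ZERO + (1.0 - P_ZERO) * (1.0 - math.exp(-x))

# The jump at 0 equals the discrete mass placed there.
jump_at_zero = mixed_cdf(0.0) - mixed_cdf(-1e-12)
print(jump_at_zero)

# Away from 0 the CDF rises smoothly, so tiny intervals carry tiny probability.
print(mixed_cdf(1.0 + 1e-9) - mixed_cdf(1.0))
```

Neither a PMF nor a PDF alone describes this variable, yet the single function `mixed_cdf` does.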

Using the CDF to Compute Probabilities


One of the main advantages of the cumulative distribution function is that it allows probabilities to be computed directly from differences of values.

For any two numbers $a < b$, the probability that the random variable lies between them is obtained by subtracting accumulated probabilities:

$$P(a < X \le b) = F_X(b) - F_X(a)$$

This works uniformly for discrete, continuous, and mixed random variables. One-sided probabilities are handled just as easily by reading the CDF at a single point: $P(X \le b) = F_X(b)$ and $P(X > a) = 1 - F_X(a)$.

Because of this, the CDF often simplifies probability calculations by replacing sums or integrals with simple evaluations of a single function.
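A short sketch of this subtraction rule, assuming a standard normal variable as the example. The normal CDF is written with the standard library's `math.erf`. The helper names `normal_cdf` and `prob_between` are hypothetical.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_between(F, a, b):
    """P(a < X <= b) = F(b) - F(a), assuming a < b."""
    return F(b) - F(a)

# Two-sided: probability of landing within one standard deviation of the mean.
print(round(prob_between(normal_cdf, -1.0, 1.0), 4))  # 0.6827

# One-sided: upper-tail probability as a complement of the CDF.
print(round(1.0 - normal_cdf(1.96), 4))  # 0.025
```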

CDF vs PMF vs PDF


The CDF, PMF, and PDF describe probability distributions from different perspectives, but they are not interchangeable.

The CDF tracks accumulated probability. It tells how much probability lies at or below a given value and always exists for any random variable.

The PMF applies only to discrete random variables. It assigns probabilities to individual values, and the CDF is obtained by summing these probabilities up to a point.

The PDF applies only to continuous random variables. It describes how densely probability is spread, and the CDF is obtained by integrating this density from $-\infty$ up to the point of interest.

Because the CDF works in all cases, it serves as the most general representation of a probability distribution, with the PMF and PDF appearing as special cases derived from it.
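The relationships described in this section can be summarized in formulas. The symbols $p_X$ for the PMF and $f_X$ for the PDF are notation assumed here; they do not appear in the text above.

```latex
\begin{align*}
F_X(x) &= \sum_{t \le x} p_X(t)
  && \text{(discrete: sum the PMF up to } x\text{)} \\
F_X(x) &= \int_{-\infty}^{x} f_X(t)\,dt
  && \text{(continuous: integrate the PDF up to } x\text{)} \\
p_X(x) &= F_X(x) - \lim_{t \to x^-} F_X(t)
  && \text{(PMF recovered as the jump at } x\text{)} \\
f_X(x) &= \frac{d}{dx} F_X(x)
  && \text{(PDF recovered wherever the derivative exists)}
\end{align*}
```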

Visual Interpretation


The cumulative distribution function can be understood visually as probability accumulating along the number line.

Starting from the far left, the CDF begins at zero and increases as more possible values are included. In discrete cases, this accumulation appears as upward jumps. In continuous cases, it appears as a smooth rising curve. Mixed distributions show both behaviors together.

Reading the CDF from left to right shows how probability builds up. At any point, the height of the curve represents how much of the total probability lies at or below that value.

Common Mistakes


The cumulative distribution function is often misunderstood because it looks simple but encodes a lot of structure.

A common mistake is confusing the CDF with a density or mass function. The CDF does not describe how much probability sits *at* a point, but how much has accumulated *up to* that point.

Another frequent error is forgetting that the CDF is cumulative. Interpreting its value as a probability of an exact outcome leads to incorrect conclusions, especially in continuous cases.

In discrete distributions, jumps in the CDF are sometimes misread as arbitrary features. Each jump has a precise meaning: its size equals the probability assigned to that value.

Finally, it is easy to forget that every valid CDF must satisfy basic properties such as monotonicity and proper limits. Violating these properties means the function cannot represent a probability distribution.

Why the CDF Matters


The cumulative distribution function fully characterizes the distribution of a random variable. Knowing the CDF is enough to recover all probability statements about the variable.

It provides a single framework that works for discrete, continuous, and mixed distributions. This makes it a central object in probability theory and a natural bridge between different types of models.

Many important concepts rely directly on the CDF, including quantiles, medians, percentiles, and probability intervals. In both theory and applications, the CDF is often the most convenient way to reason about probabilities.
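Quantiles come from inverting the CDF. One common convention defines the quantile at level $p$ as the smallest $x$ with $F_X(x) \ge p$. The sketch below approximates that value by bisection. It assumes the answer lies inside a caller-supplied bracket, and the function name `quantile` and its default bounds are illustrative choices.

```python
import math

def quantile(F, p, lo=-1e6, hi=1e6, tol=1e-9):
    """Approximate inf{x : F(x) >= p} by bisection, assuming it lies in [lo, hi]."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if F(mid) >= p:
            hi = mid  # mid already has enough accumulated probability
        else:
            lo = mid  # not enough probability yet, move right
    return hi

def exponential_cdf(x):
    """CDF of an exponential distribution with rate 1 (assumed example)."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

median = quantile(exponential_cdf, 0.5)
print(abs(median - math.log(2.0)) < 1e-6)  # True: the median of Exp(1) is ln 2
```

Because the search only relies on the CDF being non-decreasing, the same routine also works for step-function CDFs, where it returns a point close to the jump at which the level $p$ is first reached.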

Connections to Other Probability Concepts


The CDF ties together several core ideas in probability.

  • Events appear as statements of the form $X \le x$.
  • Random variables provide the numerical structure the CDF describes.
  • PMF and PDF are specific ways probability is distributed, both derived from the CDF.
  • Probability axioms guarantee the basic properties of the CDF.
  • Quantiles and percentiles are defined by inverting the CDF.
  • Expectation and variance can be expressed using the CDF.

Through these connections, the CDF serves as a unifying object that links foundational probability concepts with practical calculations.
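As one example of expressing expectation through the CDF: for a nonnegative random variable, the mean equals the area above the CDF, $E[X] = \int_0^\infty (1 - F_X(x))\,dx$. The sketch below checks this numerically with a midpoint rule. It assumes an exponential distribution with rate 2, and the truncation point `upper` and step count are arbitrary illustrative choices.

```python
import math

def mean_from_cdf(F, upper=50.0, steps=200_000):
    """Midpoint-rule estimate of the integral of 1 - F(x) over [0, upper].

    Valid as an estimate of E[X] only for nonnegative X whose tail
    beyond `upper` is negligible.
    """
    dx = upper / steps
    return sum(1.0 - F((i + 0.5) * dx) for i in range(steps)) * dx

RATE = 2.0  # assumed rate parameter; the true mean is 1 / RATE

def exponential_cdf(x):
    """CDF of an exponential distribution with rate RATE."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

estimate = mean_from_cdf(exponential_cdf)
print(abs(estimate - 1.0 / RATE) < 1e-6)  # True
```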