
Probability Distributions




Modeling Random Phenomena


Probability is about uncertainty, but uncertainty alone is not enough to work with mathematically. To reason, compare, and make predictions, we need a way to describe how uncertainty is structured. Probability distributions provide that structure.

Distributions sit at the core of probability theory and statistics. They organize random behavior into well-defined mathematical forms, allowing different random phenomena to be compared, analyzed, and modeled in a unified way. Whether outcomes are counted or measured, rare or frequent, symmetric or skewed, distributions give probability its shape.

This page serves as a conceptual map of probability distributions. It explains how distributions fit into probability theory, how they relate to random variables, and how the major categories of distributions are organized. Specific distribution families and formulas are explored on their own pages; here, the focus is on the ideas that tie them all together.





What a Probability Distribution Is


A probability distribution is a mathematical model that assigns probabilities to the possible values of a random variable, thereby encoding the structure of uncertainty associated with a random process. It provides the complete probabilistic characterization of a random phenomenon, specifying not just what outcomes can occur, but how probability is allocated across them.

Formally, given a probability space (\Omega, \mathcal{F}, \mathbb{P}) and a random variable X : \Omega \to \mathbb{R}, the probability distribution of X is the probability measure induced by X on the measurable subsets of \mathbb{R}, defined by

\mathbb{P}_X(A) = \mathbb{P}(X \in A)


for all measurable sets A. This induced measure encapsulates all probabilistic information about X and serves as the primary object of analysis in statistical inference and probabilistic modeling.
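The induced measure can be made concrete on a finite probability space. The sketch below is an illustrative example (a fair six-sided die, not from the source): it computes \mathbb{P}_X(A) by direct summation over the outcomes that X maps into A.

```python
from fractions import Fraction

# Finite probability space: a fair six-sided die (illustrative example).
# Omega = {1, ..., 6}; P assigns probability 1/6 to each outcome.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}

# Random variable X: indicator of "the outcome is odd".
def X(w):
    return w % 2

# Induced distribution: P_X(A) = P({w in Omega : X(w) in A}).
def P_X(A):
    return sum(P[w] for w in omega if X(w) in A)

assert P_X({1}) == Fraction(1, 2)   # P(X = 1): three odd outcomes out of six
assert P_X({0, 1}) == 1             # total probability is 1
```

Note that two different dice (two different sample spaces) would induce the same \mathbb{P}_X here, which is exactly why the distribution, not the underlying space, is the object of analysis.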

The relationship between random variable and distribution is foundational. A random variable is a function that maps outcomes from the sample space to real numbers—it translates abstract events into measurable quantities. The distribution captures how probability flows through this mapping. While the random variable provides the mechanism for quantification, the distribution describes the probabilistic behavior of those quantities. Two random variables defined on different probability spaces may share the same distribution if they exhibit identical probabilistic structure, making the distribution the essential mathematical object for analysis.

The distribution functions as a model that captures the regularities of randomness in quantitative form. It abstracts from individual realizations and describes the law governing the occurrence of outcomes. Distributions provide the foundation for defining expectation, variance, moments, and limit theorems, forming the central organizing concept of probability theory.


Useful Notation


    Throughout this page, we use standard probability notation:

  • \Omega — the sample space, the set of all possible outcomes of a random experiment
  • \mathcal{F} — a σ-algebra of events, the collection of measurable subsets of \Omega
  • \mathbb{P} — a probability measure on (\Omega, \mathcal{F}), assigning probabilities to events
  • X : \Omega \to \mathbb{R} — a random variable, a measurable function mapping outcomes to real numbers
  • \mathbb{P}_X — the probability distribution of X, the induced measure on \mathbb{R}
  • A — a measurable set, typically a subset of \mathbb{R}

  • The triple (\Omega, \mathcal{F}, \mathbb{P}) is called a probability space and provides the foundational structure for all probability theory.

    See All Probability Symbols and Notations


What All Distributions Have in Common


Despite their diversity, all probability distributions share fundamental structural properties.

Every distribution assigns probabilities through a probability function—either a probability mass function (PMF) for discrete variables or a probability density function (PDF) for continuous variables. These functions encode how probability is distributed across the support.

Every distribution has a cumulative distribution function (CDF), defined as F_X(x) = \mathbb{P}(X \leq x). The CDF completely characterizes the distribution and exists for every random variable, whether discrete or continuous.

The support is the set of values where the distribution assigns positive probability or density. It defines what outcomes are possible under the model.

Distributions are determined by parameters—numerical constants that specify a particular member of a distribution family. Parameters control location (such as \mu), scale (such as \sigma), or shape.

All distributions have an expected value (mean), provided the integral or sum \mathbb{E}[X] exists and is finite.

Similarly, variance \text{Var}(X) = \mathbb{E}[(X - \mu)^2] measures dispersion when it exists.

Not all distributions have finite moments—the Cauchy distribution, for instance, has no finite mean.

Distributions can be characterized by moment generating functions or characteristic functions, which encode all moments when they exist and provide tools for deriving distributional properties.

Fundamental Properties and Components of Probability Distributions

| Component | Description | Expression | Scope | Role |
|---|---|---|---|---|
| Probability Function (PMF/PDF) | Assigns probabilities across the support via mass (discrete) or density (continuous) functions. | f(x) | Discrete (PMF) or Continuous (PDF) | Encodes how probability is distributed across the support. |
| Cumulative Distribution Function (CDF) | The probability that a random variable is less than or equal to a specific value. | F_X(x) = P(X \leq x) | Discrete and Continuous | Completely characterizes the distribution. |
| Support | The set of values where the distribution assigns positive probability or density. | \{x \mid f(x) > 0\} | Discrete and Continuous | Defines possible outcomes under the model. |
| Parameters | Numerical constants that specify a particular member of a distribution family. | \mu, \sigma | Discrete and Continuous | Control location, scale, or shape of the distribution. |
| Expected Value (Mean) | The long-term average value of the random variable. | E[X] | Discrete and Continuous | Determines the center of the distribution if the sum/integral is finite. |
| Variance | A measure of the dispersion or spread of the distribution. | Var(X) = E[(X - \mu)^2] | Discrete and Continuous | Measures dispersion when it exists. |
| Moment Generating Function | Function used to encode all moments and derive distributional properties. | M_X(t) = E[e^{tX}] | Discrete and Continuous | Provides tools for deriving properties and encoding moments. |
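The mean and variance rows of the table can be computed directly from a PMF by their defining sums. A minimal sketch, again using a fair die as the illustrative example:

```python
# Mean and variance computed from a PMF by their defining sums (fair die).
pmf = {x: 1 / 6 for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())               # E[X]
var = sum((x - mean) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]

assert abs(mean - 3.5) < 1e-9       # E[X] = 7/2
assert abs(var - 35 / 12) < 1e-9    # Var(X) = 35/12 ≈ 2.917
```

For a continuous distribution the sums become integrals over the support, but the structure of the computation is the same.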

2 Basic Types of Distributions

Probability distributions are mathematical models that quantify how likely different outcomes are when dealing with uncertainty and randomness. These powerful tools allow us to systematically describe and predict the behavior of random phenomena across countless real-world scenarios. They fall into two fundamental categories: discrete distributions deal with countable outcomes (like number of successes, coin flips, or defective items), while continuous distributions handle measurable quantities that can take any value within a range (like height, time, or temperature). The key difference lies in whether you can list all possible outcomes (discrete) or whether outcomes form an unbroken continuum (continuous).

[Diagram: the two basic types. Discrete distributions have countable outcomes (coin flips, dice rolls, number of defects); continuous distributions take measurable values over an unbroken range (height, temperature, time, weight). Discrete = countable | Continuous = unbroken range.]

Within each of these two fundamental categories, distributions are further divided into several groups based on the specific scenarios they model. Discrete distributions branch into various types designed for different counting situations—from equal-probability outcomes to success-trial patterns to rare event modeling. Continuous distributions similarly divide into specialized forms that describe different real-world phenomena, ranging from uniform spreads across intervals to bell-shaped patterns to asymmetric waiting-time behaviors.

Probability Distributions at a Glance

Discrete Distributions:
  • Discrete Uniform: equal probability for finite outcomes
  • Binomial: successes in n trials with probability p each
  • Geometric: trials until first success (probability p)
  • Negative Binomial: trials until the r-th success (generalization of the geometric)
  • Hypergeometric: sampling without replacement from a finite population
  • Poisson: rare events over a time interval (rate λ)

Continuous Distributions:
  • Uniform: equal likelihood over an interval [a, b]
  • Normal: bell curve with mean μ and variance σ²
  • Exponential: waiting time between events (rate λ)
  • Gamma: waiting time until the k-th event (shape, rate)
  • Beta: random proportions on [0, 1] (shape parameters α, β)
  • Chi-Square: sum of squared normal variables (degrees of freedom ν)

Understanding these distributions is essential for statistical modeling, hypothesis testing, and making predictions about uncertain events. Each distribution has specific scenarios where it naturally applies; choosing the right one depends on the nature of your data and the underlying process generating it. Master these fundamentals, and you'll have the foundation for advanced statistical analysis and data science applications.

Discrete Distributions

Reminder: A Random Variable is a function that maps each fundamental outcome of a probabilistic experiment to a real number.

A Discrete Random Variable is a random variable whose set of attainable values is either a finite collection or a countably infinite list.
Finally, the term discrete distribution refers to the probability distribution that assigns probabilities to each possible value of a discrete random variable.
There are six classic discrete distributions—uniform, binomial, geometric, Poisson, negative binomial, and hypergeometric—each distinguished by the structure of trials or sampling it models (e.g., a fixed number of trials vs. waiting time, constant-rate events, or draws with or without replacement). They differ in their support and key parameters, such as the number of trials n, success probability p, event rate \lambda, target successes r, or population size N.


Common Discrete Distributions

| Type | Description | Examples |
|---|---|---|
| Discrete Uniform | Every outcome in a finite set has exactly the same probability: complete symmetry across the support. | Roll of a fair six-sided die; drawing one card at random from a deck |
| Binomial | Counts the number of successes in a fixed number n of independent Bernoulli(p) trials; probability varies with the count of successes. | Number of heads in 10 coin flips; number of defective items in 20 manufactured parts |
| Geometric | Measures how many trials are needed until the first success in independent Bernoulli(p) trials; has the memoryless property. | Tossing a coin repeatedly until the first head appears; number of attempts before a free throw is made |
| Negative Binomial | Generalizes the geometric to count trials until the r-th success in Bernoulli(p) trials; allows modeling multiple required successes. | Number of coin tosses until 5 heads occur; calls made until 3 sales are closed |
| Hypergeometric | Counts successes in a sample drawn without replacement from a finite population; trials are dependent and probabilities change with each draw. | Drawing 5 cards from a 52-card deck and counting aces; selecting defective items from a batch without replacement |
| Poisson | Models the count of rare, independent events occurring in a fixed interval at average rate \lambda; arises as a limit of the binomial with small p. | Number of emails received per hour; calls arriving at a call center per minute |

Understanding which distribution fits a given scenario is key to solving probability problems correctly. Each type has its own signature—whether you're counting successes in a fixed number of trials, waiting for events to happen, or sampling from a limited population. Just as importantly, each distribution comes with ready-made formulas for mean and variance that would be extremely difficult (or impossible) to derive from scratch every time you need them. Recognizing these patterns lets you pick the right tool and set up your calculations with confidence.
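As a quick illustration of those ready-made formulas, the sketch below builds the Binomial(n, p) PMF with standard-library tools and checks that the probabilities sum to 1 and that the mean matches np (the values n = 10, p = 0.3 are arbitrary illustrative choices):

```python
from math import comb

# Binomial(n, p) PMF: P(X = k) = C(n, k) * p^k * (1 - p)^(n - k).
# n and p are arbitrary illustrative values.
n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

assert abs(sum(pmf) - 1) < 1e-12      # probabilities sum to 1 over the support
mean = sum(k * pk for k, pk in enumerate(pmf))
assert abs(mean - n * p) < 1e-12      # ready-made formula: E[X] = np
```

Deriving E[X] = np symbolically requires manipulating the binomial sum; the closed-form result is what lets you skip that work in practice.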
Learn More

Continuous Distributions

Reminder: A Random Variable is a function that maps each fundamental outcome of a probabilistic experiment to a real number.

A Continuous Random Variable is a random variable that can take on any value within an interval or collection of intervals on the real line—its possible values form an uncountable set. Instead of assigning probabilities to individual points, a continuous distribution uses a probability density function (PDF) to describe the relative likelihood of values, with probabilities determined by integrating the density over intervals.
Common continuous distributions include the uniform, normal, exponential, gamma, beta, and chi-square distributions, among others—each characterized by different underlying phenomena they model, such as equal likelihood over an interval, bell-shaped symmetry around a mean, waiting times between events, or distributions arising from transformations of other random variables. They vary in their support (the range of possible values) and the parameters that control their shape and behavior.

Common Continuous Distributions

| Type | Description | Examples |
|---|---|---|
| Continuous Uniform | Every value in the interval [a, b] has equal probability density: complete symmetry across the support. | Random angle between 0° and 360°; arrival time uniformly distributed within an hour |
| Normal (Gaussian) | Bell-shaped distribution symmetric around mean \mu with spread controlled by variance \sigma^2; arises from the Central Limit Theorem. | Heights of adult humans; measurement errors in scientific instruments; test scores |
| Exponential | Models the waiting time between independent events occurring at constant rate \lambda; has the memoryless property. | Time between arrivals at a queue; lifespan of a radioactive particle; time until the next phone call |
| Gamma | Generalizes the exponential to model the waiting time until the k-th event at rate \lambda; uses shape parameter k and rate parameter \lambda. | Time until k customers arrive; total rainfall accumulation; time to complete multiple tasks |
| Beta | Models random proportions or probabilities on [0, 1] with shape parameters \alpha and \beta; a flexible family for bounded continuous variables. | Proportion of defective items in a batch; click-through rate; task completion percentage |
| Chi-Square | Distribution of the sum of \nu squared independent standard normal variables; used in hypothesis testing and confidence intervals. | Goodness-of-fit test statistic; sample variance of normal data; test of independence in contingency tables |
Identifying the appropriate continuous distribution for a problem is essential for accurate probabilistic modeling and analysis. The key lies in recognizing the underlying structure—whether you're dealing with measurements that cluster symmetrically around a center, modeling time until an event occurs, working with proportions bounded between zero and one, or analyzing variables constructed from other random quantities. Each distribution provides established formulas for expected values, variances, and other properties that capture its essential behavior, sparing you from complex integrations each time. Mastering these characteristic patterns enables you to select the right framework and approach your analysis with clarity.
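As one illustration of such an established formula, the exponential mean E[X] = 1/λ can be sanity-checked by simulation. The sketch below uses inverse-transform sampling (if U is Uniform(0, 1), then X = -ln(U)/λ is Exponential(λ)); the rate, seed, and sample size are arbitrary illustrative choices:

```python
import math
import random

# Inverse-transform sampling for Exponential(lam):
# if U ~ Uniform(0, 1), then X = -ln(U) / lam has CDF F(x) = 1 - exp(-lam * x).
lam = 2.0                 # arbitrary illustrative rate
rng = random.Random(42)   # fixed seed for reproducibility
samples = [-math.log(rng.random()) / lam for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
assert abs(sample_mean - 1 / lam) < 0.02   # established formula: E[X] = 1/lam
```

The same inverse-CDF idea works for any continuous distribution whose CDF can be inverted, which is one practical payoff of knowing the CDF in closed form.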
Learn More

Probability Function

In probability theory, every random process is ultimately described by a rule that assigns probabilities to possible outcomes. This rule is known as a probability function.

A probability function is the basic rule that assigns probabilities to the outcomes of a random experiment. It translates uncertainty into a precise mathematical structure by specifying how likely each possible value of a random variable is.

This concept exists independently of any named distribution. In simple or irregular situations, the probability function may not follow a standard form, but it still governs how probabilities are assigned. When the underlying rule has a clear, organized pattern, it becomes possible to describe it through familiar distribution families such as Binomial, Poisson, or Normal. In those cases, the probability function takes on specific forms: a probability mass function for discrete variables or a probability density function for continuous variables.

For discrete random variables, this rule appears as a probability mass function (PMF), which specifies the probability of each individual value. For continuous variables, the rule takes the form of a probability density function (PDF), which describes how probability is distributed across an interval. Although they behave differently, both PMFs and PDFs serve the same fundamental role: they encode the entire structure of the random phenomenon.
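Their shared role, accounting for total probability 1 across the support, can be checked numerically. In the sketch below, the Poisson rate and the integration grid are arbitrary illustrative choices; a truncated sum stands in for the infinite series and a Riemann sum for the integral:

```python
import math

# A PMF sums to 1 over its support: Poisson(lam), truncated far into the tail.
lam = 4.0   # arbitrary illustrative rate
poisson_pmf = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(100)]
assert abs(sum(poisson_pmf) - 1) < 1e-9

# A PDF integrates to 1: standard normal density, Riemann sum over [-8, 8].
def phi(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

dx = 0.001
integral = sum(phi(-8 + i * dx) * dx for i in range(16_000))
assert abs(integral - 1) < 1e-4
```

Note the asymmetry the code makes visible: each PMF value is itself a probability, while a PDF value is only a density and yields probabilities through integration.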
[Diagram: the probability function as the general concept, a mathematical rule that assigns probabilities, with two specific implementations. For discrete random variables, the PMF, P(X = x), gives the exact probability of each outcome (coin flips, dice rolls, binomial). For continuous random variables, the PDF, f(x), gives a probability density to be integrated over intervals (heights, temperatures, normal).]

Understanding probability functions is crucial because every distribution, whether discrete or continuous, is built directly on one. To explore how PMFs and PDFs work, how they differ, and why they matter, see the full explanation on the dedicated Probability Function page.
Learn More

Cumulative Distribution Function (CDF)

While the probability function (PMF or PDF) focuses on the likelihood of specific values or densities, the Cumulative Distribution Function (CDF) captures the accumulation of probability across the variable's range. The CDF, typically denoted F_X(x), defines the probability that a random variable X takes a value less than or equal to x:

F_X(x) = \mathbb{P}(X \leq x)


This function serves as a complete characterization of any random variable, whether discrete or continuous. In the discrete case, the CDF forms a step function that increases in jumps at each outcome. In the continuous case, it appears as a smooth, continuous curve that represents the integral of the density function from -\infty to x.

The CDF is universally applicable and mathematically powerful. It allows us to easily calculate the probability of intervals—simply by taking the difference F(b) - F(a)—and is the basis for defining Quantiles and Percentiles. Because probabilities cannot be negative and total probability must sum to one, every CDF starts at 0 (as x \to -\infty) and monotonically increases to 1 (as x \to +\infty).
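For instance, the interval formula F(b) - F(a) applied to a standard normal CDF, which the Python standard library makes computable through the error function, recovers the familiar one-sigma rule:

```python
from math import erf, sqrt

# CDF of Normal(mu, sigma^2) expressed through the error function:
# F(x) = (1/2) * (1 + erf((x - mu) / (sigma * sqrt(2)))).
def normal_cdf(x, mu=0.0, sigma=1.0):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Interval probability P(a < X <= b) = F(b) - F(a):
# the one-sigma interval of a standard normal.
p = normal_cdf(1) - normal_cdf(-1)
assert abs(p - 0.6827) < 1e-3   # the familiar "68% within one sigma" rule
```

The same two-evaluation pattern, F(b) - F(a), computes any interval probability for any distribution whose CDF is available, with no fresh integration required.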

Why Distributions Matter


Probability distributions form the conceptual foundation of quantitative reasoning under uncertainty.

Distributions provide the mathematical framework for modeling random phenomena. By identifying the appropriate distribution, we translate real-world randomness into a precise mathematical structure amenable to rigorous analysis. The distribution captures the essential probabilistic behavior of a system while abstracting away irrelevant details.

In statistical inference, distributions enable us to draw conclusions from data. Sample statistics follow known distributions, allowing us to estimate population parameters, test hypotheses, and construct confidence intervals. The sampling distribution of estimators derives directly from the underlying probability distributions of the data.

For prediction, distributions quantify uncertainty about future outcomes. Rather than producing single-valued forecasts, distributional models provide probabilistic statements—the likelihood of various scenarios, credible intervals, and risk assessments. This probabilistic framework is essential for decision-making under uncertainty.

Distributions bridge probability theory and data science. They connect theoretical probability spaces to empirical observations, enabling the development of statistical methods, machine learning algorithms, and stochastic models. Without distributions, the mathematical treatment of randomness and data analysis as we know it would not exist.