Discrete distributions are probability models for random variables that can take on a countable set of values—typically integers or a finite set of outcomes. Unlike continuous distributions, which describe phenomena like heights or temperatures that can take any value within a range, discrete distributions characterize scenarios with distinct, separable outcomes: the number of successes in a series of trials, the count of events in a time interval, or selections from a finite population.
Understanding discrete distributions is fundamental to probability theory and problem-solving. Each distribution arises from a specific probabilistic mechanism—whether sampling with or without replacement, counting trials until an event occurs, or modeling rare occurrences. Recognizing these underlying structures allows you to match problems to their appropriate models.
The distinctions matter mathematically. The simplest case—the discrete uniform distribution—assigns equal probability to each outcome in a finite set, serving as the foundation for understanding more complex models. A binomial distribution assumes a fixed number of independent trials, while a geometric distribution counts trials until the first success—superficially similar setups that yield entirely different probability mass functions, moments, and analytical properties. The negative binomial distribution generalizes the geometric case by counting trials until a specified number of successes rather than just the first. The Poisson distribution, meanwhile, models the occurrence of rare events over a continuous interval—time, space, or volume—making it distinct from the trial-based counting distributions. Misidentifying the mechanism leads to incorrect calculations and invalid conclusions.
This page systematically presents six fundamental discrete distributions, detailing their support, parameters, probability functions, and key statistical properties. Mastering these models equips you to tackle a wide range of probabilistic questions with precision and confidence.
What Makes a Distribution Discrete
A distribution is discrete when the random variable can only take on countable values — typically integers or a finite set of distinct outcomes. Unlike measurements that flow continuously (like height or temperature), discrete random variables represent things you can count: the number of defective items in a batch, the number of customers arriving per hour, or the result of rolling a die.
The mathematical signature of a discrete distribution is the probability mass function (PMF), denoted P(X=k), which assigns a probability to each specific value k in the support. These probabilities must satisfy two conditions:
1. Non-negativity: P(X = k) ≥ 0 for all k
2. Normalization: ∑_{all k} P(X = k) = 1
The support of a discrete distribution — the set of values where P(X=k)>0 — can be finite (like rolling a die: 1,2,3,4,5,6) or countably infinite (like counting trials until success: 1,2,3,…). What matters is that you can list the outcomes, even if that list never ends.
This discreteness fundamentally shapes how we calculate probabilities: summing over points rather than integrating over intervals.
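As a minimal illustration, the Python sketch below checks both PMF conditions for a fair six-sided die; the pmf dictionary is an assumption chosen for the example.

```python
# A fair die: support {1, ..., 6}, equal mass on each point.
pmf = {k: 1/6 for k in range(1, 7)}

# 1. Non-negativity: every P(X = k) must be >= 0
assert all(p >= 0 for p in pmf.values())

# 2. Normalization: the masses must sum to 1 (allowing float rounding)
assert abs(sum(pmf.values()) - 1.0) < 1e-12

print("valid PMF over support:", sorted(pmf))
```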
Discrete vs Continuous Distributions
Probability distributions — whether discrete or continuous — share certain fundamental features: they have a support (the set of possible values), a method for assigning probabilities, and a cumulative distribution function. However, these features behave differently depending on whether the distribution is discrete or continuous, and it is precisely these differences in how the shared features are defined and calculated that create the fundamental distinction between the two types.
The most fundamental difference lies in the support structure, the set of values the random variable can take. Discrete distributions have countable values with gaps — 0, 1, 2, … or 1, 2, 3, 4, 5, 6. You can enumerate every possible outcome, even if the list is infinite. Continuous distributions, on the other hand, have uncountable support: every point in an interval (a, b) is reachable, with no gaps between values.
This structural difference directly affects how probability is assigned at specific points. For discrete distributions, P(X=k) can be positive — probabilities are assigned to exact values. For continuous distributions, P(X=x)=0 for all x. This happens because there are uncountably many points in any interval, so the probability must be spread infinitely thin across them. If any single point had positive probability, the total probability would exceed 1. Only intervals have positive probability in the continuous case.
The mathematical tools used to describe probabilities also differ. Discrete distributions use the probability mass function (PMF), denoted P(X=k), which directly gives the probability at each point. Continuous distributions use the probability density function (PDF), denoted f(x), where f(x)≥0 but importantly, f(x) itself is not a probability — it's a density that must be integrated over an interval to yield probability.
The cumulative distribution function (CDF) behaves differently in each case. For discrete distributions, the CDF is a step function with jumps at each value in the support, calculated as F(x) = P(X ≤ x) = ∑_{k ≤ x} P(X = k). For continuous distributions, the CDF is a smooth, continuous curve given by F(x) = ∫_{−∞}^{x} f(t) dt, with no jumps or discontinuities.
These differences manifest in how we calculate probabilities. Discrete distributions require summation: P(X ∈ A) = ∑_{k ∈ A} P(X = k), adding up the probabilities of individual points in the set A. Continuous distributions require integration: P(X ∈ A) = ∫_A f(x) dx, measuring the area under the density curve over the region A.
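A short sketch of this sum-versus-integral contrast using scipy.stats; the particular distributions chosen here (a fair die and a normal curve) are illustrative assumptions.

```python
from scipy.stats import randint, norm

# Discrete: P(2 <= X <= 4) for a fair die is a SUM of point masses.
# Note scipy's randint(low, high) puts mass on low, ..., high - 1.
die = randint(1, 7)
p_discrete = sum(die.pmf(k) for k in [2, 3, 4])   # 3/6 = 0.5

# Continuous: P(2 <= Y <= 4) for a normal is an AREA under the density,
# obtained from the CDF (an integral); any single point has probability 0.
y = norm(loc=3, scale=1)
p_continuous = y.cdf(4) - y.cdf(2)

print(p_discrete, p_continuous)
print(y.pdf(3))  # a density value, NOT a probability; it must be integrated
```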
Visual representation reflects these distinctions. Discrete distributions are typically shown as bar charts or stick diagrams, with vertical lines or bars showing the probability mass concentrated at specific points. Continuous distributions appear as smooth curves representing the density function, where probability corresponds to area under the curve.
Finally, the nature of what the random variable represents differs conceptually. Discrete random variables describe counts, outcomes, and selections — the number of successes in trials, customer arrivals, or defective items. Continuous random variables describe measurements on a scale — height, weight, time, or temperature — quantities that can theoretically take any value within a range.
Types of Discrete Distributions
While all discrete distributions share the common feature of countable support, they model fundamentally different probabilistic mechanisms. This page examines six essential discrete distributions, each arising from a distinct experimental setup and serving specific analytical purposes.
The discrete uniform distribution models situations where all outcomes in a finite set are equally likely, such as rolling a fair die or randomly selecting from a fixed collection. The binomial distribution counts the number of successes in a fixed number of independent trials, each with the same probability of success — like counting heads in ten coin flips. The geometric distribution asks how many trials are needed until the first success occurs, assuming independent trials with constant success probability. The negative binomial distribution generalizes this by counting trials until a specified number of successes, not just the first. The hypergeometric distribution models sampling without replacement from a finite population containing two types of items, where each draw changes the probabilities for subsequent draws. Finally, the Poisson distribution counts events occurring randomly over time or space at a constant average rate, useful for modeling rare events like customer arrivals or equipment failures.
Each distribution has unique parameters, probability mass functions, and applications that make it the natural choice for particular types of problems.
Types of Discrete Distributions

| Distribution | Description |
|---|---|
| Discrete Uniform | Models situations where all outcomes in a finite set are equally likely, such as rolling a fair die or randomly selecting from a fixed collection. |
| Binomial | Counts the number of successes in a fixed number of independent trials, each with the same probability of success—like counting heads in ten coin flips. |
| Geometric | Measures how many trials are needed until the first success occurs, assuming independent trials with constant success probability. |
| Negative Binomial | Generalizes the geometric distribution by counting trials until a specified number of successes, not just the first. |
| Hypergeometric | Models sampling without replacement from a finite population containing two types of items, where each draw changes the probabilities for subsequent draws. |
| Poisson | Counts events occurring randomly over time or space at a constant average rate, useful for modeling rare events like customer arrivals or equipment failures. |
Identifying the Distribution Type
When we encounter a probability problem, we often need to identify which discrete distribution lies behind it by examining the problem's structure and mechanism. Each distribution corresponds to a specific data-generating process, and our task is to match the problem's structure to the correct model. This identification depends on several key characteristics of how outcomes are produced. The first question to ask is whether the number of possible outcomes is finite or infinite; this fundamental distinction splits the six distributions into two major branches.
If outcomes are finite, begin by checking whether all outcomes are equally likely. If every value in the set has the same probability — like rolling a fair die or randomly selecting from a deck — you have a discrete uniform distribution. This is the simplest case, characterized entirely by equal probability across all outcomes.
If probabilities are not equal within a finite set, ask whether you're counting successes in a fixed number of independent trials with constant probability p. If you perform exactly n trials and count how many successes occur, use the binomial distribution. This is the classic "perform n trials, count successes" scenario.
If neither uniform nor binomial fits, consider whether you're sampling without replacement from a finite population containing two types of items. If each draw changes the composition and thus affects probabilities for subsequent draws — like drawing cards without returning them — you have a hypergeometric distribution.
If outcomes are infinite, the next question is whether you're counting trials until the first success occurs. If you repeat independent trials with constant probability p until success happens, and you want to know how many trials that takes, use the geometric distribution.
If you're not stopping at the first success but instead counting trials until the r-th success (where r > 1), use the negative binomial distribution. This generalizes the geometric case by asking how many trials are needed to achieve multiple successes rather than just one.
Finally, if you're not counting discrete trials at all but rather counting rare events occurring at a constant rate λ over time or space — like phone calls arriving per hour, equipment failures per month, or typos per page — the Poisson distribution applies. Events happen independently at an average rate in a continuous interval.
The following outline summarizes the decision process described so far.

FINITE BRANCH:
• All outcomes equally likely? → YES = Discrete Uniform
• Fixed number n of independent trials, counting successes? → YES = Binomial
• Sampling without replacement from a two-type population? → YES = Hypergeometric

INFINITE BRANCH:
• Counting trials until the first success? → YES = Geometric
• Counting trials until the r-th success? → YES = Negative Binomial
• Counting events at a constant rate λ over time or space? → YES = Poisson
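As a rough sketch, the same decision process can be written as a small Python helper; the function and flag names below are hypothetical labels for the questions above, not a standard API.

```python
def identify_distribution(finite, equally_likely=False, fixed_n=False,
                          without_replacement=False, until_first_success=False,
                          until_rth_success=False, constant_rate=False):
    """Map yes/no answers from the decision outline to a model name."""
    if finite:
        if equally_likely:
            return "discrete uniform"
        if fixed_n and not without_replacement:
            return "binomial"
        if without_replacement:
            return "hypergeometric"
    else:
        if until_first_success:
            return "geometric"
        if until_rth_success:
            return "negative binomial"
        if constant_rate:
            return "Poisson"
    return "no standard match"

print(identify_distribution(finite=True, fixed_n=True))         # binomial
print(identify_distribution(finite=False, constant_rate=True))  # Poisson
```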
And the table below summarizes these distinguishing features across all six distributions.
Discrete Distributions Occurrence Matrix

| Distribution | Equal Probabilities | Fixed n, Independent Trials | Without Replacement | Infinite Trials | Until First Success | Until r-th Success | Constant Rate (λ) |
|---|---|---|---|---|---|---|---|
| Discrete Uniform | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Binomial | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Hypergeometric | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ |
| Geometric | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✗ |
| Negative Binomial | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✗ |
| Poisson | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ |

(Note: geometric and negative binomial trials are independent with constant p, but the number of trials is not fixed in advance.)
Bernoulli Trial
Understanding the Bernoulli Trial: Two Perspectives
There are two ways to view a Bernoulli trial:
1. As a single experiment
2. As a distribution
In this section, we will focus on the Bernoulli trial as a concept, not as a standalone probability distribution. We won’t be analyzing Bernoulli as a separate type of distribution, but rather clarifying how it fits into the broader picture.
Bernoulli Trial → Single Experiment
A Bernoulli trial is a single random experiment with exactly two possible outcomes:
* Success (1) with probability p
* Failure (0) with probability 1 − p
This setup makes it the most basic probabilistic experiment. A classic example is a single coin flip, where heads is defined as success. The outcome is binary, and the probabilities are fixed.
Bernoulli Trial as a Building Block for Discrete Distribution Models
What makes the Bernoulli trial so fundamental is that it forms the core mechanism behind many important discrete probability distributions. Once you understand the behavior of a single Bernoulli trial, you can extend it to more complex models by simply repeating the trial under certain rules.
Here’s how it builds into larger structures:
* Binomial distribution: repeats the Bernoulli trial n times independently and counts how many successes occur.
* Geometric distribution: repeats the trial until the first success.
* Negative binomial distribution: repeats until the r-th success.
* Even the hypergeometric and some Markov models borrow the concept of binary outcomes, though with modified assumptions (like dependence or sampling without replacement).
This modularity makes the Bernoulli trial a conceptual building block — much like a “unit of randomness” — that helps us understand how randomness scales when we repeat simple actions under defined conditions.
The power of the Bernoulli trial is not in its complexity — it is in its ability to scale up into powerful probabilistic models that describe everything from coin tosses to quality control in manufacturing.
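A minimal simulation sketch of this modularity, assuming a success probability p = 0.5 for illustration: the same bernoulli() unit is reused to generate one binomial draw and one geometric draw.

```python
import random

random.seed(1)
p = 0.5

def bernoulli():
    """One Bernoulli trial: 1 (success) with probability p, else 0 (failure)."""
    return 1 if random.random() < p else 0

# Binomial mechanism: fix n trials, count the successes.
binomial_draw = sum(bernoulli() for _ in range(10))

# Geometric mechanism: repeat until the first success, count the trials.
trials = 1
while bernoulli() == 0:
    trials += 1

print(binomial_draw, trials)
```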
Discrete Uniform Distribution
⋅ Finite set of n equally likely outcomes.
⋅ Each outcome has the same probability.
⋅ Random variable X takes values uniformly from the set of integers {a, a+1, …, b}.
⋅ All values between a and b are integers.
⋅ Support: {a, a+1, …, b} where b ≥ a.
⋅ Probability function: P(X = k) = 1/(b − a + 1).
⋅ Parameters: a, b ∈ ℤ, a ≤ b.
⋅ Notation: X ∼ Unif(a, b).
Checklist for Identifying a Discrete Uniform Distribution
✔ All values in the range are equally likely.
✔ The variable takes on a finite set of integer values.
✔ X is defined over a fixed range from a to b (inclusive).
✔ No value is favored over another.
Notations Used:
X∼Unif(a,b) or X∼DU(a,b) — distribution of the random variable.
DiscreteUniform(a,b) — used to denote the distribution itself (not the random variable).
U(a,b) — also used, though it can refer to either discrete or continuous; context is important.
P(X = k) = 1/(b − a + 1), for k = a, a+1, …, b — probability mass function
a: the smallest integer in the range
b: the largest integer in the range
The uniform discrete distribution assigns equal probability to each integer between a and b, inclusive. The values must be equally spaced and finite in number. The parameters define the range — once a and b are set, every integer in that closed interval has probability 1/(b − a + 1). This distribution is used when there's no reason to favor any outcome over another — every value is equally likely by design.
The probability mass function (PMF) of a discrete uniform distribution is given by:
P(X = x) = 1/(b − a + 1) = 1/n, x ∈ {x_1, x_2, …, x_n}
where:
a = lower bound (integer)
b = upper bound (integer)
n = b − a + 1 is the total number of possible values
Intuition Behind the Formula
Uniformity: The term "uniform" implies that each outcome is equally likely. That is, no single value of the random variable is preferred over another. This is the key feature of a uniform distribution.
Support (Range of the Random Variable):
* The random variable X can take on n = b − a + 1 distinct values: x_1, x_2, …, x_n.
* These values could be consecutive integers (like 1, 2, 3, …, n) or any set of n distinct values.
* The range or support is thus a finite, countable set.
Logic Behind the Formula: The total probability must sum to 1:
∑_{i=1}^{n} P(X = x_i) = 1
Since all probabilities are equal:
n · (1/n) = (b − a + 1) · 1/(b − a + 1) = 1
This makes the individual probability of each outcome 1/n = 1/(b − a + 1).
Practical Example
Suppose you roll a fair six-sided die. The possible outcomes are {1,2,3,4,5,6}, and the probability of each face is:
P(X = x) = 1/6 = 1/(6 − 1 + 1), x = 1, 2, 3, 4, 5, 6
Each face has an equal chance of appearing.
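The die example can be checked numerically; the sketch below compares the formula 1/(b − a + 1) with scipy.stats.randint (note that randint(low, high) covers low through high − 1, so the upper argument is b + 1).

```python
from scipy.stats import randint

a, b = 1, 6
n = b - a + 1

# PMF from the formula: P(X = k) = 1/(b - a + 1) for every k in {a, ..., b}
manual = {k: 1 / n for k in range(a, b + 1)}

# scipy's discrete uniform needs the exclusive upper bound b + 1.
die = randint(a, b + 1)

for k in range(a, b + 1):
    assert abs(manual[k] - die.pmf(k)) < 1e-12

print(manual[3], die.cdf(3))  # 1/6, and CDF (3 - 1 + 1)/6 = 0.5
```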
Uniform Discrete Distribution

| Property | Uniform Discrete Distribution |
|---|---|
| Description | Models experiments where each outcome is equally likely (e.g., rolling a fair die, random selection from a finite set) |
| Support (Domain) | X ∈ {a, a+1, a+2, ..., b} |
| Finite or Infinite? | Finite |
| Bounds/Range | [a, b] where a, b are integers and a ≤ b |
| Parameters | a (minimum value), b (maximum value) |
| Number of trials known/fixed beforehand? | Not applicable (single selection) |
| Selection Property/Mechanism | All selections are equal — no outcome has special meaning |
| PMF (Probability Mass Function) | P(X = k) = 1/(b − a + 1) for k ∈ {a, a+1, ..., b} |
| CDF (Cumulative Distribution Function) | P(X ≤ k) = (k − a + 1)/(b − a + 1) for k ∈ {a, a+1, ..., b} |
Binomial Distribution

⋅ Fixed number of Bernoulli trials: n.
⋅ Each trial is independent.
⋅ Each trial has two outcomes: success or failure.
⋅ Success probability is constant: p.
⋅ Failure probability: q = 1 − p.
⋅ Random variable X counts successes.
⋅ Distribution is over: {0, 1, …, n}.
⋅ Probability function: P(X = k) = C(n, k) p^k q^(n−k).
⋅ Parameters: n ∈ ℕ, 0 < p < 1.
⋅ Notation: X ∼ Bin(n, p).
Checklist for Identifying a Binomial Distribution
✔ Repeating the same Bernoulli trial independently (each trial does not depend on the others).
✔ The trial is repeated exactly n times.
✔ X is defined as the number of successes out of the total trials.
Notations Used:
X∼Bin(n,p) or X∼B(n,p) — distribution of the random variable.
Binomial(n,p) — used to denote the distribution itself (not the random variable).
B(n,p) — occasionally used in theoretical or formal contexts (less common).
P(X = k) = C(n, k) p^k (1 − p)^(n−k) — probability mass function
This distribution models the number of successes when repeating the same binary experiment n times under identical conditions. The two parameters fully describe the setup: n gives the structure — how many attempts, and p defines the behavior of each — what chance success has. It’s useful to compare with the negative binomial, where instead of fixing how many trials you run, you fix how many successes you want and ask: how many trials will it take? Both deal with repeated binary outcomes, but what’s held constant — trials vs. successes — flips.
The probability mass function (PMF) of a binomial distribution is given by:
P(X = k) = C(n, k) p^k (1 − p)^(n−k), k = 0, 1, 2, …, n

where C(n, k) = n! / (k! (n − k)!) is the binomial coefficient.
Intuition Behind the Formula
* Fixed Number of Trials: The binomial distribution models the number of successes in n independent trials, where each trial has only two possible outcomes: success or failure.
* Parameters:
  * n: The number of independent trials
  * p: The probability of success on each trial
  * 1 − p: The probability of failure on each trial (often denoted as q)
* Support (Range of the Random Variable):
  * The random variable X can take on values from 0 to n (inclusive).
  * These represent the possible number of successes: 0, 1, 2, …, n.
  * The support is thus a finite set of n + 1 non-negative integers.
* Logic Behind the Formula:
  * C(n, k): The number of ways to choose k successes from n trials
  * p^k: The probability of getting exactly k successes
  * (1 − p)^(n−k): The probability of getting exactly n − k failures
  * The total probability sums to 1: ∑_{k=0}^{n} C(n, k) p^k (1 − p)^(n−k) = 1
  * This follows from the binomial theorem: (p + (1 − p))^n = 1^n = 1
Practical Example
Suppose you flip a fair coin n=5 times, where the probability of heads (success) is p=0.5. The probability of getting exactly k=3 heads is:
P(X = 3) = C(5, 3) (0.5)^3 (0.5)^(5−3) = 10 · 0.125 · 0.25 = 0.3125
This means there's a 31.25% chance of getting exactly 3 heads in 5 coin flips.
The possible outcomes range from k=0 (no heads) to k=5 (all heads), with probabilities determined by the formula above.
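A quick numerical check of this example, computing the PMF directly with math.comb and comparing against scipy.stats.binom.

```python
from math import comb
from scipy.stats import binom

n, p, k = 5, 0.5, 3

# Directly from the PMF: C(n, k) p^k (1 - p)^(n - k)
manual = comb(n, k) * p**k * (1 - p)**(n - k)

print(manual)              # 0.3125
print(binom.pmf(k, n, p))  # same value from scipy
```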
Binomial Distribution

| Property | Binomial Distribution |
|---|---|
| Description | Models the number of successes in a fixed number of independent trials, each with the same probability of success (e.g., number of heads in 10 coin flips) |
| Support (Domain) | X ∈ {0, 1, 2, ..., n} |
| Finite or Infinite? | Finite |
| Bounds/Range | [0, n] where n is a positive integer |
| Parameters | n (number of trials), p (probability of success on each trial), where 0 ≤ p ≤ 1 |
| Number of trials known/fixed beforehand? | Yes, n is fixed before the experiment |
| Selection Property/Mechanism | Fixed number of independent trials; counting total number of successes; each trial has binary outcome (success/failure) |
| PMF (Probability Mass Function) | P(X = k) = C(n, k) × p^k × (1−p)^(n−k) for k ∈ {0, 1, ..., n} |
Geometric Distribution

⋅ Sequence of independent Bernoulli trials.
⋅ Each trial has two outcomes: success or failure.
⋅ Success probability is constant: p.
⋅ Failure probability: q = 1 − p.
⋅ Random variable X counts the number of trials until the first success.
Checklist for Identifying a Geometric Distribution
✔ Repeating Bernoulli trials independently with constant probability.
✔ No limit on the number of trials — keep repeating until success.
✔ X is defined as the total number of trials up to and including the first success.
Notations Used:
X∼Geom(p) or X∼Geometric(p) — distribution of the random variable.
Geom(p) — used to denote the distribution itself (not the random variable).
G(p) — less common shorthand in some texts or software contexts.
P(X = k) = (1 − p)^(k−1) p, for k = 1, 2, 3, … — probability mass function
p: probability of success on a single trial, with 0<p≤1
The geometric distribution models the number of trials needed to get the first success in a sequence of independent Bernoulli trials. There's only one parameter — p, the chance of success each time — which completely determines the shape of the distribution. The outcomes are positive integers: 1,2,3,… where each value represents the trial number on which success first occurs.
The probability mass function (PMF) of a geometric distribution is given by:
P(X = k) = (1 − p)^(k−1) p, k = 1, 2, 3, …
Intuition Behind the Formula
* First Success: The geometric distribution models the number of trials needed to get the first success in a sequence of independent Bernoulli trials.
* Parameters:
  * p: The probability of success on each trial
  * 1 − p: The probability of failure on each trial (often denoted as q)
* Support (Range of the Random Variable):
  * The random variable X can take on values 1, 2, 3, … (all positive integers).
  * X = k means the first success occurs on the k-th trial.
  * The support is thus a countably infinite set.
* Logic Behind the Formula:
  * (1 − p)^(k−1): The probability of getting k − 1 failures before the first success
  * p: The probability of success on the k-th trial
  * The total probability sums to 1: ∑_{k=1}^{∞} (1 − p)^(k−1) p = p · 1/(1 − (1 − p)) = p · (1/p) = 1
  * This uses the geometric series formula: ∑_{k=0}^{∞} r^k = 1/(1 − r) for |r| < 1
Practical Example
Suppose you're rolling a fair six-sided die until you get a 6. The probability of rolling a 6 is p = 1/6. The probability that you need exactly k = 4 rolls to get your first 6 is:

P(X = 4) = (5/6)^(4−1) · (1/6) = (5/6)^3 · (1/6) ≈ 0.096
This means there's about a 9.6% chance that you'll need exactly 4 rolls to get your first 6.
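The same calculation in Python, from the formula and from scipy.stats.geom (which, like this page, counts trials with support 1, 2, …).

```python
from scipy.stats import geom

p, k = 1/6, 4

# PMF: (1 - p)^(k - 1) * p  -- probability the first 6 appears on roll k
manual = (1 - p) ** (k - 1) * p

print(manual)          # ~0.0965
print(geom.pmf(k, p))  # same value from scipy
```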
Geometric Distribution

| Property | Geometric Distribution |
|---|---|
| Description | Models the number of trials until the first success in a sequence of independent trials, each with the same probability of success (e.g., number of coin flips until first heads) |
| Support (Domain) | X ∈ {1, 2, 3, ...} |
| Finite or Infinite? | Infinite |
| Bounds/Range | [1, ∞) |
| Parameters | p (probability of success on each trial), where 0 < p ≤ 1 |
| Number of trials known/fixed beforehand? | No, trials continue until the first success occurs |
| Selection Property/Mechanism | Variable number of independent trials; counting trials until first success; each trial has binary outcome (success/failure); memoryless property |
| PMF (Probability Mass Function) | P(X = k) = (1−p)^(k−1) × p for k ∈ {1, 2, 3, ...} |
Negative Binomial Distribution

⋅ Sequence of independent Bernoulli trials.
⋅ Each trial has two outcomes: success or failure.
⋅ Success probability is constant: p.
⋅ Failure probability: q = 1 − p.
⋅ Random variable X counts the number of trials needed to get r successes.
⋅ Trials are independent and identically distributed.
⋅ Support: {r, r+1, r+2, …}.
⋅ Probability function: P(X = k) = C(k−1, r−1) p^r q^(k−r).
⋅ Parameters: r ∈ ℕ, 0 < p < 1.
⋅ Notation: X ∼ NegBin(r, p).
Checklist for Identifying a Negative Binomial Distribution
✔ Repeating the same Bernoulli trial independently.
✔ Success probability remains constant across trials.
✔ X is defined as the number of trials until the r-th success (inclusive).
Notations Used:
X∼NegBin(r,p) or X∼NB(r,p) — distribution of the random variable.
NegativeBinomial(r,p) — used to denote the distribution itself (not the random variable).
NB(r,p) — common shorthand, especially in statistical software.
P(X = k) = C(k−1, r−1) p^r (1 − p)^(k−r), for k = r, r+1, r+2, … — probability mass function (trials until r-th success)
r: number of successes to achieve (a positive integer)
p: probability of success in each trial, with 0<p≤1
This distribution models the number of trials needed to observe r successes, assuming each trial is independent and has the same probability p of success. The outcomes are integers r, r+1, r+2, …, since at least r trials are needed. r controls the target (how many successes), and p controls the chance of achieving each one — together, they define how spread out or concentrated the distribution is.
The probability mass function (PMF) of a negative binomial distribution is given by:
P(X = k) = C(k−1, r−1) p^r (1 − p)^(k−r), k = r, r+1, r+2, …

where C(k−1, r−1) = (k−1)! / ((r−1)! (k−r)!) is the binomial coefficient.
Intuition Behind the Formula
* Fixed Number of Successes: The negative binomial distribution models the number of trials needed to achieve exactly r successes in a sequence of independent Bernoulli trials.
* Parameters:
  * r: The number of successes we want to achieve (a positive integer)
  * p: The probability of success on each trial
  * 1 − p: The probability of failure on each trial (often denoted as q)
* Support (Range of the Random Variable):
  * The random variable X can take on values r, r+1, r+2, … (integers starting from r).
  * X = k means the r-th success occurs on the k-th trial.
  * The support is thus a countably infinite set.
* Logic Behind the Formula:
  * C(k−1, r−1): The number of ways to arrange r − 1 successes in the first k − 1 trials (the k-th trial must be the r-th success)
  * p^r: The probability of getting exactly r successes
  * (1 − p)^(k−r): The probability of getting exactly k − r failures
  * The total probability sums to 1: ∑_{k=r}^{∞} C(k−1, r−1) p^r (1 − p)^(k−r) = 1
  * This follows from the negative binomial series expansion.
Practical Example
Suppose you're flipping a coin until you get r=3 heads, where the probability of heads is p=0.5. The probability that you need exactly k=6 flips to get your third head is:
P(X = 6) = C(6−1, 3−1) (0.5)^3 (0.5)^(6−3) = C(5, 2) (0.5)^6 = 10 · 0.015625 = 0.15625

This means there's a 15.625% chance that you'll need exactly 6 flips to get your third head.
Note: The geometric distribution is a special case of the negative binomial distribution where r=1.
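A numerical check of the example above. One caveat worth flagging: scipy.stats.nbinom counts failures before the r-th success rather than total trials, so the trial count k maps to k − r failures.

```python
from math import comb
from scipy.stats import nbinom

r, p, k = 3, 0.5, 6

# PMF for trials until the r-th success: C(k-1, r-1) p^r (1-p)^(k-r)
manual = comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)

print(manual)                   # 0.15625
print(nbinom.pmf(k - r, r, p))  # same value under scipy's failure-count convention
```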
Negative Binomial Distribution

| Property | Negative Binomial Distribution |
|---|---|
| Description | Models the number of trials until r successes occur in a sequence of independent trials, each with the same probability of success (e.g., number of coin flips until 5th heads) |
| Support (Domain) | X ∈ {r, r+1, r+2, ...} |
| Finite or Infinite? | Infinite |
| Bounds/Range | [r, ∞) where r is a positive integer |
| Parameters | r (number of successes desired), p (probability of success on each trial), where 0 < p ≤ 1 and r is a positive integer |
| Number of trials known/fixed beforehand? | No, trials continue until r successes occur |
| Selection Property/Mechanism | Variable number of independent trials; counting trials until r-th success; each trial has binary outcome (success/failure); generalization of geometric distribution |
| PMF (Probability Mass Function) | P(X = k) = C(k−1, r−1) × p^r × (1−p)^(k−r) for k ∈ {r, r+1, r+2, ...} |
Hypergeometric Distribution

⋅ Population of size N contains two types: successes and failures.
⋅ Number of successes in the population: K.
⋅ Number of draws (without replacement): n.
⋅ Random variable X counts the number of successes in the sample.
⋅ Trials are not independent (sampling without replacement).
⋅ Support: {max(0, n − (N − K)), …, min(n, K)}.
⋅ Probability function: P(X = k) = C(K, k) C(N−K, n−k) / C(N, n).
⋅ Parameters: N, K, n ∈ ℕ with 0 ≤ K ≤ N, 0 ≤ n ≤ N.
⋅ Notation: X ∼ Hypergeometric(N, K, n).
Checklist for Identifying a Hypergeometric Distribution
✔ Sampling is done without replacement from a finite population.
✔ The population has a fixed number of successes and failures.
✔ The number of draws is fixed in advance.
✔ X is defined as the number of successes in the sample.
Notations Used:
X∼Hypergeometric(N,K,n) or X∼Hyp(N,K,n) — distribution of the random variable.
Hypergeometric(N,K,n) — used to denote the distribution itself (not the random variable).
H(N,K,n) — occasionally used in compact form, especially in software or formulas.
P(X = k) = C(K, k) C(N−K, n−k) / C(N, n), for valid k — probability mass function
N: total population size
K: number of success items in the population
n: number of draws (without replacement), where n ≤ N
The hypergeometric distribution models the number of successes in n draws from a finite population of size N that contains exactly K successes, without replacement. Unlike the binomial, where each trial is independent, here each draw changes the probabilities — once an item is drawn, it doesn't go back. This dependency is what defines the distribution’s behavior.
The probability mass function (PMF) of a hypergeometric distribution is given by:

P(X = k) = C(K, k) C(N−K, n−k) / C(N, n), for max(0, n − N + K) ≤ k ≤ min(n, K)

where C(a, b) = a! / (b! (a − b)!) is the binomial coefficient.
Intuition Behind the Formula
* Sampling Without Replacement: The hypergeometric distribution models the number of successes when drawing n items without replacement from a finite population of size N containing exactly K success items.
* Parameters:
  * N: Total population size
  * K: Number of success items in the population
  * n: Number of draws (sample size)
  * N − K: Number of failure items in the population
* Support (Range of the Random Variable):
  * The random variable X can take on values from max(0, n − N + K) to min(n, K).
  * X = k means exactly k successes are drawn in the sample of size n.
  * The lower bound ensures we don't draw more failures than available: n − k ≤ N − K.
  * The upper bound ensures we don't draw more successes than available: k ≤ K and k ≤ n.
  * The support is thus a finite set of non-negative integers.
* Logic Behind the Formula:
  * C(K, k): The number of ways to choose k successes from K available successes
  * C(N−K, n−k): The number of ways to choose n − k failures from N − K available failures
  * C(N, n): The total number of ways to choose n items from N items
  * The total probability sums to 1: ∑_{k=max(0, n−N+K)}^{min(n, K)} C(K, k) C(N−K, n−k) / C(N, n) = 1
  * This follows from Vandermonde's identity.
Practical Example
Suppose you have a deck of N=52 cards containing K=13 hearts. You draw n=5 cards without replacement. The probability of getting exactly k=2 hearts is:
P(X = 2) = C(13, 2) C(39, 3) / C(52, 5) = 78 · 9139 / 2,598,960 ≈ 0.274

This means there's about a 27.4% chance of getting exactly 2 hearts when drawing 5 cards from a standard deck.
Note: When N is very large relative to n, the hypergeometric distribution approximates the binomial distribution with p = K/N.
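A numerical check of the card example, from the formula via math.comb and from scipy.stats.hypergeom (whose argument order is population size, successes in the population, then sample size).

```python
from math import comb
from scipy.stats import hypergeom

N, K, n, k = 52, 13, 5, 2   # deck size, hearts, cards drawn, hearts wanted

# PMF: C(K, k) * C(N - K, n - k) / C(N, n)
manual = comb(K, k) * comb(N - K, n - k) / comb(N, n)

print(manual)                     # ~0.274
print(hypergeom.pmf(k, N, K, n))  # same value from scipy
```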
Hypergeometric Distribution

| Property | Hypergeometric Distribution |
|---|---|
| Description | Models the number of successes in a sample drawn without replacement from a finite population containing both successes and failures (e.g., drawing red balls from an urn without replacing them) |
| Support (Domain) | X ∈ {max(0, n−N+K), ..., min(n, K)} |
| Finite or Infinite? | Finite |
| Bounds/Range | [max(0, n−N+K), min(n, K)] |
| Parameters | N (population size), K (number of success states in population), n (number of draws), where N, K, n are positive integers with K ≤ N and n ≤ N |
| Number of trials known/fixed beforehand? | Yes, n is fixed before the experiment |
| Selection Property/Mechanism | Sampling without replacement from finite population; fixed number of draws; counting successes in sample; each item can only be selected once |
| PMF (Probability Mass Function) | P(X = k) = [C(K, k) × C(N−K, n−k)] / C(N, n) for k ∈ {max(0, n−N+K), ..., min(n, K)} |
Poisson Distribution

⋅ Models the number of events occurring in a fixed interval of time or space.
⋅ Events occur independently.
⋅ Events occur at a constant average rate λ.
⋅ Random variable X counts the number of events in the interval.
⋅ Events cannot occur simultaneously (no clustering).
⋅ Support: {0, 1, 2, …}.
⋅ Probability function: P(X = k) = λ^k e^(−λ) / k!.
⋅ Parameter: λ > 0.
⋅ Notation: X ∼ Poisson(λ).
Checklist for Identifying a Poisson Distribution
✔ Events occur independently over time or space.
✔ Events happen at a constant average rate (λ).
✔ The probability of more than one event in an infinitesimal interval is negligible.
✔ X is defined as the number of events in a fixed interval.
Notations Used:
X∼Poisson(λ) or X∼P(λ) — distribution of the random variable.
Poisson(λ) — used to denote the distribution itself (not the random variable).
P(λ) — sometimes used informally, especially in compact notation.
P(X = k) = λ^k e^(−λ) / k!, for k = 0, 1, 2, … — probability mass function
λ: the average rate (mean number of events), with λ > 0

The Poisson distribution models the number of events occurring in a fixed interval of time or space, assuming events happen independently and at a constant average rate λ. It describes counts: 0, 1, 2, …, with probabilities determined by how large or small λ is. The single parameter λ controls both the mean and the variance of the distribution.
The probability mass function (PMF) of a Poisson distribution is given by:
P(X = k) = λ^k e^(−λ) / k!, k = 0, 1, 2, …
Intuition Behind the Formula
* Counting Rare Events: The Poisson distribution models the number of events occurring in a fixed interval of time or space when events occur independently at a constant average rate.
* Parameters:
  * λ: The average rate (mean number of events) in the given interval, with λ > 0
* Support (Range of the Random Variable):
  * The random variable X can take on values 0, 1, 2, 3, … (all non-negative integers).
  * X = k means exactly k events occur in the interval.
  * The support is thus a countably infinite set.
* Logic Behind the Formula:
  * λ^k: Represents the rate parameter raised to the power of the number of events
  * e^(−λ): The exponential decay factor ensuring probabilities sum to 1
  * k!: Accounts for the number of ways k events can be ordered
  * The total probability sums to 1: ∑_{k=0}^{∞} λ^k e^(−λ) / k! = e^(−λ) · e^λ = 1
  * This uses the Taylor series expansion: e^λ = ∑_{k=0}^{∞} λ^k / k!
Practical Example
Suppose a call center receives an average of λ=4 calls per hour. The probability of receiving exactly k=6 calls in a given hour is:
P(X = 6) = 4^6 e^(−4) / 6! = (4096/720) · 0.0183 ≈ 0.104
This means there's about a 10.4% chance of receiving exactly 6 calls in an hour.
Note: The Poisson distribution is often used as an approximation to the binomial distribution when n is large and p is small, with λ=np.
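A numerical check of the call-center example, plus the binomial approximation mentioned in the note (n = 10,000 here is an arbitrary choice to make p small).

```python
from math import exp, factorial
from scipy.stats import binom, poisson

lam, k = 4, 6

# PMF: lambda^k * e^(-lambda) / k!
manual = lam**k * exp(-lam) / factorial(k)

print(manual)               # ~0.104
print(poisson.pmf(k, lam))  # same value from scipy

# Binomial approximation: large n, small p, with lambda = n * p
n = 10_000
p = lam / n
print(binom.pmf(k, n, p))   # ~0.104 as well
```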
Poisson Distribution

| Property | Poisson Distribution |
|---|---|
| Description | Models the number of events occurring in a fixed interval of time or space when events occur independently at a constant average rate (e.g., number of phone calls received per hour) |
| Support (Domain) | X ∈ {0, 1, 2, 3, ...} |
| Finite or Infinite? | Infinite |
| Bounds/Range | [0, ∞) |
| Parameters | λ (lambda, the average rate of events), where λ > 0 |
| Number of trials known/fixed beforehand? | No fixed number of trials; counts events in a fixed interval |
| Selection Property/Mechanism | Events occur independently; constant average rate; events in non-overlapping intervals are independent; useful for rare events |
| PMF (Probability Mass Function) | P(X = k) = λ^k × e^(−λ) / k! for k ∈ {0, 1, 2, ...} |
Although the six main discrete distributions model different scenarios, they share an underlying mathematical framework that makes them recognizable as members of the same family.
Every discrete distribution has a cumulative distribution function, F(x) = P(X ≤ x) = ∑_{k ≤ x} P(X = k). This gives the probability of observing a value up to and including some threshold x. Unlike continuous distributions where the CDF forms a smooth curve, discrete CDFs climb in distinct jumps — flat between values, then leaping upward by P(X = k) at each point k in the support.
Each distribution also has a mean and a variance, defined by the same sums everywhere: E[X] = ∑_k k · P(X = k) and Var(X) = ∑_k (k − E[X])² · P(X = k), quantifying how tightly or loosely probability concentrates around the mean. Different distributions yield different formulas when you compute these sums, but the definitions apply universally.
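As a small illustration of these universal definitions, the sketch below computes the mean and variance of a fair die by direct summation over its support.

```python
# Mean and variance of a fair die by direct summation over the support;
# the same definitions apply to every discrete distribution.
pmf = {k: 1/6 for k in range(1, 7)}

mean = sum(k * p for k, p in pmf.items())               # E[X] = 3.5
var = sum((k - mean) ** 2 * p for k, p in pmf.items())  # Var(X) ~= 2.9167

print(mean, var)
```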
The support — the set of values where outcomes can occur — is always countable. You might have finitely many options or infinitely many that you could list in sequence, but never an uncountable continuum. Finally, each distribution is fully specified by a small set of parameters: the uniform needs its endpoints a and b, the binomial needs n and p, the Poisson needs λ.