
Probability

Introduction to Probability

Probability is a field of mathematics that deals with uncertainty and provides tools to measure and analyze how likely events are to occur. It begins with basic concepts such as outcomes, events, and sample spaces, which form the foundation for calculating likelihoods.
Central to probability is the concept of a probability measure, which assigns a value between 0 and 1 to each event, indicating its likelihood. A value of 0 means an event is impossible, while 1 signifies certainty. Key principles include independence (events that do not influence each other) and conditional probability, which considers the likelihood of an event given that another has occurred.
Probability also introduces random variables, which assign numerical values to outcomes. These variables are categorized as either discrete (taking specific values, like the result of rolling a die) or continuous (taking any value within a range, like a measured temperature). Summary measures such as the expected value (average) and variance (spread, or variability) describe the behavior of random variables.
Advanced topics include distributions, such as the binomial, normal, and Poisson distributions, which model specific types of random phenomena. These tools are essential for understanding patterns in random processes and making informed predictions.
Probability is widely applied in science, engineering, finance, and everyday decision-making. It forms the basis for statistics, enabling data-driven insights and predictions, and supports fields like machine learning, risk analysis, and quantum mechanics. By studying probability, students develop skills to reason about uncertainty and draw conclusions from incomplete information.

Probability Formulas

This page presents essential probability formulas organized by categories, ranging from basic principles to advanced distributions. Each formula includes detailed explanations, example calculations, and practical use cases, making it a helpful resource for students and practitioners working with probability theory and statistical analysis.

Simple Probability

$P(A) = \frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}$

Probability Range of an Event

$0 \leq P(A) \leq 1$

Complement Rule

$P(A') + P(A) = 1$

Conditional Probability Basic Formula

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, defined when $P(B) > 0$

Bayes' Theorem

$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)}$
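As a quick sanity check, Bayes' theorem can be evaluated numerically. The scenario and its numbers below (a diagnostic test with assumed sensitivity, specificity, and prevalence) are purely illustrative, not from this page:

```python
# Hypothetical illustration: a test with 99% sensitivity, 95% specificity,
# applied to a population with 1% disease prevalence.
p_disease = 0.01             # P(A): prior probability of disease
p_pos_given_disease = 0.99   # P(B | A): sensitivity
p_pos_given_healthy = 0.05   # P(B | not A): 1 - specificity

# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 4))
```

Despite the accurate test, the posterior probability stays well below 50% because the disease is rare, a classic base-rate effect.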

Probability of Both Events Occurring (Multiplication Rule, Independent Events)

$P(A \cap B) = P(A) \times P(B)$

Probability of Either Event Occurring (Addition Rule)

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Probability of Neither Event Occurring (Independent Events)

$P(\neg A \cap \neg B) = P(\neg A) \times P(\neg B)$

Probability of Exactly One Event Occurring

$P(\text{exactly one of } A \text{ or } B) = P(A \cap \neg B) + P(\neg A \cap B)$

General Formula for Multiple Independent Events

$P(A \cap B \cap C) = P(A) \times P(B) \times P(C)$
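The addition and multiplication rules can be verified by exhaustive enumeration on a small finite sample space. This sketch (two fair dice, event names of my choosing) uses only the Python standard library:

```python
from itertools import product

# Sample space of two fair dice; each of the 36 outcomes is equally likely.
omega = list(product(range(1, 7), repeat=2))

A = {w for w in omega if w[0] % 2 == 0}   # first die shows an even number
B = {w for w in omega if w[1] == 6}       # second die shows a 6

def P(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return len(event) / len(omega)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
assert abs(P(A | B) - (P(A) + P(B) - P(A & B))) < 1e-12

# A and B concern different dice, so they are independent:
# P(A and B) = P(A) * P(B)
assert abs(P(A & B) - P(A) * P(B)) < 1e-12
print(P(A | B))  # 7/12
```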

Probability of Both Disjoint Events Occurring

$P(A \cap B) = 0$

Probability of Either Disjoint Event Occurring (Addition Rule)

$P(A \cup B) = P(A) + P(B)$

Probability of Neither Disjoint Event Occurring

$P(\neg A \cap \neg B) = 1 - P(A) - P(B)$

Conditional Probability for Disjoint Events

$P(A \mid B) = 0 \quad \text{and} \quad P(B \mid A) = 0$

Generalization to Multiple Disjoint Events

$P(A \cup B \cup C \cup \ldots) = P(A) + P(B) + P(C) + \ldots$

Binomial Distribution

Probability Mass Function (PMF)

$P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$

Cumulative Distribution Function (CDF)

$P(X \leq k) = \sum_{i=0}^{k} \binom{n}{i} p^{i} (1 - p)^{n - i}$

Mean (Expected Value)

$\mu = E[X] = np$

Variance

$\sigma^2 = \operatorname{Var}(X) = np(1 - p)$

Standard Deviation

$\sigma = \sqrt{np(1 - p)}$
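The binomial formulas above translate directly into code. A minimal sketch using only the standard library (`math.comb`), with the parameter values chosen arbitrarily for illustration:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) as a partial sum of the PMF."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 10, 0.3
mean = n * p              # mu = np
var = n * p * (1 - p)     # sigma^2 = np(1-p)

# Sanity checks: the PMF sums to 1 over its support,
# and the formula mean matches the expectation sum.
assert abs(sum(binom_pmf(k, n, p) for k in range(n + 1)) - 1) < 1e-12
assert abs(sum(k * binom_pmf(k, n, p) for k in range(n + 1)) - mean) < 1e-12
print(binom_pmf(3, n, p))
```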

Poisson Distribution

Probability Mass Function (PMF)

$P(X = k) = \frac{e^{-\lambda} \lambda^{k}}{k!}$

Cumulative Distribution Function (CDF)

$P(X \leq k) = e^{-\lambda} \sum_{i=0}^{k} \frac{\lambda^{i}}{i!}$

Mean (Expected Value)

$\mu = E[X] = \lambda$

Variance

$\sigma^2 = \operatorname{Var}(X) = \lambda$

Standard Deviation

$\sigma = \sqrt{\lambda}$

Geometric Distribution

Probability Mass Function (PMF)

$P(X = k) = (1 - p)^{k - 1} p$

Cumulative Distribution Function (CDF)

$P(X \leq k) = 1 - (1 - p)^{k}$

Mean (Expected Value)

$\mu = E[X] = \frac{1}{p}$

Variance

$\sigma^2 = \operatorname{Var}(X) = \frac{1 - p}{p^{2}}$

Standard Deviation

$\sigma = \sqrt{\frac{1 - p}{p^{2}}}$
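The geometric closed-form CDF can be confirmed against a direct sum of the PMF. A minimal sketch, with p = 0.25 chosen arbitrarily:

```python
def geom_pmf(k, p):
    """P(X = k): first success occurs on trial k (k = 1, 2, ...)."""
    return (1 - p)**(k - 1) * p

def geom_cdf(k, p):
    """P(X <= k), closed form."""
    return 1 - (1 - p)**k

p = 0.25
# The closed-form CDF agrees with summing the PMF term by term.
assert abs(geom_cdf(10, p) - sum(geom_pmf(k, p) for k in range(1, 11))) < 1e-12
# The mean 1/p matches a (truncated) expectation sum.
assert abs(sum(k * geom_pmf(k, p) for k in range(1, 2000)) - 1 / p) < 1e-6
print(geom_cdf(10, p))
```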

Negative Binomial Distribution

Probability Mass Function (PMF)

$P(X = k) = \binom{k - 1}{r - 1} p^{r} (1 - p)^{k - r}$

Cumulative Distribution Function (CDF)

$P(X \leq k) = I_{p}(r, k - r + 1)$, where $I_p$ denotes the regularized incomplete beta function

Mean (Expected Value)

$\mu = E[X] = \frac{r}{p}$

Variance

$\sigma^2 = \operatorname{Var}(X) = \frac{r (1 - p)}{p^{2}}$

Standard Deviation

$\sigma = \sqrt{\frac{r (1 - p)}{p^{2}}}$

Hypergeometric Distribution

Probability Mass Function (PMF)

$P(X = k) = \frac{\binom{K}{k} \binom{N - K}{n - k}}{\binom{N}{n}}$

Mean (Expected Value)

$\mu = E[X] = n \left( \frac{K}{N} \right)$

Variance

$\sigma^2 = \operatorname{Var}(X) = n \left( \frac{K}{N} \right) \left( \frac{N - K}{N} \right) \left( \frac{N - n}{N - 1} \right)$

Standard Deviation

$\sigma = \sqrt{n \left( \frac{K}{N} \right) \left( \frac{N - K}{N} \right) \left( \frac{N - n}{N - 1} \right)}$

Probability of At Least $k$ Successes

$P(X \geq k) = 1 - \sum_{i=0}^{k - 1} P(X = i)$
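The hypergeometric formulas, including the "at least $k$ successes" complement, can be checked with a concrete example. The card-drawing setup (hearts in a 5-card hand) is my own illustration, not from this page:

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement
    from a population of N items containing K successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

N, K, n = 52, 13, 5   # hearts drawn in a 5-card hand from a standard deck
support = range(max(0, n - (N - K)), min(n, K) + 1)

# PMF sums to 1 and the mean matches n*K/N.
assert abs(sum(hypergeom_pmf(k, N, K, n) for k in support) - 1) < 1e-12
mean = sum(k * hypergeom_pmf(k, N, K, n) for k in support)
assert abs(mean - n * K / N) < 1e-12

# P(X >= 2) via the complement of the partial sum.
p_at_least_2 = 1 - sum(hypergeom_pmf(k, N, K, n) for k in range(2))
print(p_at_least_2)
```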

Multinomial Distribution

Probability Mass Function (PMF)

$P(X_1 = x_1, \dots, X_k = x_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$

Mean (Expected Value) of Each Outcome

$E[X_i] = n p_i$

Variance of Each Outcome

$\operatorname{Var}(X_i) = n p_i (1 - p_i)$

Covariance Between Outcomes

$\operatorname{Cov}(X_i, X_j) = -n p_i p_j$

Correlation Coefficient Between Outcomes

$\rho_{ij} = \frac{\operatorname{Cov}(X_i, X_j)}{\sqrt{\operatorname{Var}(X_i) \operatorname{Var}(X_j)}} = \frac{-p_i p_j}{\sqrt{p_i (1 - p_i) \, p_j (1 - p_j)}}$
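The multinomial PMF is a direct product of a multinomial coefficient and the outcome probabilities. A minimal sketch; the fair-die example is my own illustration:

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """P(X1 = x1, ..., Xk = xk) for n = sum(counts) independent trials
    with category probabilities probs."""
    n = sum(counts)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)   # multinomial coefficient n! / (x1!...xk!)
    prob = 1.0
    for x, p in zip(counts, probs):
        prob *= p**x
    return coef * prob

# A fair die rolled 6 times: probability that every face appears exactly once.
counts = [1] * 6
probs = [1 / 6] * 6
print(multinomial_pmf(counts, probs))  # equals 6!/6^6
```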

Discrete Uniform Distribution

Probability Mass Function (PMF)

$P(X = k) = \frac{1}{b - a + 1}$

Mean (Expected Value)

$\mu = E[X] = \frac{a + b}{2}$

Variance

$\sigma^2 = \operatorname{Var}(X) = \frac{(b - a + 1)^2 - 1}{12}$

Standard Deviation

$\sigma = \sqrt{\frac{(b - a + 1)^2 - 1}{12}}$

Cumulative Distribution Function (CDF)

$P(X \leq k) = \frac{k - a + 1}{b - a + 1}$ for $k = a, a+1, \dots, b$

Negative Hypergeometric Distribution

Probability Mass Function (PMF)

$P(X = k) = \frac{\binom{k - 1}{r - 1} \binom{N - k}{K - r}}{\binom{N}{K}}$

Mean (Expected Value)

$\mu = E[X] = \frac{r(N + 1)}{K + 1}$

Variance

$\sigma^2 = \frac{r (N + 1)(N - K)(N - r)}{(K + 1)^{2} (K + 2)}$

Standard Deviation

$\sigma = \sqrt{\frac{r (N + 1)(N - K)(N - r)}{(K + 1)^{2} (K + 2)}}$

Cumulative Distribution Function (CDF)

$P(X \leq k) = 1 - \frac{\binom{N - r}{k - r} \binom{r - 1}{r - 1}}{\binom{N}{k}}$

Logarithmic (Log-Series) Distribution

Probability Mass Function (PMF)

$P(X = k) = -\frac{1}{\ln(1 - p)} \cdot \frac{p^{k}}{k}$

Mean (Expected Value)

$\mu = E[X] = \frac{-p}{(1 - p) \ln(1 - p)}$

Variance

$\sigma^2 = \operatorname{Var}(X) = \frac{-p \left( p + \ln(1 - p) \right)}{(1 - p)^{2} \left[ \ln(1 - p) \right]^{2}}$

Standard Deviation

$\sigma = \sqrt{\operatorname{Var}(X)}$

Probability Generating Function

$G_X(s) = \frac{\ln(1 - p s)}{\ln(1 - p)}$
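Because $\sum_{k \ge 1} p^k/k = -\ln(1-p)$, the logarithmic PMF sums to 1, and the mean formula above can be checked numerically. A minimal sketch, with p = 0.5 chosen arbitrarily:

```python
from math import log

def log_pmf(k, p):
    """Logarithmic (log-series) PMF, support k = 1, 2, ..."""
    return -1 / log(1 - p) * p**k / k

p = 0.5
# The PMF sums to 1; the tail beyond k=200 is negligible at p=0.5.
total = sum(log_pmf(k, p) for k in range(1, 200))
assert abs(total - 1) < 1e-12

# Mean -p / ((1-p) ln(1-p)) agrees with the truncated expectation sum.
mean = -p / ((1 - p) * log(1 - p))
assert abs(sum(k * log_pmf(k, p) for k in range(1, 200)) - mean) < 1e-12
print(mean)
```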


Probability Terms and Definitions

Negative Binomial Distribution

A discrete distribution counting the number of trials needed to achieve a fixed number $r$ of successes in a sequence of independent Bernoulli trials with success probability $p$.

Probability

A function $P$ that assigns to each event $A$ in a sample space a real number $P(A) \in [0, 1]$ satisfying the probability axioms.

Random Experiment

A process or action whose outcome cannot be predicted with certainty before it is performed.

Sample Space

$\Omega = \{\omega_1, \omega_2, \ldots\}$ — the set of all possible outcomes of a random experiment.

Event

$A \subseteq \Omega$ — a subset of the sample space.

Elementary Event

An event consisting of exactly one outcome: $\{\omega\}$ where $\omega \in \Omega$.

Relative Frequency

$f_n(A) = \frac{\text{number of times } A \text{ occurs}}{n}$ where $n$ is the total number of trials.

Probability Measure

A function $P: \mathcal{F} \to [0,1]$ defined on a collection of events, satisfying non-negativity, normalization ($P(\Omega) = 1$), and countable additivity for disjoint events.

Equally Likely Events

Events $A_1, A_2, \ldots, A_n$ are equally likely when $P(A_1) = P(A_2) = \cdots = P(A_n)$.

Conditional Probability

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, defined when $P(B) > 0$.

Independent Events

Events $A$ and $B$ are independent if and only if $P(A \cap B) = P(A) \cdot P(B)$.

Mutual Exclusiveness

Events $A$ and $B$ are mutually exclusive if $A \cap B = \emptyset$.

Bernoulli Experiment

A random experiment with exactly two possible outcomes, conventionally called success ($S$) and failure ($F$).

Sequence of Bernoulli Trials

A sequence of independent Bernoulli experiments, each with the same success probability $p$.

Random Variable

$X: \Omega \to \mathbb{R}$ — a function that assigns a real number to each outcome in the sample space.

Discrete Random Variable

A random variable whose set of possible values is finite or countably infinite.

Continuous Random Variable

A random variable whose set of possible values forms an interval or union of intervals on the real line.

Cumulative Distribution Function

$F_X(x) = P(X \le x)$ for all $x \in \mathbb{R}$.

Probability Mass Function

$p_X(x) = P(X = x)$ — the probability that a discrete random variable $X$ takes the value $x$.

Probability Density Function

A function $f_X(x) \ge 0$ such that $P(a \le X \le b) = \int_a^b f_X(x)\,dx$ and $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.

Expected Value

$E[X] = \sum_x x \cdot p_X(x)$ (discrete) or $E[X] = \int_{-\infty}^{\infty} x \cdot f_X(x)\,dx$ (continuous).

Variance

$\operatorname{Var}(X) = E[(X - \mu)^2]$ where $\mu = E[X]$.

Standard Deviation

$\sigma_X = \sqrt{\operatorname{Var}(X)}$

Covariance

$\operatorname{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])]$

Correlation Coefficient

$\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}$, where $\sigma_X, \sigma_Y > 0$.

Conditional Expectation

$E[X \mid Y = y] = \sum_x x \cdot P(X = x \mid Y = y)$ (discrete) or $E[X \mid Y = y] = \int x \cdot f_{X|Y}(x \mid y)\,dx$ (continuous).

Conditional Variance

$\operatorname{Var}(X \mid Y = y) = E[(X - E[X \mid Y = y])^2 \mid Y = y]$

Moment of a Random Variable

The $k$-th moment of $X$ about the origin is $E[X^k]$. The $k$-th central moment is $E[(X - \mu)^k]$.

Bernoulli Distribution

A discrete distribution for a single trial with two outcomes: $P(X = 1) = p$ and $P(X = 0) = 1 - p$.

Binomial Distribution

A discrete distribution counting the number of successes in $n$ independent Bernoulli trials, each with success probability $p$.

Poisson Distribution

A discrete distribution modelling the number of events occurring in a fixed interval, where events happen independently at a constant average rate $\lambda$.

Discrete Uniform Distribution

A discrete distribution where each of $n$ possible values has equal probability $1/n$.

Exponential Distribution

A continuous distribution describing the time between events in a process where events occur independently at a constant rate $\lambda$.

Normal Distribution

A continuous distribution characterized by a symmetric, bell-shaped curve, fully determined by its mean $\mu$ and variance $\sigma^2$.

Geometric Distribution

A discrete distribution counting the number of trials needed to obtain the first success in a sequence of independent Bernoulli trials with success probability $p$.

Hypergeometric Distribution

A discrete distribution describing the number of successes in $n$ draws without replacement from a finite population containing $K$ successes and $N - K$ failures.

Bivariate Random Variable

A pair of random variables $(X, Y)$ defined on the same sample space, considered jointly.

N-Variate Random Variables

A vector $(X_1, X_2, \ldots, X_n)$ of $n$ random variables defined on the same sample space.

Independent Random Variables

Random variables $X$ and $Y$ are independent if $P(X \le x, Y \le y) = P(X \le x) \cdot P(Y \le y)$ for all $x, y$.

Orthogonal Random Variables

Random variables $X$ and $Y$ are orthogonal if $E[XY] = 0$.

Uncorrelated Random Variables

Random variables $X$ and $Y$ are uncorrelated if $\operatorname{Cov}(X, Y) = 0$, equivalently $E[XY] = E[X]E[Y]$.

Marginal Distribution

The distribution of one random variable obtained from a joint distribution by summing (discrete) or integrating (continuous) over all values of the other variable(s).

Joint Cumulative Distribution Function

$F_{X,Y}(x, y) = P(X \le x, Y \le y)$ for all $x, y \in \mathbb{R}$.

Joint Probability Mass Function

$p_{X,Y}(x, y) = P(X = x, Y = y)$ for discrete random variables $X$ and $Y$.

Joint Probability Density Function

A function $f_{X,Y}(x,y) \ge 0$ such that $P((X,Y) \in A) = \iint_A f_{X,Y}(x,y)\,dx\,dy$ for any region $A$.

Conditional Probability Mass Function

$p_{X|Y}(x \mid y) = \frac{p_{X,Y}(x, y)}{p_Y(y)}$, defined when $p_Y(y) > 0$.

Conditional Probability Density Function

$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$, defined when $f_Y(y) > 0$.

Function of a Random Variable

Given a random variable $X$ and a function $g$, $Y = g(X)$ defines a new random variable whose distribution is determined by the distribution of $X$ and the function $g$.

PDF of a Transformed Variable

If $Y = g(X)$ where $g$ is monotone and differentiable, then $f_Y(y) = f_X(g^{-1}(y)) \cdot |\frac{d}{dy} g^{-1}(y)|$.

Moment Generating Function

$M_X(t) = E[e^{tX}]$, defined for real values of $t$ where the expectation exists.

Venn Diagram

A graphical representation using overlapping circles to depict sets (events) and their relationships within a sample space.

Null Set

$\emptyset$ — the set containing no elements, representing an impossible event in probability.

Union of Sets

$A \cup B = \{\omega : \omega \in A \text{ or } \omega \in B\}$ — the event that at least one of $A$ or $B$ occurs.

Intersection of Sets

$A \cap B = \{\omega : \omega \in A \text{ and } \omega \in B\}$ — the event that both $A$ and $B$ occur simultaneously.

Disjoint Sets

Sets $A$ and $B$ are disjoint if $A \cap B = \emptyset$ — they share no common elements.

Complement of a Set

$A^c = \{\omega \in \Omega : \omega \notin A\}$ — all outcomes in the sample space that are not in $A$.

Probability Tree

A branching diagram where each node represents a stage of a sequential random process, branches represent possible outcomes, and branch labels show conditional probabilities.

Browse probability terminology, including main concepts and their definitions with examples. This structured guide to probability theory terms progresses from foundational definitions through set theory, random variables, and complex distributions. The content covers both theoretical aspects and practical applications, making probability concepts more accessible for study and reference.

Main Concepts

The sample space (Ω) is the collection of all different results that an experiment may produce.

The sample space can be finite (for example, rolling a die or flipping a coin) or infinite (for instance, selecting a real number).
In addition, sample spaces may be divided by outcome type into discrete and continuous.
Defining or calculating the proper sample space for a given case can pose a serious challenge and demands experience and certain analytic skills.
Although in the case of a die roll the collection of possible outcomes may seem self-evident, the sample space plays an important role in more complex experiments. Typically, a researcher partitions the sample space into subsets in order to draw various conclusions.
In any practical application, accurately defining the sample space is essential to solving probability problems.

In probability theory, the objects to which probabilities are assigned are called events.
By definition, an event is any subset of the sample space. This includes single outcomes, collections of outcomes, the empty set, and the entire sample space itself.

Example:
For a die roll, the sample space is $S = \{1, 2, 3, 4, 5, 6\}$.
Some possible events:
Event $A = \{2, 4, 6\}$ (rolling an even number)
Event $B = \{5\}$ (rolling exactly 5)
Event $C = \{1, 2, 3, 4, 5, 6\}$ (any outcome: the certain event)
Event $D = \{\}$ (the impossible event)
As the definition states and the example shows, an event may include one or more outcomes. It is a set of results that counts as one event.
Probability is a function that assigns to each event in the sample space a real number in $[0, 1]$, where the probability of the entire sample space is $P(S) = 1$.

For equally likely outcomes, this number is calculated as the ratio $P(E) = \frac{\text{Number of favorable outcomes for event } E}{\text{Total number of possible outcomes in the sample space } S}$

The probability function satisfies the three basic axioms of probability.
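The die-roll example above can be expressed directly in code, modeling events as sets and the probability function as a ratio of counts. A minimal sketch using only built-in types:

```python
# Classical probability on a finite sample space: a single die roll.
S = {1, 2, 3, 4, 5, 6}

def P(event):
    """P(E) = |favorable outcomes| / |S| for equally likely outcomes."""
    return len(event & S) / len(S)

A = {2, 4, 6}   # rolling an even number
B = {5}         # rolling exactly 5

assert P(S) == 1        # the certain event
assert P(set()) == 0    # the impossible event
print(P(A), P(B))
```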

Set Theory & Event Algebra

When we conceptualize probability, we naturally think of sample spaces as collections of individual outcomes—dots scattered across our mathematical landscape.
Sample Space Ω (Sample Space) Each dot represents an elementary event (outcome) Ω = {ω₁, ω₂, ω₃, ..., ωₙ} where each ωᵢ is an elementary event The sample space contains all possible outcomes of an experiment
However, this intuitive picture presents a fundamental challenge: if we treat each outcome as a geometric point, it has zero area by definition.
Consider the classical probability formula:
P(E) = \frac{\text{Number of favorable outcomes for event } E}{\text{Total number of possible outcomes in the sample space } S}.
If we literally counted individual points (dots), each with zero "probability mass," we'd face the paradox that every single outcome has probability zero, yet their sum must equal one.
[Figure: The zero probability paradox. A single outcome ωᵢ drawn as a point has zero area, so the classical formula P({ωᵢ}) = 1/|Ω| degenerates to 0, yet the sum of all individual probabilities must equal 1 - a contradiction that shows we cannot work with individual points. The solution is to work with sets of outcomes: for an event A, P(A) = |A| / |Ω| > 0, giving meaningful probabilities.]

This apparent contradiction reveals why probability theory fundamentally operates with sets rather than isolated points. We don't manipulate individual outcomes; instead, we work with collections or groups of outcomes. An event isn't a single dot—it's a set of possible outcomes that satisfy our condition of interest.

This set-theoretic foundation makes perfect sense: when we ask "what's the probability of rolling an even number on a die," we're really asking about the set \{2, 4, 6\}, not about individual outcomes in isolation.

By treating events as sets, we gain access to the full power of set theory and algebra of sets laws for probability calculations. This mathematical framework provides elegant tools for combining and manipulating events—operations like union and intersection become natural ways to express complex probabilistic relationships, while concepts such as subsets and complements offer systematic approaches to analyzing event dependencies and exclusions.
[Figure: Events as sets in probability theory. The circle represents event A as a set of outcomes inside the sample space Ω; probability appears as a ratio of areas, P(A) = Area of A / Area of Ω.]
To visualize these relationships between events-as-sets, we use Venn diagrams—powerful tools that illustrate unions, intersections, complements, and other set operations that form the algebraic backbone of probability theory.

Basic Axioms of Probability

The three Kolmogorov axioms provide a minimal yet complete framework for assigning consistent probabilities to events, laying the groundwork for all of probability theory. From these principles flow essential rules—such as the addition rule for disjoint events, the definition of conditional probability, and Bayes’ theorem—as well as many useful corollaries that drive rigorous problem‐solving in statistics, science, and engineering.
  • Non-negativity axiom
    For any event A, 0 ≤ P(A) ≤ 1.
  • Normalization axiom
    P(S) = 1, meaning the probabilities of all possible outcomes in S sum exactly to 1.
  • Countable additivity axiom
    If A₁, A₂, … are disjoint, then P(⋃ᵢ Aᵢ) = ∑ᵢ P(Aᵢ).
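As a minimal sketch, the three axioms can be checked directly for a finite distribution in Python; the uniform die weights below are an illustrative assumption:

```python
from fractions import Fraction

# Hypothetical probability assignment on the die sample space: a fair die
P = {1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 6),
     4: Fraction(1, 6), 5: Fraction(1, 6), 6: Fraction(1, 6)}

def prob(event):
    """P of an event = sum of the weights of its outcomes."""
    return sum(P[w] for w in event)

# Non-negativity: every weight lies in [0, 1]
assert all(0 <= p <= 1 for p in P.values())

# Normalization: P(S) = 1
assert sum(P.values()) == 1

# Additivity for the disjoint events A = {1, 2} and B = {5}
A, B = {1, 2}, {5}
assert A.isdisjoint(B)
assert prob(A | B) == prob(A) + prob(B)
```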
Learn More

Rules of Probability

Probability rules translate the axioms of probability into practical tools for quantifying uncertainty. By systematically combining events—through complements, unions, intersections, and conditioning—they form the backbone of both classical (combinatorial) analyses and more advanced topics.
With these rules in hand, you’re ready to tackle sections on combinatorial models, discrete and continuous distributions, conditional probability, Bayesian inference, and beyond. Refer back to our overall probability breakdown to see how each subtopic weaves together in your learning journey.
Learn More

Combinatorial Probability

Why Combinatorial Counting Remains Essential
Even with powerful general tools—probability distributions, conditional‐probability identities, and set algebra theorems—the direct application of combinatorial principles is often the most effective method:

Fundamental Combinatorial Rules
Employ the basic counting principle, permutations P(n,k), combinations \binom{n}{k}, and related identities (e.g. the binomial coefficient) to enumerate equally likely outcomes directly.
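Python's standard library exposes these counting functions directly (`math.perm` and `math.comb`). The card-hand scenario below is an illustrative assumption, not taken from the text:

```python
import math

# P(n, k): ordered arrangements; C(n, k): unordered selections
assert math.perm(5, 2) == 20   # 5 * 4 ordered pairs
assert math.comb(5, 2) == 10   # 20 / 2! unordered pairs

# Illustrative example: probability that a 5-card hand drawn from a
# standard 52-card deck contains all four aces
favorable = math.comb(4, 4) * math.comb(48, 1)  # the 4 aces, plus 1 other card
total = math.comb(52, 5)                        # all possible 5-card hands
p = favorable / total
print(f"P(all four aces) = {favorable}/{total} = {p:.6f}")
```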

Simplicity
When all outcomes share equal likelihood, computing \binom{n}{k} or n \times (n-1) \times \cdots is more straightforward than constructing full distribution tables or applying Bayes’s theorem.

Transparency
A step-by-step combinatorial argument—through case analysis or symmetry—makes explicit how each arrangement or selection contributes to the overall probability, avoiding opaque algebraic manipulation.

Efficiency for Small Sample Spaces
In problems involving a modest number of cards, dice, or slots, direct computation of permutations or combinations typically requires fewer conceptual steps than invoking general-purpose formulas.

Conceptual Insight
Deriving results via combinatorial identities deepens understanding of why certain events are more prevalent, reinforcing intuition that may be obscured by formulaic approaches.

Problem-Specific Customization
Combinatorics allows tailored strategies—case distinctions, bijective mappings, or the inclusion–exclusion principle—adapted to a problem’s unique constraints, rather than forcing it into a universal template.

Random Variables and Distributions

As we defined earlier, the sample space S is the full list of “all that can happen” in a given experiment.
But are all outcomes equally likely?
The answer is: it depends.
As we know from everyday experience, some experiments—like flipping a fair coin or rolling a fair die—assign the same probability to each outcome, while in others certain outcomes carry more weight.
To capture how those weights are assigned—and how they change when we look at different features of the same experiment—we need the formal notion of a probability distribution.
In many problems, interest centers not on the raw outcomes themselves but on some numerical feature of those outcomes—what we call a random variable.
A Random Variable is simply a rule that assigns a number to every elementary outcome in the sample space.

By doing this, it becomes possible to talk about averages, variances and more, using the full toolbox of arithmetic and calculus.
The Probability Distribution of a random variable then tells us how likely each numerical value is to happen.

It does this by gathering together all the elementary outcomes that map to the same number (or fall into the same range) and adding up their probabilities. Even when every outcome in the sample space is equally likely—say, each face of a fair die—different choices of random variable (for example, the face value itself versus “even or odd,” or “number of sixes in two rolls”) will group those outcomes differently. As a result, each of those measurements has its own distinct distribution, reflecting the particular way it “reads” the experiment.
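This grouping step can be sketched in a few lines of Python (the helper name `distribution` is hypothetical). Note how two different random variables on the same fair-die sample space yield different distributions:

```python
from fractions import Fraction
from collections import defaultdict

# Equally likely faces of a fair die
outcomes = [1, 2, 3, 4, 5, 6]

def distribution(random_variable):
    """Group outcomes by the value the random variable assigns to them,
    summing the probabilities of outcomes that map to the same value."""
    dist = defaultdict(Fraction)   # Fraction() == 0
    for w in outcomes:
        dist[random_variable(w)] += Fraction(1, 6)
    return dict(dist)

face_value = distribution(lambda w: w)      # uniform: each value has P = 1/6
parity = distribution(lambda w: w % 2)      # "even or odd": {1: 1/2, 0: 1/2}
```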
At its heart, working with a probability distribution is simply about deciding how to spread out your “degree of belief” over every thing that could happen, and then using that spread to answer questions about uncertainty.

Assigning weight: You begin by giving each possible outcome a nonnegative number (its weight), in such a way that all the weights add up to one.
Capturing uncertainty: Those weights encode exactly how confident you are in each outcome, from “almost impossible” to “almost certain.”
Calculating what matters: Once the weights are set, you can systematically compute things like “how much total weight falls in this region of outcomes,” or “what’s the average we’d expect,” or “how wildly outcomes vary.”
Guiding decisions: With those calculations in hand, you can compare different spreads of belief, choose actions that maximize your expected gain, or measure how risky a plan is.
There are many different probability distributions—each with its own characteristic pattern—but they can be broadly classified into two main categories: discrete distributions, which assign probabilities to countable outcomes, and continuous distributions, which use density functions over intervals of real numbers.
[Figure: Discrete distributions are used for countable data; continuous distributions are used for measurable data.]
Learn More

Conditional Probability & Independence

Conditional probability is simply the chance of something happening once you already know something else has happened. It tells you how your outlook on an event shifts when you gain new information about another event's occurrence.

When you learn about conditional probability, you’re really seeing how knowing that one event happened (B) changes your “bet” on another event (A). That change (or lack of change) is exactly what we mean by dependence or independence:

Dependent events:
If knowing that B occurred does change your opinion about A, then A and B are dependent.
P(A \mid B) \neq P(A).
Or in simple words: “Once I see B happen, my chance of A goes up or down compared to what I thought before.”

Independent events:
If knowing that B occurred doesn’t change your opinion about the likelihood of A, then A and B are independent.
P(A \mid B) = P(A)
Equivalently, the fact that B happened gives you zero new information about A.

The multiplication rule for independent events is:
P(A \cap B) = P(A)\,P(B)
The intuition behind it:
If two events don’t influence each other, the probability that both happen is just the product of their individual chances. In other words, to find the chance of A and B occurring together, you multiply P(A) by P(B).
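A minimal sketch of the multiplication rule, verified by exhaustive enumeration over two fair die rolls (the specific events chosen here are illustrative assumptions):

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair die rolls: 36 equally likely ordered pairs
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in S if event(w)), len(S))

A = lambda w: w[0] % 2 == 0   # first roll is even
B = lambda w: w[1] == 6       # second roll is a six

# The rolls do not influence each other, so the product rule holds
p_A, p_B = prob(A), prob(B)
p_both = prob(lambda w: A(w) and B(w))
assert p_both == p_A * p_B    # 1/2 * 1/6 == 1/12
```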

Learn More

Probability Function

Almost every probability topic ultimately relies on a single idea: a rule that describes how probability is distributed across all possible outcomes of a random variable.
This rule is called a probability function, and it is the mathematical relation that turns a vague notion of uncertainty into something precise and analyzable.

A probability function specifies how likely each possible value of a random variable is.
It determines how probability is distributed across the outcomes and forms the basis on which all familiar probability distributions are built. Whether we study simple models like coin tossing or more structured distributions that arise in statistics, the probability function is always the mechanism operating underneath.
[Figure: The central role of the probability function. A random experiment poses the vague question "how likely?"; the probability function P(X = x) is the mathematical rule that assigns a probability P(xᵢ) to each outcome xᵢ, forms the basis of probability distributions (binomial, Poisson, normal, etc.), and satisfies 0 ≤ P(x) ≤ 1 and ΣP(x) = 1, transforming vague uncertainty into mathematical precision.]
Because of its central role, this concept receives its own dedicated page, where the idea is developed more formally and connected to the structure of probability distributions.
If you want to understand what a distribution really is, the probability function is the natural place to begin.
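As a minimal sketch, a discrete probability function can be represented as a mapping from values to probabilities; the two-coin PMF below is an illustrative assumption:

```python
from fractions import Fraction

# Hypothetical PMF for X = number of heads in two fair coin flips
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Key properties of any probability function:
assert all(0 <= p <= 1 for p in pmf.values())  # 0 <= P(x) <= 1
assert sum(pmf.values()) == 1                  # sum of P(x) over all x is 1

# Query it like any other function: P(X = 1)
p_one_head = pmf[1]
print(p_one_head)  # 1/2
```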
Learn More

Probability Symbols Reference

Our Probability Symbols page delivers a systematic reference for notation used in probability theory and statistics. This collection serves as an essential guide for students and professionals working with statistical concepts.
The reference organizes symbols into practical categories including probability notations (P(A), P(A|B)), random variables and distributions (f_X(x), F_X(x)), and common distribution families (Bin(n,p), N(μ,σ²)). It extends to advanced topics like statistical measures (E(X), Var(X)), hypothesis testing parameters (H₀, α, p-value), and information theory metrics (H(X), I(X;Y)).
Specialized sections cover moment generating functions (M_X(t)), key probability inequalities (Markov's, Chebyshev's), Bayesian methods, and regression analysis notation—all presented with precise LaTeX formatting to support academic writing and research in probability and statistics.
Learn More

Visual Tools

Visual probability tools transform abstract mathematical concepts into interactive experiences that build intuitive understanding. By manipulating parameters and observing real-time changes in distributions, sample spaces, and probability outcomes, you develop the kind of deep, geometric intuition that makes probability theory truly click—moving beyond memorized formulas to genuine comprehension of how randomness behaves.
Explore our collection of interactive probability visualizers—each tool designed to make complex concepts tangible through hands-on experimentation:

Coin Toss Visualizers




🪙

Coin Flipper Simulator

Watch the Law of Large Numbers come alive as coin flips converge to their expected probability in real-time.

This interactive simulator lets you flip a fair or biased coin thousands of times and observe how the actual proportion of heads gravitates toward the theoretical probability. The convergence graph displays your results against confidence bands, showing that while short-term randomness creates wild fluctuations, long-term patterns are remarkably predictable. Track streaks, analyze z-scores, and experiment with different probabilities to build deep intuition about how randomness behaves at scale—transforming abstract statistical concepts like variance and standard deviation into tangible, visual experiences.

🪙🪙

Coin Sample Space Visualizer

See every possible outcome when flipping multiple coins and instantly calculate probabilities for any pattern or condition.

This visualizer generates the complete sample space for 1-6 coin flips, displaying all possible outcomes as an interactive grid. Highlight specific events—like "exactly 2 heads," "runs of 3 tails," or "alternating patterns"—and watch as the tool instantly calculates favorable outcomes and probabilities. By making abstract counting principles visual and interactive, you build concrete intuition for how sample spaces work, why we count outcomes the way we do, and how theoretical probabilities emerge from the structure of equally likely events.

Dice Roll Visualizers




🎲

Dice Roll Simulator

Roll multiple dice thousands of times and watch distributions emerge from randomness as sums converge to their theoretical expected values.

This simulator lets you roll 1-6 standard dice simultaneously, accumulating results to reveal how frequency distributions match theoretical probability patterns. Compare actual histograms against expected distributions, track how average sums converge to 3.5 per die, and analyze variance and z-scores across thousands of rolls. Whether exploring the Central Limit Theorem or understanding why certain sums are more common in board games, this tool transforms abstract dice probability into concrete visual patterns you can manipulate and observe in real-time.

🎲🎲

Dice Sample Space Explorer

Visualize every possible dice outcome and filter by complex conditions to calculate exact probabilities for any event you can imagine.

This explorer generates the complete sample space for 1-4 dice rolls, displaying all possible outcomes with interactive dice visuals and letting you highlight events using sophisticated filters—from simple conditions like "sum equals 7" to complex patterns like "all dice in ascending order" or "at least two 6s with an even sum." Watch favorable outcomes light up as you adjust parameters, see probability calculations update instantly, and build intuition for why certain dice combinations are more likely than others. Perfect for understanding conditional probability, combinatorics, and the mathematical structure underlying dice games.

Venn Diagrams Visualizers




2-Set Venn Diagram Solver

Solve probability problems involving two events by visualizing how marginal probabilities and constraints determine all four regions of a Venn diagram.

This interactive solver takes marginal probabilities for two events and additional constraints (like intersection or complement probabilities) and calculates the exact probability for all four regions of the sample space. Click diagram segments to highlight specific outcomes, view step-by-step calculations showing how each region is derived from the given information, and explore multiple real-world scenarios from student surveys to medical testing. Perfect for understanding how conditional information propagates through a probability space and building intuition for the algebra of events.

3-Set Venn Diagram Solver

Tackle complex three-event probability problems by solving systems of equations to find all eight regions where events intersect and complement each other.

This advanced solver handles the full complexity of three-event probability problems, where you must determine eight distinct regions from marginal probabilities and intersection constraints. Watch as the tool works through systems of equations to compute values like P(A∩B∩C) and P(A∩Bᶜ∩C), with detailed calculation steps showing the logical dependency chain from given information to derived probabilities. The interactive Venn diagram lets you click any of the eight segments to see its specific calculation, making the abstract algebra of three-set problems concrete and visual. Essential for mastering advanced probability problems involving multiple overlapping events.

Variance Visualizer




📈

Interactive Variance Calculator

Drag data points and watch variance update in real-time as deviations from the mean become visually tangible through interactive bars and step-by-step calculations.

This powerful visualizer transforms the abstract concept of variance into something you can literally see and feel. Drag points up or down on the chart to instantly see how each value's distance from the mean contributes to overall spread—colored deviation bars grow and shrink dynamically, squared deviations update in the table, and the complete calculation chain unfolds step-by-step on the right. Switch between population (σ²) and sample (s²) variance to understand why we divide by n-1, explore presets that demonstrate low versus high variance datasets, and hover over any element for contextual tooltips. Perfect for building deep intuition about why we square deviations, how outliers disproportionately affect variance, and what standard deviation really measures.

Expected Value Visualizers




⚖️

Weighted Average Expected Value

See how probabilities literally "pull" the expected value toward them through an intuitive weight-and-force visualization that makes weighted averages tangible.

This unique visualizer uses a physical metaphor to show why expected value is called a weighted average. Each outcome appears as a "weight" on a number line, with circle size and arrow thickness representing probability—high-probability outcomes exert stronger "pull" on the expected value. Watch the blue E(X) marker get tugged toward heavy weights while comparing it to the gray unweighted average that ignores probabilities entirely. Use the animation feature to cycle through preset distributions, or manually select scenarios like "Strong Right Bias" to see how asymmetric probabilities shift expected value dramatically. Perfect for building visceral intuition about why E(X) ≠ simple average and how probability weighting fundamentally changes where the "center" of a distribution lies.

📊

Discrete Expected Value Calculator

Adjust probability distributions with sliders and watch the expected value update instantly as bar heights and contributions recalculate in real-time.

This interactive calculator displays discrete probability distributions as bar charts with a distinctive red dashed line marking the expected value position. Drag probability sliders for each outcome and watch automatic normalization ensure probabilities sum to 1.0, while each bar simultaneously displays three key pieces of information: the probability P(X=x), the contribution x·P(X=x), and the visual height representing likelihood. The formula E[X] = Σ x·P(X=x) becomes concrete as you see individual contributions sum to the total expected value, and the red line shifts left or right based on your probability adjustments. Ideal for experimenting with different distributions and understanding how expected value emerges from the weighted sum of outcomes.

Probability Inequalities Visualizers




Markov Inequality Visualizer

Watch how Markov's inequality bounds tail probabilities using only the expected value—no variance or distribution shape required.

This visualizer demonstrates Markov's inequality (P(X ≥ a) ≤ E[X]/a) across nine different distributions, both continuous and discrete. Adjust the expected value and threshold to see the bound in action, with red shaded regions showing the actual tail probability beyond your threshold. The tool highlights a critical insight: when a ≤ E[X], the bound becomes useless (≥1), warning you with red alerts. Watch how the bound gets tighter as the threshold increases relative to the mean, and compare actual probabilities to theoretical bounds across exponential, normal, Poisson, binomial, and other distributions. Perfect for understanding when Markov provides useful information and when it's too loose to be practical.

Chebyshev Inequality Visualizer

Explore how Chebyshev's inequality bounds deviations from the mean using variance, providing guarantees that work for any distribution.

This powerful visualizer shows Chebyshev's inequality (P(|X - μ| ≥ a) ≤ σ²/a²) in action, highlighting both tail regions symmetrically around the mean. Unlike Markov, Chebyshev uses variance to provide much tighter bounds on how far values can stray from the mean—regardless of distribution shape. Adjust mean, variance, and deviation threshold independently to see how the bound tightens as you move further from center or as variance decreases. Red shaded regions show both tails combined, demonstrating that at least (1 - σ²/a²) of the probability mass must lie within a standard deviations of the mean. Test with normal, exponential, uniform, Poisson, and other distributions to see this distribution-free bound consistently hold, building intuition for why Chebyshev is fundamental to concentration inequalities.

Learn More