Probability
Introduction to Probability Section
Probability is a field of mathematics that deals with uncertainty and provides tools to measure and analyze how likely events are to occur. It begins with basic concepts such as outcomes, events, and sample spaces, forming the foundation for calculating likelihoods. Central to probability is the concept of probability measures, which assign values between 0 and 1 to events, indicating their likelihood. A value of 0 means an event is impossible, while 1 signifies certainty. Key principles include independence (events that do not influence each other) and conditional probability, which considers the likelihood of an event given that another has occurred.
Probability also introduces random variables, which assign numerical values to outcomes. These variables are categorized as either discrete (taking specific values, like the result of rolling a die) or continuous (taking any value within a range, like a measured temperature). Important measures such as expectation (average value) and variance (spread or variability) are used to summarize the behavior of random variables.
Advanced topics include distributions, such as the binomial, normal, and Poisson distributions, which model specific types of random phenomena. These tools are essential for understanding patterns in random processes and making informed predictions.
Probability is widely applied in science, engineering, finance, and everyday decision-making. It forms the basis for statistics, enabling data-driven insights and predictions, and supports fields like machine learning, risk analysis, and quantum mechanics. By studying probability, students develop skills to reason about uncertainty and draw conclusions from incomplete information.
Probability Formulas
This page presents essential probability formulas organized by categories, ranging from basic principles to advanced distributions. Each formula includes detailed explanations, example calculations, and practical use cases, making it a helpful resource for students and practitioners working with probability theory and statistical analysis.
Probability Terms and Definitions
Browse probability terminology, including the main concepts and their definitions with examples. A structured guide to probability theory terms and concepts, progressing from foundational definitions through set theory, random variables, and complex distributions. The content covers both theoretical aspects and practical applications, making probability concepts more accessible for study and reference.
Main Concepts
The sample space (Ω) is the collection of all distinct results that the experiment may produce.
The sample space can be finite (for example, in the dice‐rolling scenario or coin flipping) or infinite (for instance, selecting a real number).
In addition, we may divide sample spaces by outcome types into discrete or continuous.
Defining or calculating the proper sample space for a given case can pose a serious challenge and demands experience and analytical skill.
Although in the case of a dice roll the collection of possible outcomes may seem self‐evident, the sample space plays an important role in conducting more complex experiments. Typically, a researcher will take the sample space and partition it into subsets in order to draw various conclusions.
In any practical application, accurately defining the sample space is essential to solving probability problems.
A probability event is simply any subset of the sample space.
Example:
For a single roll of a die, the sample space is Ω = {1, 2, 3, 4, 5, 6}.
Some possible events:
Event {2, 4, 6} (rolling an even number)
Event {5} (rolling exactly 5)
Event Ω = {1, 2, 3, 4, 5, 6} (any outcome - certain event)
Event ∅ (impossible event)
As the definition states and the example shows, an event may contain one outcome, several outcomes, or (for the impossible event) none. It is a set of results counted as one event.
Probability is a function that assigns to each event in the sample space a real number in [0, 1], where the total probability of the entire sample space is P(Ω) = 1.
When all outcomes are equally likely, this number is calculated as the ratio P(A) = (number of outcomes in A) / (number of outcomes in Ω).
The probability function satisfies the three basic axioms of probability.
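The die example can be sketched in a few lines of Python. This is only an illustration under the assumption that the die is fair; the helper name `prob` and the variable names are made up for this sketch and are not part of any library.

```python
from fractions import Fraction

# Sample space for one roll of a six-sided die (assumed fair).
omega = frozenset({1, 2, 3, 4, 5, 6})


def prob(event, space=omega):
    """Classical probability: favorable outcomes / total outcomes.

    Only meaningful when every outcome in `space` is equally likely.
    """
    if not event <= space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(space))


even = frozenset({2, 4, 6})   # rolling an even number
five = frozenset({5})         # rolling exactly 5
certain = omega               # any outcome
impossible = frozenset()      # no outcome at all

for name, event in [("even", even), ("five", five),
                    ("certain", certain), ("impossible", impossible)]:
    print(name, prob(event))
```

Using `Fraction` keeps the results exact (1/2, 1/6, 1, 0) instead of rounding them to floats.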
Set Theory & Event Algebra
When we conceptualize probability, we naturally think of sample spaces as collections of individual outcomes—dots scattered across our mathematical landscape.
However, this intuitive picture presents a fundamental challenge: if we treat each outcome as a geometric point, it has zero area by definition.
Consider the classical probability formula:
P(A) = (number of favorable outcomes) / (total number of outcomes).
If we literally counted individual points (dots), each with zero "probability mass," we'd face the paradox that every single outcome has probability zero, yet their sum must equal one.
This apparent contradiction reveals why probability theory fundamentally operates with sets rather than isolated points. We don't manipulate individual outcomes; instead, we work with collections or groups of outcomes. An event isn't a single dot—it's a set of possible outcomes that satisfy our condition of interest.
This set-theoretic foundation makes perfect sense: when we ask "what's the probability of rolling an even number on a die," we're really asking about the set {2, 4, 6}, not about individual outcomes in isolation.
By treating events as sets, we gain access to the full power of set theory and algebra of sets laws for probability calculations. This mathematical framework provides elegant tools for combining and manipulating events—operations like union and intersection become natural ways to express complex probabilistic relationships, while concepts such as subsets and complements offer systematic approaches to analyzing event dependencies and exclusions.
To visualize these relationships between events-as-sets, we use Venn diagrams—powerful tools that illustrate unions, intersections, complements, and other set operations that form the algebraic backbone of probability theory.
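As a small companion to the diagrams, the same set operations can be tried directly with Python's built-in `set` type. The second event `high` (a roll of at least 4) is a hypothetical example introduced here only for illustration.

```python
# Sample space for one die roll and two events on it.
omega = set(range(1, 7))
even = {2, 4, 6}
high = {4, 5, 6}   # hypothetical second event: roll of at least 4

union = even | high              # outcomes in either event
intersection = even & high       # outcomes in both events
complement_even = omega - even   # outcomes not in `even`

# De Morgan's law: the complement of a union is the
# intersection of the complements.
de_morgan_holds = (omega - (even | high)) == ((omega - even) & (omega - high))

print(sorted(union), sorted(intersection), sorted(complement_even))
print(de_morgan_holds)
```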
Basic Axioms of Probability
The three Kolmogorov axioms provide a minimal yet complete framework for assigning consistent probabilities to events, laying the groundwork for all of probability theory. From these principles flow essential rules—such as the addition rule for disjoint events, the definition of conditional probability, and Bayes’ theorem—as well as many useful corollaries that drive rigorous problem‐solving in statistics, science, and engineering.
- Non-negativity axiom: For any event A, P(A) ≥ 0. Combined with the other two axioms, this gives 0 ≤ P(A) ≤ 1.
- Normalization axiom: P(S) = 1, meaning the total probability of the sample space S is exactly 1.
- Countable additivity axiom: If A₁, A₂, … are disjoint, then P(⋃ᵢ Aᵢ) = ∑ᵢ P(Aᵢ).
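On a finite sample space the axioms can be checked mechanically. The sketch below uses a hypothetical loaded die (the weights are invented for illustration) and verifies non-negativity, normalization, and additivity for one pair of disjoint events. It demonstrates the axioms on an example and is not a proof.

```python
from fractions import Fraction

# Hypothetical loaded die: face 6 comes up half the time (assumed weights).
weights = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
           4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}


def prob(event):
    """Probability of an event as the sum of its outcome weights."""
    return sum((weights[outcome] for outcome in event), Fraction(0))


# Non-negativity: every weight, hence every event, is >= 0.
non_negative = all(w >= 0 for w in weights.values())

# Normalization: the whole sample space has probability 1.
normalized = prob(weights.keys()) == 1

# Additivity for two disjoint events.
a, b = {1, 2}, {6}
additive = a.isdisjoint(b) and prob(a | b) == prob(a) + prob(b)

print(non_negative, normalized, additive)
```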
Rules of Probability
Probability rules translate the axioms of probability into practical tools for quantifying uncertainty. By systematically combining events—through complements, unions, intersections, and conditioning—they form the backbone of both classical (combinatorial) analyses and more advanced topics.
Basic Axiomatic Properties
Set-Operation Rules
- Complement Rule: The probability of the complement of A equals one minus the probability of A, that is, P(Aᶜ) = 1 − P(A).
- Subset Absorption: If B ⊆ A, then A ∩ B has probability P(B) and A ∪ B has probability P(A).
- Complement Absorption: If A ⊆ Bᶜ, then A ∩ Bᶜ has probability P(A) and A ∪ Bᶜ has probability P(Bᶜ).
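The three set-operation rules can be confirmed on a fair die. The events chosen below are arbitrary examples that satisfy the stated subset conditions, and `prob` and `complement` are helper names assumed for this sketch.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))   # fair six-sided die (assumption)


def prob(event):
    return Fraction(len(event), len(omega))


def complement(event):
    return omega - event


a = frozenset({2, 4, 6})
b = frozenset({2, 4})            # B is a subset of A

# Complement rule: P(A^c) = 1 - P(A)
complement_rule = prob(complement(a)) == 1 - prob(a)

# Subset absorption: B ⊆ A gives P(A ∩ B) = P(B) and P(A ∪ B) = P(A)
subset_absorption = prob(a & b) == prob(b) and prob(a | b) == prob(a)

# Complement absorption: C ⊆ B^c gives P(C ∩ B^c) = P(C), P(C ∪ B^c) = P(B^c)
c = frozenset({1, 3})
b_c = complement(b)              # {1, 3, 5, 6}
complement_absorption = (c <= b_c
                         and prob(c & b_c) == prob(c)
                         and prob(c | b_c) == prob(b_c))

print(complement_rule, subset_absorption, complement_absorption)
```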
Additive & Inequality Rules
- Inclusion–Exclusion Principle: General method for computing P(⋃ᵢ Aᵢ) by alternately adding and subtracting the probabilities of intersections.
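A minimal sketch of inclusion-exclusion, assuming a number is drawn uniformly from 1 to 30 and the events are "divisible by 2", "divisible by 3", and "divisible by 5". The function name `inclusion_exclusion` is illustrative. The result is compared against the probability of the union computed directly.

```python
from fractions import Fraction
from itertools import combinations

omega = set(range(1, 31))   # integers 1..30, assumed equally likely


def prob(event):
    return Fraction(len(event), len(omega))


def inclusion_exclusion(events):
    """P(A1 ∪ ... ∪ An) by alternately adding and subtracting intersections."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = 1 if k % 2 == 1 else -1       # + for odd k, - for even k
        for group in combinations(events, k):
            total += sign * prob(set.intersection(*group))
    return total


div2 = {n for n in omega if n % 2 == 0}
div3 = {n for n in omega if n % 3 == 0}
div5 = {n for n in omega if n % 5 == 0}
events = [div2, div3, div5]

via_formula = inclusion_exclusion(events)
direct = prob(set.union(*events))
print(via_formula, direct)
```

The formula visits every non-empty subset of the events, so its cost grows as 2ⁿ. It is practical for a handful of events, not for hundreds.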
Multiplicative & Conditional Rules
- Multiplication & Chain Rules: Compute joint probabilities via P(A ∩ B) = P(B)P(A|B) and its n-term extension.
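To illustrate the multiplication rule, the sketch below takes a standard 52-card deck with 4 aces, lets B be "the first card is an ace" and A be "the second card is an ace" (drawing without replacement), and checks P(A ∩ B) = P(B)P(A|B) against a brute-force count over all ordered pairs. The deck encoding is an assumption made for brevity.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

# Only the ace / non-ace distinction matters here.
deck = ["ace"] * 4 + ["other"] * 48

p_first_ace = Fraction(4, 52)                # P(B)
p_second_ace_given_first = Fraction(3, 51)   # P(A | B): 3 aces left in 51 cards
p_both_aces = p_first_ace * p_second_ace_given_first

# Brute force: count ordered pairs of distinct cards that are both aces.
ordered_pairs = list(permutations(range(len(deck)), 2))
favorable = sum(1 for i, j in ordered_pairs
                if deck[i] == "ace" and deck[j] == "ace")
brute_force = Fraction(favorable, len(ordered_pairs))

# n-term chain rule: three aces in a row.
p_three_aces = prod([Fraction(4, 52), Fraction(3, 51), Fraction(2, 50)])

print(p_both_aces, brute_force, p_three_aces)
```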
With these rules in hand, you’re ready to tackle sections on combinatorial models, discrete and continuous distributions, conditional probability, Bayesian inference, and beyond. Refer back to our overall probability breakdown to see how each subtopic weaves together in your learning journey.
Combinatorial Probability
Why Combinatorial Counting Remains Essential
Even with powerful general tools—probability distributions, conditional‐probability identities, and set algebra theorems—the direct application of combinatorial principles is often the most effective method:
Fundamental Combinatorial Rules
Employ the basic counting principle, permutations P(n, k), combinations C(n, k), and related identities (e.g. the binomial coefficient) to enumerate equally likely outcomes directly.
Simplicity
When all outcomes share equal likelihood, computing the relevant permutation or combination count is more straightforward than constructing full distribution tables or applying Bayes’ theorem.
Transparency
A step-by-step combinatorial argument—through case analysis or symmetry—makes explicit how each arrangement or selection contributes to the overall probability, avoiding opaque algebraic manipulation.
Efficiency for Small Sample Spaces
In problems involving a modest number of cards, dice, or slots, direct computation of permutations or combinations typically requires fewer conceptual steps than invoking general-purpose formulas.
Conceptual Insight
Deriving results via combinatorial identities deepens understanding of why certain events are more prevalent, reinforcing intuition that may be obscured by formulaic approaches.
Problem-Specific Customization
Combinatorics allows tailored strategies—case distinctions, bijective mappings, or the inclusion–exclusion principle—adapted to a problem’s unique constraints, rather than forcing it into a universal template.
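Two small examples show the counting approach, each cross-checked by brute-force enumeration. Both scenarios are hypothetical: three fair dice all showing different faces (a permutation count), and drawing exactly 2 red balls when 3 are taken from a box of 4 red and 6 blue (a combination count). `math.perm` and `math.comb` need Python 3.8 or later.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb, perm

# Permutations: three fair dice all show different faces.
p_distinct = Fraction(perm(6, 3), 6 ** 3)
rolls = list(product(range(1, 7), repeat=3))
brute_distinct = Fraction(sum(1 for r in rolls if len(set(r)) == 3),
                          len(rolls))

# Combinations: exactly 2 red among 3 balls drawn from 4 red + 6 blue.
p_two_red = Fraction(comb(4, 2) * comb(6, 1), comb(10, 3))
balls = ["red"] * 4 + ["blue"] * 6
draws = list(combinations(range(len(balls)), 3))
brute_two_red = Fraction(
    sum(1 for d in draws if sum(balls[i] == "red" for i in d) == 2),
    len(draws))

print(p_distinct, brute_distinct)
print(p_two_red, brute_two_red)
```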
Random Variables and Distributions
As defined earlier, the sample space is the full list of “all that can happen” in a given experiment.
But are all outcomes equally likely?
The answer is: it depends.
As we know from everyday experience, some experiments—like flipping a fair coin or rolling a fair die—assign the same probability to each outcome, while in others certain outcomes carry more weight.
To capture how those weights are assigned—and how they change when we look at different features of the same experiment—we need the formal notion of a probability distribution.
In many problems, interest centers not on the raw outcomes themselves but on some numerical feature of those outcomes—what we call a random variable.
A Random Variable is simply a rule that assigns a number to every elementary outcome in the sample space.
By doing this, it becomes possible to talk about averages, variances and more, using the full toolbox of arithmetic and calculus.
The Probability Distribution of a random variable then tells us how likely each numerical value is to happen.
It does this by gathering together all the elementary outcomes that map to the same number (or fall into the same range) and adding up their probabilities. Even when every outcome in the sample space is equally likely—say, each face of a fair die—different choices of random variable (for example, the face value itself versus “even or odd,” or “number of sixes in two rolls”) will group those outcomes differently. As a result, each of those measurements has its own distinct distribution, reflecting the particular way it “reads” the experiment.
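The grouping step can be sketched as follows, assuming two rolls of a fair die so that all 36 outcomes are equally likely. The function `distribution` and the two random variables are names chosen for this illustration. Each random variable is an ordinary function from an outcome to a number, and its distribution is built by adding up the probabilities of the outcomes that map to the same value.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two die rolls.
outcomes = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, len(outcomes))


def distribution(random_variable):
    """Group outcomes by the value the random variable gives them."""
    dist = defaultdict(Fraction)          # missing keys start at 0
    for outcome in outcomes:
        dist[random_variable(outcome)] += p_outcome
    return dict(dist)


def number_of_sixes(outcome):
    return sum(1 for face in outcome if face == 6)


def total(outcome):
    return sum(outcome)


sixes_dist = distribution(number_of_sixes)
total_dist = distribution(total)

print(sixes_dist)
print(total_dist[7])
```

The same 36 outcomes produce two different distributions: "number of sixes" takes three values, while "total" takes eleven.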
At its heart, working with a probability distribution is simply about deciding how to spread out your “degree of belief” over everything that could happen, and then using that spread to answer questions about uncertainty.
* Assigning weight: You begin by giving each possible outcome a nonnegative number (its weight), in such a way that all the weights add up to one.
* Capturing uncertainty: Those weights encode exactly how confident you are in each outcome, from “almost impossible” to “almost certain.”
* Calculating what matters: Once the weights are set, you can systematically compute things like “how much total weight falls in this region of outcomes,” or “what’s the average we’d expect,” or “how wildly outcomes vary.”
* Guiding decisions: With those calculations in hand, you can compare different spreads of belief, choose actions that maximize your expected gain, or measure how risky a plan is.
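The four steps above can be sketched for a fair die. The weights, the region "roll of at least 5", and the bet (win 4 on a six, lose 1 otherwise) are all hypothetical values chosen for illustration.

```python
from fractions import Fraction

# Assigning weight: a fair die gives each face weight 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def expectation(pmf, g=lambda x: x):
    """Weighted average of g(X) under the distribution."""
    return sum(g(x) * p for x, p in pmf.items())


def variance(pmf):
    mean = expectation(pmf)
    return expectation(pmf, lambda x: (x - mean) ** 2)


# Calculating what matters.
tail = sum(p for x, p in pmf.items() if x >= 5)   # weight in a region
mean = expectation(pmf)                           # average we'd expect
var = variance(pmf)                               # how much outcomes vary

# Guiding decisions: expected gain of a hypothetical bet.
expected_gain = expectation(pmf, lambda x: 4 if x == 6 else -1)

print(tail, mean, var, expected_gain)
```

A negative expected gain suggests that, on average, this particular bet loses money.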
There are many different probability distributions—each with its own characteristic pattern—but they can be broadly classified into two main categories: discrete distributions, which assign probabilities to countable outcomes, and continuous distributions, which use density functions over intervals of real numbers.
Conditional Probability & Independence
Conditional probability is simply the chance of something happening once you already know something else has happened. It tells you how your outlook on an event shifts when you gain new information about another event's occurrence.
When you learn about conditional probability, you’re really seeing how knowing that one event happened (B) changes your “bet” on another event (A). That change (or lack of change) is exactly what we mean by dependence or independence:
Dependent events:
If knowing that B occurred does change your opinion about A, then A and B are dependent.
Or in simple words: “Once I see B happen, my chance of A goes up or down compared to what I thought before.”
Independent events:
If knowing that B occurred doesn’t change your opinion about the likelihood of A, then A and B are independent.
Equivalently, the fact that B happened gives you zero new information about A.
The multiplication rule for independent events is: P(A ∩ B) = P(A) · P(B)
The intuition behind it:
If two events don’t influence each other, the probability that both happen is just the product of their individual chances. In other words, to find the chance of A and B occurring together, you multiply P(A) by P(B).
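A quick way to see dependence and independence side by side is to test events on two rolls of a fair die. In this sketch (event choices and helper names are assumptions for illustration), "first die is even" turns out to be independent of "sum is 7" but dependent on "sum is 8".

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))   # 36 equally likely outcomes


def prob(event):
    return Fraction(len(event), len(omega))


def conditional(a, b):
    """P(A | B) = P(A ∩ B) / P(B); requires P(B) > 0."""
    return prob(a & b) / prob(b)


def independent(a, b):
    """Multiplication rule test: P(A ∩ B) == P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)


first_even = {o for o in omega if o[0] % 2 == 0}
sum_is_7 = {o for o in omega if sum(o) == 7}
sum_is_8 = {o for o in omega if sum(o) == 8}

print(prob(first_even), conditional(first_even, sum_is_7),
      independent(first_even, sum_is_7))
print(prob(first_even), conditional(first_even, sum_is_8),
      independent(first_even, sum_is_8))
```

Learning that the sum is 7 leaves the chance of an even first die unchanged, while learning that the sum is 8 shifts it, which is exactly the distinction described above.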
Probability Symbols Reference
Our Probability Symbols page delivers a systematic reference for notation used in probability theory and statistics. This collection serves as an essential guide for students and professionals working with statistical concepts.
The reference organizes symbols into practical categories including probability notations (P(A), P(A|B)), random variables and distributions (f_X(x), F_X(x)), and common distribution families (Bin(n,p), N(μ,σ²)). It extends to advanced topics like statistical measures (E(X), Var(X)), hypothesis testing parameters (H₀, α, p-value), and information theory metrics (H(X), I(X;Y)).
Specialized sections cover moment generating functions (M_X(t)), key probability inequalities (Markov's, Chebyshev's), Bayesian methods, and regression analysis notation—all presented with precise LaTeX formatting to support academic writing and research in probability and statistics.