

Normal Distribution






Normal Distribution: Natural Variability Around a Typical Value


Many real-world measurements arise from the accumulation of many small, independent influences acting together. The probabilistic experiment behind the normal distribution describes observing a quantity that fluctuates continuously around a typical or central value, with deviations in either direction becoming less likely as they grow larger. The random variable represents a measurement, not a count, and extreme values are possible but increasingly rare.



The Probabilistic Experiment Behind the Normal Distribution


The probabilistic experiment underlying the normal distribution arises when an outcome is shaped by many small, independent influences, none of which dominates the result. Each influence nudges the outcome slightly upward or downward, and the final value reflects the combined effect of all these contributions.

This experiment is not defined by repetition of identical trials, but by aggregation. The core assumption is that deviations occur naturally in both directions, are roughly symmetric, and tend to cancel out when added together. Extreme outcomes are possible, but increasingly unlikely because they require many influences to align in the same direction.

The normal distribution emerges as a consequence of stability: when numerous independent factors interact, the resulting variability concentrates around a central value. The spread reflects how strong those individual influences are, while the center reflects their average balance point.

This experiment explains why many natural and measurement-based quantities cluster around a typical value with gradual falloff on both sides.

Example:

Human height results from genetics, nutrition, environment, and random biological variation. No single factor determines the outcome, but together they produce values concentrated around an average, with fewer extremely short or tall individuals.

Notation


X \sim N(\mu, \sigma^2) — distribution of the random variable (variance notation).

X \sim \text{Normal}(\mu, \sigma^2) — alternative explicit form.

N(\mu, \sigma^2) — used to denote the distribution itself (not the random variable).

N(0, 1) — the standard normal distribution (\mu = 0, \sigma = 1).

Z \sim N(0, 1) — conventional notation for a standard normal random variable.

Note: Some texts use N(\mu, \sigma) with the standard deviation instead of the variance. Always check which convention is being used; statistical software often defaults to variance notation.


Parameters


μ (mu): mean or center of the distribution, where \mu \in \mathbb{R}

σ (sigma): standard deviation, measuring spread around the mean, where \sigma > 0

The normal distribution is fully characterized by these two parameters.

μ determines the location (where the peak sits on the number line), while σ controls the spread (how wide or narrow the bell curve is).

Variance is \sigma^2, but we typically use \sigma as the primary parameter since it is in the same units as the data.

Probability Density Function (PDF) and Support (Range)


Probability Density Function (PDF)


The probability density function (PDF) of a normal distribution is given by:

f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty


Intuition Behind the Formula


Bell-Shaped Curve: The normal distribution creates a symmetric, bell-shaped curve centered at \mu.

Parameters:
• \mu: the mean determines where the center of the bell sits
• \sigma: the standard deviation controls the width of the bell
• \sigma^2: the variance (\sigma squared) appears in the exponent

Support (Range of the Random Variable):
• The random variable X can take any real value: (-\infty, +\infty)
• While theoretically unbounded, approximately 99.7% of values fall within \mu \pm 3\sigma
• The support is the entire real line

Logic Behind the Formula:
• \frac{1}{\sigma\sqrt{2\pi}}: normalization constant ensuring the total area equals 1
• (x-\mu)^2: squared distance from the mean (makes the curve symmetric)
• e^{-\frac{(x-\mu)^2}{2\sigma^2}}: exponential decay as you move away from \mu
• The total area under the curve equals 1:

\int_{-\infty}^{\infty} f(x)\,dx = 1


Practical Example: Human heights follow approximately a normal distribution. If adult male heights have \mu = 175 cm and \sigma = 7 cm, then X \sim N(175, 49). The PDF tells us the relative likelihood of observing different heights, with the peak at 175 cm and decreasing probability as we move toward very short or very tall individuals.
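As a quick numerical check of the formula, here is a short Python sketch that evaluates the PDF directly and compares it against the standard library's statistics.NormalDist, using the height numbers from the example above:

```python
import math
from statistics import NormalDist

mu, sigma = 175.0, 7.0  # adult male heights in cm, as in the example

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Evaluate the normal PDF directly from the formula."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

heights = NormalDist(mu, sigma)

# The density peaks at the mean and matches the library implementation
peak = normal_pdf(mu, mu, sigma)
assert math.isclose(peak, heights.pdf(mu))

# Symmetry: the density is the same 10 cm above and below the mean
assert math.isclose(normal_pdf(165, mu, sigma), normal_pdf(185, mu, sigma))

print(round(peak, 4))  # ≈ 0.057
```

The peak value 1/(\sigma\sqrt{2\pi}) is the largest density any height can have; it is a density, not a probability, which is why it need not be below 1 for small \sigma.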

Normal (Gaussian) Distribution

Bell-shaped curve, symmetric around mean

Explanation

The normal distribution, also known as the Gaussian distribution, is the most important probability distribution in statistics. Its probability density function is f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, where \mu is the mean and \sigma is the standard deviation. The expected value is E[X] = \mu and the variance is \text{Var}(X) = \sigma^2. The normal distribution appears naturally in many phenomena due to the Central Limit Theorem. Applications include measurement errors, heights and weights in populations, IQ scores, financial returns, and any process that results from many small independent random effects.


Cumulative Distribution Function (CDF)



The cumulative distribution function (CDF) gives the probability that X is less than or equal to a specific value:

F(x) = P(X \leq x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(t-\mu)^2}{2\sigma^2}}\,dt


Standard Normal CDF: For the standard normal distribution N(0, 1), the CDF is traditionally denoted by \Phi(z):

\Phi(z) = P(Z \leq z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}}\,dt


Key Properties:
• F(-\infty) = 0 and F(+\infty) = 1
• F(\mu) = 0.5 (the mean is the 50th percentile)
• The CDF is strictly increasing and S-shaped
• Any normal CDF can be expressed using the standard normal: F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right)

Practical Use: The normal CDF has no closed-form expression, so we use tables, calculators, or software. To find P(X \leq 180) when X \sim N(175, 49), convert to standard normal: Z = (180 - 175)/7 \approx 0.714, then look up \Phi(0.714) \approx 0.762, meaning about 76.2% of values fall below 180.
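The worked example can be reproduced in a few lines of Python, where statistics.NormalDist.cdf plays the role of the z-table (a sketch, not part of the article's own tooling):

```python
from statistics import NormalDist

heights = NormalDist(mu=175, sigma=7)  # X ~ N(175, 49)

# P(X <= 180) straight from the CDF
p = heights.cdf(180)

# The same probability via standardization: Z = (X - mu) / sigma
z = (180 - 175) / 7
p_std = NormalDist().cdf(z)  # NormalDist() defaults to the standard normal N(0, 1)

print(round(z, 3), round(p, 3))  # ≈ 0.714 and ≈ 0.762
assert abs(p - p_std) < 1e-12
```

The agreement between the two computations is exactly the identity F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right) listed above.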

Normal Distribution CDF

Visualizing probability accumulation for normal (Gaussian) distribution

Normal (Gaussian) - CDF

S-shaped curve, steepest at mean

CDF Explanation

The cumulative distribution function (CDF) of the normal distribution is F(x) = \frac{1}{2}\left[1 + \text{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right], where erf is the error function. The CDF gives the probability P(X \leq x) that a normally distributed random variable is less than or equal to x. The S-shaped curve is symmetric around the mean \mu, where F(\mu) = 0.5. The curve is steepest at the mean and flattens out in the tails. About 68% of values fall within one standard deviation of the mean (F(\mu + \sigma) - F(\mu - \sigma) \approx 0.68), 95% within two standard deviations, and 99.7% within three standard deviations.

Expected Value (Mean)


The mean of a continuous random variable requires integrating across all possible values, with each value weighted by its density. For the normal distribution, this computation follows the standard continuous expected value formula:

E[X] = \int_{-\infty}^{\infty} x \cdot f(x) \, dx


Formula


E[X] = \mu


Where:
\mu = the location parameter of the distribution

Derivation and Intuition


The normal distribution has PDF:

f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}


Computing the expected value:

E[X] = \int_{-\infty}^{\infty} x \cdot \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx


Using the substitution z = \frac{x-\mu}{\sigma}, we get x = \mu + \sigma z and dx = \sigma \, dz:

E[X] = \int_{-\infty}^{\infty} (\mu + \sigma z) \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} \, dz


E[X] = \mu \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} \, dz + \sigma \int_{-\infty}^{\infty} z \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} \, dz


The first integral equals 1 (total probability of standard normal). The second integral equals 0 by symmetry (integrating an odd function over a symmetric interval).

Therefore: E[X] = \mu \cdot 1 + \sigma \cdot 0 = \mu

The result E[X] = \mu reveals a fundamental property of the normal distribution: the parameter \mu directly represents the mean. The distribution is perfectly symmetric around \mu, making it simultaneously the center of mass, the balance point, and the most likely region of values.

Example


Consider human heights modeled as normally distributed with \mu = 170 cm and \sigma = 10 cm:

E[X] = 170 \text{ cm}


The expected height is exactly 170 cm, which is the center of the distribution. Half of all heights fall above this value, and half fall below it.
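A simulation makes the balance-point interpretation concrete: sample means of normal draws settle at μ. A minimal sketch using only the Python standard library:

```python
import random
from statistics import fmean

random.seed(42)  # reproducible draws
mu, sigma = 170, 10

# Simulate 100,000 heights and compare the sample mean with mu
samples = [random.gauss(mu, sigma) for _ in range(100_000)]
sample_mean = fmean(samples)

print(round(sample_mean, 1))  # close to 170, within sampling error
```

With 100,000 draws the standard error of the mean is \sigma/\sqrt{n} \approx 0.03 cm, so the sample mean lands very close to 170.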

Variance and Standard Deviation


The variance of a continuous random variable quantifies the spread of values around the mean. For continuous distributions, it is calculated through integration:

\mathrm{Var}(X) = \mathbb{E}[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx


Alternatively, using the computational formula:

\mathrm{Var}(X) = \mathbb{E}[X^2] - \mu^2


For the normal distribution, this calculation yields a direct relationship with the distribution parameter.

Formula


\mathrm{Var}(X) = \sigma^2


Where:
\sigma = the scale parameter of the distribution

Derivation and Intuition


The normal distribution has PDF:

f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}


We know from the expected value section that \mu = E[X]. Computing the variance:

\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \, dx


Using the substitution z = \frac{x-\mu}{\sigma}, we get (x-\mu) = \sigma z and dx = \sigma \, dz:

\mathrm{Var}(X) = \int_{-\infty}^{\infty} (\sigma z)^2 \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} \, dz = \sigma^2 \int_{-\infty}^{\infty} z^2 \cdot \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}} \, dz


The integral equals 1 (this is the variance of the standard normal distribution).

Therefore: \mathrm{Var}(X) = \sigma^2 \cdot 1 = \sigma^2

The result \mathrm{Var}(X) = \sigma^2 reveals that the parameter \sigma directly controls the spread of the distribution. Just as \mu determines the center, \sigma^2 determines how dispersed values are around that center. A larger \sigma produces a wider, flatter bell curve; a smaller \sigma produces a narrow, peaked distribution.

Standard Deviation


\sigma


The standard deviation of the normal distribution is simply the parameter \sigma itself; no additional computation is needed. This makes interpretation straightforward: about 68% of values fall within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.

Example


Consider human heights modeled as normally distributed with \mu = 170 cm and \sigma = 10 cm:

\mathrm{Var}(X) = (10)^2 = 100 \text{ cm}^2


\sigma = 10 \text{ cm}


The variance of 100 cm² and standard deviation of 10 cm indicate that most heights cluster within 10 cm of the mean. We expect about 68% of heights to fall between 160 cm and 180 cm, and about 95% between 150 cm and 190 cm.
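The same kind of simulation verifies the spread: the sample variance of normal draws settles near \sigma^2 = 100. A sketch using only the Python standard library:

```python
import random
from statistics import stdev, variance

random.seed(0)  # reproducible draws
mu, sigma = 170, 10

samples = [random.gauss(mu, sigma) for _ in range(100_000)]

# Sample variance should be near sigma^2 = 100 cm^2,
# and the sample standard deviation near sigma = 10 cm
print(round(variance(samples), 1), round(stdev(samples), 2))
```

Note the units: the variance is in cm² while the standard deviation is back in cm, which is why \sigma is the more interpretable of the two.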

Mode and Median

Mode


The mode is the value where the probability density function reaches its maximum—the peak of the distribution curve.

For the normal distribution, the mode is:

\text{Mode} = \mu


Intuition: The normal PDF is:

f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}


The exponential term e^{-\frac{(x-\mu)^2}{2\sigma^2}} is maximized when the exponent equals zero, which occurs when (x-\mu)^2 = 0, giving x = \mu.

The normal distribution is perfectly symmetric around \mu, making this point simultaneously the center of mass, the highest point, and the balance point. No other value has higher probability density.

Example:
For human heights with \mu = 170 cm and \sigma = 10 cm:

Mode = 170 cm

This is the most common height—the peak of the bell curve. Heights become progressively less common as you move away from 170 cm in either direction.

Median


The median is the value m such that P(X \leq m) = 0.5—the point that divides the distribution's probability in half.

For the normal distribution, the median is:

\text{Median} = \mu


Intuition: Because the normal distribution is perfectly symmetric around \mu, exactly half the probability mass lies below \mu and half lies above. The CDF evaluated at \mu gives:

F(\mu) = 0.5


This symmetry means that mode = median = mean = \mu, a coincidence characteristic of symmetric unimodal distributions.

Example:
For human heights with \mu = 170 cm and \sigma = 10 cm:

Median = 170 cm

Half of all heights fall below 170 cm, and half fall above. This coincides with both the mean and mode.

Properties:
• For the normal distribution: mode = median = mean (all equal \mu)
• The parameter \sigma controls spread but doesn't affect the location of the mode or median
• This triple equality holds for symmetric unimodal distributions generally but is particularly important for the normal distribution
• Unlike skewed distributions, where mean, median, and mode diverge, the normal distribution's symmetry keeps them aligned

Quantiles/Percentiles


A quantile is a value that divides the distribution at a specific probability threshold. The p-th quantile x_p satisfies:

P(X \leq x_p) = p


where 0 < p < 1.

Percentiles are quantiles expressed as percentages: the k-th percentile corresponds to the quantile at p = k/100. For example, the 25th percentile is the 0.25 quantile, the 50th percentile is the median, and the 75th percentile is the 0.75 quantile.

Quantiles are found by inverting the CDF: if F(x_p) = p, then x_p = F^{-1}(p).

Finding Quantiles for the Normal Distribution


For a normal distribution with mean \mu and standard deviation \sigma, the p-th quantile is:

x_p = \mu + \sigma \cdot z_p


where z_p is the p-th quantile of the standard normal distribution (mean 0, standard deviation 1).

The standard normal quantiles z_p cannot be expressed in closed form and must be obtained from:
• Statistical tables (z-tables)
• Software functions (e.g., qnorm() in R, norm.ppf() in Python)
• Numerical approximations

Common Percentiles


25th Percentile (First Quartile, Q1):

x_{0.25} = \mu + \sigma \cdot z_{0.25} = \mu - 0.674\sigma


About 25% of values fall below this point.

50th Percentile (Median, Q2):

x_{0.50} = \mu + \sigma \cdot z_{0.50} = \mu


This is the median, dividing the distribution in half.

75th Percentile (Third Quartile, Q3):

x_{0.75} = \mu + \sigma \cdot z_{0.75} = \mu + 0.674\sigma


About 75% of values fall below this point.

Interquartile Range (IQR):

\text{IQR} = Q3 - Q1 = 1.349\sigma


The IQR contains the middle 50% of the distribution.

Example


For human heights with \mu = 170 cm and \sigma = 10 cm:

25th percentile: 170 + 10(-0.674) = 170 - 6.74 = 163.26 cm

25% of people are shorter than 163.26 cm.

50th percentile: 170 + 10(0) = 170 cm

Half of people are shorter than 170 cm (the median).

75th percentile: 170 + 10(0.674) = 170 + 6.74 = 176.74 cm

75% of people are shorter than 176.74 cm.

IQR: 176.74 - 163.26 = 13.48 cm

The middle 50% of heights span about 13.5 cm.
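These quartiles can be checked with statistics.NormalDist.inv_cdf, the Python standard library's quantile (inverse CDF) function; qnorm() in R would give the same numbers:

```python
from statistics import NormalDist

heights = NormalDist(mu=170, sigma=10)

q1 = heights.inv_cdf(0.25)  # first quartile
q2 = heights.inv_cdf(0.50)  # median
q3 = heights.inv_cdf(0.75)  # third quartile
iqr = q3 - q1

print(round(q1, 2), round(q2, 1), round(q3, 2), round(iqr, 2))
# ≈ 163.26  170.0  176.74  13.49
```

The exact IQR is 1.349\sigma \approx 13.49 cm; the 13.48 figure in the worked example comes from rounding z_{0.75} to 0.674 before multiplying.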

Other Notable Percentiles


90th percentile: x_{0.90} = \mu + 1.282\sigma (only 10% exceed this value)

95th percentile: x_{0.95} = \mu + 1.645\sigma (only 5% exceed this value)

99th percentile: x_{0.99} = \mu + 2.326\sigma (only 1% exceed this value)

These percentiles are commonly used in hypothesis testing and confidence interval construction.

Real-World Examples and Common Applications


The normal distribution appears throughout nature and human activity whenever many small, independent factors combine to produce a measurement.

Common Applications


Measurement and Physical Sciences:
• Human heights, weights, and other biological measurements
• Measurement errors in scientific instruments
• IQ scores and standardized test results
• Blood pressure readings in healthy populations

Finance and Economics:
• Asset returns over short time periods (approximately normal)
• Pricing models for options and derivatives
• Portfolio risk analysis
• Economic indicators like inflation rates

Quality Control and Manufacturing:
• Product dimensions and tolerances
• Production process variations
• Six Sigma methodologies
• Control chart limits

Natural Phenomena:
• Temperature variations around seasonal averages
• Rainfall amounts in many regions
• Particle velocities in gases (Maxwell-Boltzmann distribution)

Why It Appears


The Central Limit Theorem explains why the normal distribution is ubiquitous: when many independent random effects add together, their sum tends toward normal regardless of the individual distributions. This makes it the natural model for aggregate phenomena.

Example Application


A factory produces bolts with target diameter 10 mm. Due to manufacturing variation, actual diameters follow N(10, 0.1^2). Quality standards require diameters between 9.8 mm and 10.2 mm.

Using the normal distribution, we can calculate that approximately 95.4% of bolts meet specifications, helping determine acceptable defect rates and production costs.
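The 95.4% figure follows from two CDF evaluations, since the specification limits sit exactly at \mu \pm 2\sigma. A minimal check in Python:

```python
from statistics import NormalDist

diameters = NormalDist(mu=10, sigma=0.1)  # bolt diameters in mm

# Fraction of bolts inside the 9.8-10.2 mm specification (mu +/- 2 sigma)
in_spec = diameters.cdf(10.2) - diameters.cdf(9.8)

print(f"{in_spec:.1%}")  # ≈ 95.4%
```

Equivalently, about 4.5% of bolts fall outside specification, split evenly between too narrow and too wide by symmetry.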



Special Cases


The normal distribution exhibits several important special cases and limiting behaviors that connect it to other distributions and reveal its mathematical structure.

Standard Normal Distribution


When \mu = 0 and \sigma = 1, we obtain the standard normal distribution Z \sim N(0, 1):

f(z) = \frac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}


Any normal random variable X \sim N(\mu, \sigma^2) can be standardized:

Z = \frac{X - \mu}{\sigma} \sim N(0, 1)


This transformation is the foundation for z-scores, hypothesis testing, and normal probability tables.

Degenerate Case


As \sigma \to 0, the normal distribution converges to a point mass at \mu:

\lim_{\sigma \to 0} N(\mu, \sigma^2) = \delta_{\mu}


All probability concentrates at a single value, and the distribution becomes deterministic. This represents the limiting case of no variability.

As σ Increases


As \sigma \to \infty, the distribution spreads out indefinitely. The density becomes flatter and approaches zero everywhere, though it never becomes truly uniform over the real line (the total probability remains 1).

Limiting Behavior


Central Limit Theorem Connection:
The normal distribution emerges as the limit of many other distributions. For example:

• Binomial B(n, p) approaches N(np, np(1-p)) as n \to \infty with fixed p
• The sum of n independent identically distributed variables approaches normal (under mild conditions)
• Sample means from any distribution approach normal as the sample size grows

Sum of Normals:
If X_1 \sim N(\mu_1, \sigma_1^2) and X_2 \sim N(\mu_2, \sigma_2^2) are independent, then:

X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)


The normal distribution is closed under addition—sums of normal variables remain normal.
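Python's statistics.NormalDist models independent normal variables and supports adding two instances, which mirrors this closure property directly (the parameter values below are illustrative, not from the article):

```python
import math
from statistics import NormalDist

X1 = NormalDist(mu=3, sigma=2)  # N(3, 4)
X2 = NormalDist(mu=5, sigma=1)  # N(5, 1)

# Sum of independent normals: means add, variances add
S = X1 + X2

assert S.mean == 3 + 5
assert math.isclose(S.variance, 2**2 + 1**2)
print(S.mean, round(S.variance, 6))
```

Note that the standard deviations do not add; the library combines them as \sqrt{\sigma_1^2 + \sigma_2^2}, exactly as the formula above requires.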

Practical Implications


Measurement Precision:
In metrology, as measurement precision improves (\sigma decreases), measurements cluster more tightly around the true value. The standard normal case (\sigma = 1) provides a natural reference scale.

Model Robustness:
Many statistical procedures assume normality. When n is large (typically n > 30), the Central Limit Theorem justifies this assumption even when the underlying data aren't normal—this is why normal-based inference is so widely applicable.

Properties


The normal distribution possesses several distinctive mathematical properties that make it central to probability theory and statistics.

Symmetry


The normal distribution is perfectly symmetric around its mean \mu:

f(\mu + x) = f(\mu - x) \text{ for all } x


This symmetry implies:
• Mean = Median = Mode = \mu
• Odd central moments equal zero: E[(X - \mu)^{2k+1}] = 0
• The distribution is mirror-symmetric about the vertical line at x = \mu

Skewness


\text{Skewness} = 0


The skewness coefficient measures asymmetry. Zero skewness confirms perfect symmetry—there is no tendency toward either tail.

Kurtosis


\text{Kurtosis} = 3


\text{Excess Kurtosis} = 0


Kurtosis measures tail weight and peakedness. The normal distribution defines the baseline (kurtosis = 3), so excess kurtosis is zero by definition. Distributions with kurtosis > 3 have heavier tails than normal; kurtosis < 3 indicates lighter tails.

Tail Behavior


The normal distribution has exponentially decaying tails:

f(x) \sim e^{-\frac{(x-\mu)^2}{2\sigma^2}} \text{ as } |x - \mu| \to \infty


Probabilities in the tails decrease rapidly:
• P(|X - \mu| > 2\sigma) \approx 0.05 (5%)
• P(|X - \mu| > 3\sigma) \approx 0.003 (0.3%)
• P(|X - \mu| > 4\sigma) \approx 0.00006 (0.006%)

The tails are "thin"—extreme values become rare very quickly. Gaussian tails decay faster than any polynomial tail and even faster than exponential tails of the form e^{-c|x|}.

Unique Mathematical Properties


Closure Under Linear Transformation:
If X \sim N(\mu, \sigma^2), then for constants a and b:

aX + b \sim N(a\mu + b, a^2\sigma^2)


Closure Under Addition:
If X_1 \sim N(\mu_1, \sigma_1^2) and X_2 \sim N(\mu_2, \sigma_2^2) are independent:

X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)


Sums of independent normal variables remain normal—a rare property called stability or reproductive property.

Maximum Entropy:
Among all continuous distributions on (-\infty, \infty) with specified mean \mu and variance \sigma^2, the normal distribution has the highest entropy. It represents maximum uncertainty given only these two constraints.

Characteristic Function:
\phi(t) = E[e^{itX}] = e^{i\mu t - \frac{\sigma^2 t^2}{2}}


This simple exponential form makes analytical work tractable.

Useful Identities


68-95-99.7 Rule (Empirical Rule):
• Approximately 68% of values lie within \mu \pm \sigma
• Approximately 95% of values lie within \mu \pm 2\sigma
• Approximately 99.7% of values lie within \mu \pm 3\sigma

Moment Generating Function:
M(t) = E[e^{tX}] = e^{\mu t + \frac{\sigma^2 t^2}{2}}


Linear Combinations:
For independent normal variables X_i \sim N(\mu_i, \sigma_i^2) and constants a_i:

\sum_{i=1}^n a_i X_i \sim N\left(\sum_{i=1}^n a_i \mu_i, \sum_{i=1}^n a_i^2 \sigma_i^2\right)


Central Limit Theorem:
For i.i.d. random variables X_1, X_2, \ldots, X_n with mean \mu and variance \sigma^2:

\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1) \text{ as } n \to \infty


This convergence explains why the normal distribution appears so frequently in nature and statistics.
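A small simulation illustrates this convergence, assuming only the Python standard library: standardized means of uniform samples, which individually look nothing like a bell curve, behave like N(0, 1).

```python
import random
from statistics import fmean

random.seed(1)  # reproducible draws
n = 50          # observations per sample mean
trials = 20_000

# Uniform[0, 1) has mean 0.5 and variance 1/12
mu, sigma = 0.5, (1 / 12) ** 0.5

# Standardize each sample mean exactly as in the CLT statement
zs = [
    (fmean(random.random() for _ in range(n)) - mu) / (sigma / n ** 0.5)
    for _ in range(trials)
]

# For N(0, 1), about 95% of the mass lies within +/-1.96
frac = sum(abs(z) < 1.96 for z in zs) / trials
print(round(frac, 2))  # ≈ 0.95
```

Swapping the uniform draws for any other distribution with finite variance (exponential, Bernoulli, and so on) produces the same standard-normal behavior, which is the content of the theorem.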