Hypergeometric Distribution Explorer


Modify Parameters and See Results

Sampling without replacement from a finite population

Parameters

Population Size (N): 50
Total number of items in the population

Success States (K): 20
Number of items with the desired characteristic in the population

Number of Draws (n): 10
Number of items drawn without replacement

Proportion of Successes (K/N): 0.400
Proportion of success items in the population

Statistics

Expected Value: 4.0000
Variance: 1.9592
Std Deviation: 1.3997
Mode: 4

Real-World Applications

  • Drawing cards from a deck without replacement (e.g., number of aces in 5 cards)
  • Quality control: selecting items from a batch without replacement
  • Lottery: choosing winning numbers from a finite set
  • Sampling defective items from a production lot
  • Drawing colored balls from an urn without replacement

Setting Population Parameters

Adjust N (population size) to define the total number of items in your finite population. The slider accommodates populations from 5 to 100, suitable for quality control batches and sampling scenarios.

Set K (success states) to specify how many items in the population have the desired characteristic. This must be ≤ N and defines the proportion K/N of successes available.

Configure n (number of draws) for your sample size. This must be ≤ N and determines how many items you're selecting without replacement from the population.

Understanding Parameter Dependencies

The three parameters constrain one another, and the sliders enforce those constraints automatically. Increasing N raises the ceiling for both K and n. Decreasing N may force K or n down so that K ≤ N and n ≤ N still hold.
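The page does not show how the sliders enforce these constraints. As a minimal sketch, assuming simple clamping, the adjustment could look like the following. The function name and the lower bound of 0 for K and n are assumptions made for illustration; only the 5 to 100 range for N comes from the text above.

```python
def clamp_parameters(N, K, n, pop_min=5, pop_max=100):
    """Force (N, K, n) into a valid hypergeometric configuration.

    pop_min and pop_max mirror the slider range quoted in the text;
    the lower bound of 0 for K and n is an assumption.
    """
    N = max(pop_min, min(N, pop_max))  # keep N inside the slider range
    K = max(0, min(K, N))              # enforce K <= N
    n = max(0, min(n, N))              # enforce n <= N
    return N, K, n


# Shrinking N to 15 pulls K and n down with it.
print(clamp_parameters(15, 20, 30))
```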

The proportion p = K/N plays a crucial role similar to the binomial's p parameter. When the population is homogeneous (p near 0 or 1), the distribution concentrates at extreme values. Balanced populations (p near 0.5) create more spread-out distributions.

Watch how the support (the set of possible values) changes with the parameters. The minimum possible number of successes is max(0, n-(N-K)) and the maximum is min(n, K), so the range can be narrower than the binomial's 0 to n.
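The support bounds can be checked directly. A small sketch, with a hypothetical helper name:

```python
def support(N, K, n):
    """Return (lowest, highest) feasible success count for the given parameters."""
    lowest = max(0, n - (N - K))  # draws that cannot be covered by failures must be successes
    highest = min(n, K)           # cannot exceed the draws or the available successes
    return lowest, highest


print(support(50, 20, 10))  # (0, 10): full binomial-like range
print(support(50, 45, 10))  # (5, 10): only 5 failures exist, so at least 5 successes
print(support(50, 3, 10))   # (0, 3): only 3 successes exist
```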

Interpreting the PMF Visualization

The PMF bars show P(X = k) only for feasible values in the support. You might notice the display doesn't start at 0 if max(0, n-(N-K)) > 0, reflecting the fact that some success counts are impossible given the parameters.

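The distribution's shape depends on n and the ratio K/N. For fixed N and K the variance is proportional to n(N-n), so the spread is widest when n is near N/2. As n approaches N, the distribution concentrates around the expected value nK/N, and at n = N the count is exactly K. Small samples have a small absolute spread and behave almost like binomial draws.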

Compare the hypergeometric shape to binomial with p = K/N. They're similar when N is large relative to n, but hypergeometric shows less spread due to the finite population correction.

Using the CDF Display

The CDF steps only at feasible values in the support. The binomial has support from 0 to n, but the hypergeometric CDF may take its first step above k = 0 and reach 1 before k = n, depending on your parameters.

The CDF represents P(X ≤ k), crucial for quality control questions: "What's the probability of finding k or fewer defects in my sample?" The curve's steepness around the mean indicates sampling precision.

Notice how the finite population correction affects CDF shape. When sampling a large fraction of the population (n/N is significant), the CDF rises more steeply, showing less variability than the equivalent binomial.

Calculating Exact Probabilities

Enter k in the Point Probability calculator to compute P(X = k) using: [C(K,k) × C(N-K, n-k)] / C(N,n). The two coefficients in the numerator count the ways to select k successes and n-k failures, and the third normalizes by the total number of samples.

The denominator C(N,n) counts all possible samples of size n from population N. The numerator counts samples with exactly k successes: choose k from K successes and n-k from N-K failures.

Try N = 52, K = 13, n = 5 (drawing 5 cards from a deck, counting spades). P(X = 2) ≈ 0.274 gives the probability of exactly 2 spades, accounting for sampling without replacement.
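The formula translates directly into Python with math.comb. This is a sketch of the calculation, not the calculator's actual code; the function name is an assumption.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement."""
    # Values outside the support have probability zero.
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Card example from the text: 5 cards from a 52-card deck, counting the 13 spades.
print(round(hypergeom_pmf(2, 52, 13, 5), 4))  # 0.2743
```

Because math.comb works with exact integers, the only rounding occurs in the final division.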

Computing Cumulative and Range Probabilities

Use P(X ≤ k) to find the probability of k or fewer successes in your sample. This sums point probabilities over the range [min support, k], providing cumulative mass up to k.

P(X ≥ k) gives the tail probability of k or more successes, useful for acceptance sampling: "If k or more items are defective, reject the lot."

Range calculations P(a ≤ X ≤ b) answer questions about intervals: "What's the probability of finding between 3 and 7 items with the characteristic?" Four boundary options handle inclusion/exclusion of endpoints.
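Cumulative, tail, and range probabilities are all sums of point probabilities. A minimal sketch, assuming integer endpoints; the function names and the include_a / include_b flags are hypothetical stand-ins for the four boundary options:

```python
from math import comb


def pmf(k, N, K, n):
    """P(X = k), zero outside the support."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def prob_between(a, b, N, K, n, include_a=True, include_b=True):
    """P(a <= X <= b), with either endpoint optionally excluded."""
    lo = a if include_a else a + 1
    hi = b if include_b else b - 1
    return sum(pmf(k, N, K, n) for k in range(lo, hi + 1))


def cdf(k, N, K, n):
    """P(X <= k)."""
    return prob_between(0, k, N, K, n)


def tail(k, N, K, n):
    """P(X >= k)."""
    return prob_between(k, n, N, K, n)


N, K, n = 50, 20, 10
print(cdf(3, N, K, n), tail(4, N, K, n))  # the two values sum to 1
print(prob_between(3, 7, N, K, n))
print(prob_between(3, 7, N, K, n, include_a=False, include_b=False))
```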

What is the Hypergeometric Distribution?

The hypergeometric distribution models sampling without replacement from a finite population with two types of items (successes and failures). Each draw changes the population composition, creating dependence between draws.

Unlike the binomial distribution, where draws are independent, the hypergeometric accounts for depletion: each success drawn reduces the remaining successes, altering the probabilities for subsequent draws.
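The depletion effect is easy to see by tracking the success probability of the next draw. A small sketch using exact fractions; the helper name is an assumption:

```python
from fractions import Fraction


def next_success_prob(N, K, drawn, successes_drawn):
    """Probability that the next draw is a success, given what has been removed."""
    return Fraction(K - successes_drawn, N - drawn)


# N = 50, K = 20, matching the explorer's default parameters.
print(next_success_prob(50, 20, 0, 0))  # 2/5 before any draw
print(next_success_prob(50, 20, 1, 1))  # 19/49 after drawing one success
print(next_success_prob(50, 20, 1, 0))  # 20/49 after drawing one failure
```

Under the binomial model all three values would be 2/5.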

Applications include quality control sampling, card game probabilities, ecological sampling (capture-recapture), lottery odds, and poll accuracy analysis. For theoretical foundations and derivations, see the hypergeometric distribution theory page.

Hypergeometric vs Binomial

Use hypergeometric for sampling without replacement from small populations. Use binomial for sampling with replacement or when the population is large enough that depletion is negligible.

When N ≥ 10n (the population is at least 10 times the sample size), the binomial with p = K/N is a reasonable approximation to the hypergeometric. The approximation improves as N/n increases.

The key difference: binomial assumes constant probability p across trials, while hypergeometric adjusts probabilities after each draw. This distinction matters most when sampling a substantial fraction of the population (n/N > 0.1).
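One way to see the approximation improve is to measure the largest pointwise gap between the two PMFs while N grows and K/N stays fixed. This is an illustrative sketch; the function names and the choice of gap measure are assumptions.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k) when sampling without replacement."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def binom_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def max_abs_gap(N, K, n):
    """Largest absolute difference between the two PMFs over k = 0..n."""
    p = K / N
    return max(abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, p))
               for k in range(n + 1))


# Keep n = 10 and K/N = 0.4 fixed while the population grows.
for N in (50, 500, 5000):
    K = 2 * N // 5
    print(N, max_abs_gap(N, K, 10))
```

The printed gap shrinks as N grows, in line with the N ≥ 10n rule of thumb.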

Mean, Variance, and Finite Population Correction

The mean equals n(K/N) = np, matching the binomial mean. With N = 50, K = 20, n = 10, expect 4 successes on average.

The variance np(1-p)[(N-n)/(N-1)] includes the finite population correction factor (N-n)/(N-1). This is always ≤ 1, reducing variance below the binomial's np(1-p). The factor approaches 1 as N → ∞.
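The summary statistics can be reproduced from these formulas. In the sketch below the function name is an assumption, and the mode uses the standard closed form floor((n+1)(K+1)/(N+2)), which is not stated in the text above.

```python
from math import sqrt


def hypergeom_stats(N, K, n):
    """Return (mean, variance, standard deviation, mode). Requires N > 1."""
    p = K / N
    mean = n * p
    fpc = (N - n) / (N - 1)                # finite population correction
    variance = n * p * (1 - p) * fpc
    mode = ((n + 1) * (K + 1)) // (N + 2)  # standard closed form, an addition to the text
    return mean, variance, sqrt(variance), mode


mean, variance, sd, mode = hypergeom_stats(50, 20, 10)
print(f"{mean:.4f} {variance:.4f} {sd:.4f} {mode}")  # 4.0000 1.9592 1.3997 4
```

For N = 50, K = 20, n = 10 the correction factor is 40/49 ≈ 0.816, so the variance is about 18% below the binomial value of 2.4.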

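Sampling without replacement makes the draws negatively correlated: every success drawn leaves fewer successes for later draws, and every failure leaves fewer failures. Deviations from the expected proportion therefore tend to offset each other, which lowers the variance relative to independent sampling. In the extreme case n = N the sample is the whole population and the variance is zero.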

Related Distributions and Calculators

The binomial distribution is the limit as N → ∞ with K/N = p fixed. When population is large relative to sample, binomial provides a good approximation with simpler calculations.

In quality control, hypergeometric models acceptance sampling where you inspect n items from a lot of N, deciding acceptance based on defect count. Standards like MIL-STD-105 use hypergeometric principles.
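As an illustration of the acceptance sampling idea, the sketch below computes the probability that a lot is accepted when the rule is "accept if the sample contains at most c defects". The plan parameters (n = 10, c = 1) and the function name are hypothetical and are not taken from any standard.

```python
from math import comb


def accept_probability(N, D, n, c):
    """P(sample of n from a lot of N with D defects shows at most c defects)."""
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so infeasible defect counts contribute nothing to the sum.
    favorable = sum(comb(D, k) * comb(N - D, n - k)
                    for k in range(min(c, n) + 1))
    return favorable / total


# Lot of 50, sample 10, accept with at most 1 defect, for several true defect counts.
for D in (0, 2, 5, 10, 15):
    print(D, accept_probability(50, D, 10, 1))
```

Plotting this probability against D gives the operating characteristic curve of the plan.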

Related Tools:

Binomial Distribution Calculator - Sampling with replacement / large populations

Fisher's Exact Test - Contingency table analysis using hypergeometric

Sampling Theory Calculator - Sampling distributions and confidence intervals

Quality Control Calculators - Acceptance sampling plans