
The Hypergeometric Distribution

Lesson

The hypergeometric probability distribution is a discrete distribution that relates to sampling without replacement from a finite population.

Like the binomial distribution, it concerns outcomes that can be classed as either successes or failures.

Example 1

Consider a bucket containing $11$ coloured beads. Five beads are blue and six are pink. Three beads are to be drawn at random from the bucket and we will count a trial as a success whenever a blue bead is drawn.

Since there are to be three trials in this experiment, the number of blue beads drawn can be $0$, $1$, $2$ or $3$ and we want to know the probability of each of these events. This set of probabilities is what is meant by the probability distribution. We would like to find a formula that will produce the required probability for each event when the numbers of blue and pink beads in the bucket are known and the number of trials is given.

Initially, the probability of selecting a blue bead is $\frac{5}{11}$, but on the second trial the probability of selecting a blue bead will be either $\frac{5}{10}$ or $\frac{4}{10}$, depending on the colour of the first bead. We see that successive trials in this experiment are not independent.

We could make a tree diagram showing the probabilities for each branch in the three-step process. The leaves show the probabilities for each combination of outcomes. These are found by multiplying the probabilities along the relevant branches. The tree might look something like the following.

Now, by adding the relevant leaf probabilities, we conclude that
$P(0\ \text{blue beads})=\frac{4}{33}$
$P(1\ \text{blue bead})=\frac{5+5+5}{33}=\frac{15}{33}=\frac{5}{11}$
$P(2\ \text{blue beads})=\frac{4+4+4}{33}=\frac{12}{33}=\frac{4}{11}$
$P(3\ \text{blue beads})=\frac{2}{33}$
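
The leaf probabilities can also be checked by brute force. The following Python sketch is an illustration only: the bead labels `B` and `P` are hypothetical names. It lists every ordered draw of three beads, each of which is equally likely, and tallies how many blue beads appear in each draw.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Hypothetical labelling of the bucket in Example 1: 5 blue (B), 6 pink (P).
beads = ["B"] * 5 + ["P"] * 6

# Every ordered draw of 3 distinct beads is equally likely.
draws = list(permutations(range(len(beads)), 3))

# Count how many draws contain 0, 1, 2 or 3 blue beads.
counts = Counter(sum(beads[i] == "B" for i in draw) for draw in draws)

# Exact probability of each number of blue beads.
distribution = {x: Fraction(counts[x], len(draws)) for x in range(4)}

for x, p in distribution.items():
    print(f"P({x} blue) = {p}")
```

With only $990$ ordered draws this enumeration is quick, but like the tree diagram it grows rapidly as the number of beads or draws increases.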

This set of probabilities constitutes the probability distribution for the experiment. It is clear that constructing the tree diagram would become a very complicated process if there were more than three steps. The response of the mathematician must be to look for a more elegant general procedure.

 

 

We can use a counting argument to solve the problem.

Suppose there is a population of $N$ things of which $k$ are considered successes. Let $n$ selections be made randomly from the population, without replacement. The probability distribution is the set of probabilities $P(x)$ where $x$ is the number of successes in the selection. Since $x$ cannot exceed either $n$ or $k$, we have $x\in\left\{0,1,2,\ldots,\min(n,k)\right\}$.

Now, the $n$ things selected are to include $x$ successes chosen from the $k$ available and $n-x$ failures chosen from the $N-k$ available failures.

We use the binomial coefficient $\binom{n}{r}$ to mean 'the number of ways of choosing $r$ things from $n$'. It is calculated using factorials as $\binom{n}{r}=\frac{n!}{r!(n-r)!}$.
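
As a quick check of the factorial formula $\frac{n!}{r!(n-r)!}$, here is a minimal Python sketch. The function name `choose` is a hypothetical helper; `math.comb` is Python's built-in binomial coefficient (Python 3.8 or later).

```python
from math import comb, factorial

def choose(n, r):
    """Number of ways of choosing r things from n, via factorials."""
    return factorial(n) // (factorial(r) * factorial(n - r))

print(choose(6, 3))   # 20
print(choose(11, 3))  # 165

# The hand-written formula agrees with the built-in function.
print(choose(6, 3) == comb(6, 3))
```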

The number of ways of choosing $x$ successes and $n-x$ failures must be $\binom{k}{x}\binom{N-k}{n-x}$. But the number of ways of choosing $n$ things out of the population of $N$ things is $\binom{N}{n}$.

So, the probability of $x$x successes in a selection of $n$n things must be

$P(x)=\frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$.

 

Example 2

Test the validity of the formula against the probabilities derived in Example $1$.

The population is $N=11$
The number of available successes is $k=5$
The number of trials is $n=3$

We calculate

$P(0)=\frac{\binom{5}{0}\binom{11-5}{3-0}}{\binom{11}{3}}$
  $=\frac{\frac{5!}{0!(5-0)!}\times\frac{6!}{3!(6-3)!}}{\frac{11!}{3!(11-3)!}}$
  $=\frac{1\times20}{165}$
  $=\frac{4}{33}$

This matches the previous result.

In a similar way we calculate

$P(1)=\frac{\binom{5}{1}\binom{11-5}{3-1}}{\binom{11}{3}}=\frac{5\times15}{165}=\frac{5}{11}$

$P(2)=\frac{\binom{5}{2}\binom{11-5}{3-2}}{\binom{11}{3}}=\frac{10\times6}{165}=\frac{4}{11}$

$P(3)=\frac{\binom{5}{3}\binom{11-5}{3-3}}{\binom{11}{3}}=\frac{10\times1}{165}=\frac{2}{33}$
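
The hand calculations in this example can be reproduced with a short Python sketch of the formula. The function name `hypergeom_pmf` and its parameter order are assumptions made for illustration; the arithmetic uses exact fractions so that the results can be compared directly with the values above.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, k, n):
    """P(x) = C(k, x) * C(N - k, n - x) / C(N, n), as an exact fraction."""
    # x successes among n selections is impossible outside 0..n.
    if x < 0 or x > n:
        return Fraction(0)
    # comb returns 0 when asked to choose more items than are available.
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

# Example 1: N = 11 beads, k = 5 blue, n = 3 draws.
probs = [hypergeom_pmf(x, N=11, k=5, n=3) for x in range(4)]
print(probs)
print(sum(probs))  # the probabilities of a distribution sum to 1
```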

 

On many calculators, the binomial coefficients are expressed in the form $^nC_r$, which may be read as '$n$ choose $r$'.

 

 

 

 

 

Worked Examples

Question 1

Find $P\left(x\right)$ using the probability mass function for the hypergeometric distribution if $N=10$, $k=6$, $n=3$ and $x=0$.

Question 2

In a group of $11$ people, $5$ have blue eyes. If $4$ people are randomly selected from the group, what is the probability that all of them have blue eyes?

Question 3

The board of directors of a particular company consists of $6$ people who are randomly selected from a group of $8$ female and $9$ male candidates. Any board of directors selected must have at least one female on it in order to comply with gender equality laws in the country. What is the probability that a randomly selected board of directors will comply with gender equality laws?

Outcomes

12D.B.1.5

Recognize conditions (e.g., dependent trials) that give rise to a random variable that follows a hypergeometric probability distribution, calculate the probability associated with each value of the random variable, and represent the distribution numerically using a table and graphically using a probability histogram

12D.B.1.7

Solve problems involving probability distributions (e.g., uniform, binomial, hypergeometric), including problems arising from real-world applications
