7.07 Binomial distributions

Lesson

Binomial experiments

A binomial experiment is one where we have:

  • A fixed number of trials, $n$.
  • Only two possible outcomes on each trial (for example, heads or tails).
  • One outcome is called a success, with probability $p$, and the other is called a failure, with probability $q=1-p$.
  • The probability of each outcome is the same for each trial, and the trials are independent of one another.

The distinguishing feature of the binomial experiment is that we define one of the outcomes as the 'success' and we are interested in the probability of observing a given number of successes in a set number of trials. Each possible number of successes, together with its probability, forms a binomial distribution.

 

Is it a binomial distribution?

Case 1

Let's suppose that thirteen cards are to be drawn randomly, one-by-one, from a standard deck of $52$ playing cards and not replaced in the deck. (There are $26$ 'red' cards and $26$ 'black' cards in the deck.) The number of black cards drawn in total could be any number from zero to thirteen, and a probability is to be assigned to the occurrence of each possible total.

Can this experiment be modelled using a binomial random variable?

At the first trial, the probability of drawing a black card is $0.5$, but as the experiment progresses the probability of drawing a black card varies depending on how many black cards have already been drawn. If the first $12$ cards were all black, for example, then $14$ of the remaining $40$ cards are black, so the probability that the thirteenth card is black is $\frac{14}{40}$ or $0.35$.

We see that the trials are not independent: the probability of drawing a black card on each draw depends on what has happened in the previous draws. We can therefore conclude that this experiment is NOT best modelled by a binomial random variable.
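
To see this dependence numerically, here is a minimal Python sketch. The function name `prob_next_black` and its parameters are our own choices for illustration, not part of the lesson. It computes the probability that the next card is black, given how many black and red cards have already been drawn without replacement.

```python
from fractions import Fraction

def prob_next_black(black_drawn, red_drawn, black_total=26, red_total=26):
    """Probability that the next card is black when drawing without replacement."""
    black_left = black_total - black_drawn
    cards_left = (black_total + red_total) - (black_drawn + red_drawn)
    return Fraction(black_left, cards_left)

# First draw: nothing has been removed from the deck yet.
first = prob_next_black(0, 0)

# Thirteenth draw, after twelve black cards in a row.
thirteenth = prob_next_black(12, 0)

print(first, float(first))
print(thirteenth, float(thirteenth))
```

Because the two probabilities differ, the condition that the probability of success stays the same from trial to trial fails.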

Case 2

Let's suppose that we have to sit a multiple choice test in a subject that we have not studied, and will therefore need to guess the answer to each question on the paper, without being able to use any knowledge of the subject matter. Let's say the paper contains $12$ questions, each with $5$ possible answers.

Can the number of questions we guess correctly be modelled by a binomial random variable?

  • Firstly, consider guessing the answer to just one question. There are only two outcomes, either we guess correctly (a success) or we guess incorrectly (a failure).
  • Is answering each subsequent question on the test a repeat of this first trial? Yes, because each question has the same probability of success ($0.2$) and the same probability of failure ($0.8$).
  • Is each trial of this experiment (in our case, guessing the answer to each question) independent of the others? Yes, how we answered one question has no influence on how we answer any of the others.

Therefore, yes: the number of questions guessed correctly can be modelled by a binomial distribution.
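
As a rough check on this model, the guessing can be simulated. The sketch below is illustrative only: the function name, the number of simulated tests and the seed are all assumptions of ours. Each guess is treated as correct with probability $\frac{1}{5}$, independently of the others, so over many simulated tests the average number of correct answers should settle near $12\times0.2=2.4$.

```python
import random

def simulate_guessing(num_questions=12, num_options=5, num_tests=100_000, seed=1):
    """Average number of correct answers when every question is a pure guess."""
    rng = random.Random(seed)
    total_correct = 0
    for _ in range(num_tests):
        # Each guess is correct with probability 1/num_options,
        # independently of every other guess.
        total_correct += sum(
            rng.randrange(num_options) == 0 for _ in range(num_questions)
        )
    return total_correct / num_tests

average_correct = simulate_guessing()
print(average_correct)
```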

 

Defining a binomial random variable

In an experiment involving $n$ trials, there could be anywhere from $0$ to $n$ successes. As $p$ is the long-run proportion of successes over many trials, if the $n$ trials were repeated many times, we would expect the average number of successes to be $np$. This number is the mean of the binomial distribution.

The actual number of observed successes varies about this mean, giving rise to a variance of $np\left(1-p\right)$.

Suppose $k$ successes and $n-k$ failures are observed. We can count the number of ways this outcome can occur, namely $^nC_k$ or, in equivalent notation, $\binom{n}{k}$.

In fact, the probabilities for each possible number of successes correspond to the terms of the binomial expansion, which we explored in an earlier lesson.

If we have $n$ trials, each with a probability of success of $p$ and a probability of failure of $q$, we can look at the probability of each number of successes by examining the terms of the expansion of $(p+q)^n$. If we take $n=4$, $p=0.3$ and $q=0.7$, we can think of this as $\left(0.3+0.7\right)^4$.

The first term of the expansion aligns with the probability of $4$ successes as follows: $\binom{4}{0}0.3^4\left(0.7\right)^0$

The second term of the expansion aligns with the probability of $3$ successes as follows: $\binom{4}{1}0.3^3\left(0.7\right)^1$

The third term of the expansion aligns with the probability of $2$ successes as follows: $\binom{4}{2}0.3^2\left(0.7\right)^2$

The fourth term of the expansion aligns with the probability of $1$ success as follows: $\binom{4}{3}0.3^1\left(0.7\right)^3$

And the fifth term of the expansion aligns with the probability of $0$ successes as follows: $\binom{4}{4}0.3^0\left(0.7\right)^4$

Notice that instead of having $\binom{4}{1}0.3^3\left(0.7\right)^1$ for $3$ successes, we could have had $\binom{4}{3}0.3^3\left(0.7\right)^1$, since binomial coefficients are symmetric: $\binom{4}{1}=\binom{4}{3}$, which is the symmetry we see in each row of Pascal's triangle. In fact, this second form is exactly what we write when defining a binomial random variable, because the lower number in the coefficient then matches the number of successes.
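
The terms of this expansion can be checked with a few lines of Python. This is a sketch for illustration, using the standard library function `math.comb` for the coefficients; the variable names are our own.

```python
from math import comb

n, p, q = 4, 0.3, 0.7

# Term i of (p + q)^n is C(n, i) * p^(n - i) * q^i,
# which corresponds to n - i successes.
terms = [comb(n, i) * p ** (n - i) * q ** i for i in range(n + 1)]

for i, term in enumerate(terms):
    print(f"{n - i} successes: {term:.4f}")

# The terms add up to (0.3 + 0.7)^4 = 1, up to floating-point rounding.
print(sum(terms))

# Symmetry of the binomial coefficients: C(4, 1) equals C(4, 3).
print(comb(4, 1) == comb(4, 3))
```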

We can now calculate the probabilities associated with the outcomes of a binomial experiment. The probability of a particular instance of $k$ successes and $n-k$ failures must be $p^k\left(1-p\right)^{n-k}$. But, because there are $^nC_k$ ways in which this outcome can occur, we conclude that

$P\left(X=k\right)=\binom{n}{k}p^k\left(1-p\right)^{n-k}$

where $X$ is called a binomial random variable. It takes integer values from $0$ to $n$.
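
The formula translates directly into code. Below is a minimal sketch, in which the function name `binomial_pmf` is a hypothetical helper of ours and the values $n=12$, $p=0.2$ are borrowed from the guessing test in Case 2 purely as an illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Illustrative values: the guessing test with n = 12 and p = 0.2.
n, p = 12, 0.2
distribution = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(distribution):
    print(k, round(prob, 4))

# X takes integer values from 0 to n, so the probabilities should sum to 1
# (up to floating-point rounding).
print(sum(distribution))
```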

Key formulas

A binomial random variable $X$, with $n$ trials and a probability of success of $p$, can be defined as:

$P\left(X=k\right)=\binom{n}{k}p^k\left(1-p\right)^{n-k}$

The mean or expected value of $X$ is calculated by:

$E(X)=np$

The variance of $X$ is calculated by:

$Var(X)=np(1-p)$

And thus the standard deviation of $X$ is calculated by:

$StDev(X)=\sqrt{np(1-p)}$
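
As a sanity check on these formulas, the mean and variance can also be computed directly from the distribution by summing over all values of $k$. The sketch below assumes illustrative values $n=12$ and $p=0.2$; the variable names are our own.

```python
from math import comb, sqrt

n, p = 12, 0.2  # illustrative values only

# Build the distribution P(X = k) for k = 0, ..., n.
probs = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and variance computed directly as probability-weighted sums.
mean_direct = sum(k * pk for k, pk in enumerate(probs))
var_direct = sum((k - mean_direct) ** 2 * pk for k, pk in enumerate(probs))

# The same quantities from the key formulas.
mean_formula = n * p
var_formula = n * p * (1 - p)
sd_formula = sqrt(var_formula)

print(mean_direct, mean_formula)
print(var_direct, var_formula)
print(sd_formula)
```

The two approaches should agree up to floating-point rounding, which is why the short formulas are preferred in practice.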

 

Calculations with a binomial random variable

Once we have defined a situation as being best modelled by a binomial random variable, we can use the formula shown above to calculate various probabilities. We can also use technology such as our CAS calculator or graphics calculator to determine probabilities far more quickly.

 

Worked example

Suppose a binomial random variable $X$ is such that $n=14$ and $p=0.4$.

(a) Calculate $P(X=2)$

Think: Since we are finding the probability for exactly two successes, that is, an individual probability, we can either use the formula for a binomial random variable or we can use technology.

Do: Using the formula we have the following calculation:

$P\left(X=2\right)=\binom{14}{2}0.4^2\left(0.6\right)^{14-2}$
  $\approx 0.0317$

To use technology instead, we will use the Binomial PDF function found in the statistics or command section of our calculator. PDF stands for Probability Density Function; for a discrete variable like this one it is more precisely a probability mass function, and it gives the probability of exactly $k$ successes.
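
If a calculator is not at hand, the same calculation can be sketched in Python. The helper name `binom_pdf` and its argument order are assumptions of ours, chosen to resemble a calculator command; it is not a built-in function.

```python
from math import comb

def binom_pdf(n, p, k):
    """Probability of exactly k successes in n trials (hypothetical helper name)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Part (a): n = 14, p = 0.4, exactly 2 successes.
p_exactly_2 = binom_pdf(14, 0.4, 2)
print(round(p_exactly_2, 4))
```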

(b) Calculate $P(X<2)$

Think: Now we are considering the probability of fewer than two successes. Fewer than two successes means either one success or no successes. We thus need to add the probabilities $P(X=0)$ and $P(X=1)$. Again, we can do this using the formula or we can use technology.

Do: Using the formula we will have

$P\left(X=0\right)+P\left(X=1\right)=\binom{14}{0}0.4^0\left(0.6\right)^{14-0}+\binom{14}{1}0.4^1\left(0.6\right)^{14-1}$
  $\approx 0.00810$

 

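To use technology, we would use the Binomial CDF function, where CDF means Cumulative Distribution Function. A CDF gives the sum of the probabilities over a range of values. Since $X$ takes only whole-number values, $P(X<2)$ is the same as $P(X\le1)$, so the range we need is from $0$ to $1$.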
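
A cumulative probability of this kind can be sketched in Python by summing individual probabilities. The helper name `binom_cdf` is hypothetical and the function is a simple illustration, not a calculator's actual implementation.

```python
from math import comb

def binom_cdf(n, p, upper):
    """P(X <= upper): the sum of P(X = k) for k = 0, ..., upper (hypothetical helper name)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(upper + 1))

# Part (b): P(X < 2) equals P(X <= 1) because X takes whole-number values.
p_less_than_2 = binom_cdf(14, 0.4, 1)
print(round(p_less_than_2, 5))
```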

 

Practice questions

Question 1

$X$ is a binomial variable with the probability mass function:

$P\left(X=k\right)=\binom{4}{k}\times\left(0.4\right)^k\times\left(0.6\right)^{4-k}$ for $k=0,1,2,3,4$.

  1. What is the number of trials for this distribution?

  2. What is the probability of success?

  3. Which of the following is the probability $P\left(X=2\right)$?

    A. $\binom{4}{2}\times\left(0.6\right)^2\times\left(0.4\right)^{4-2}$

    B. $\binom{4}{2}\times\left(0.4\right)^2\times\left(0.6\right)^{4-2}$

    C. $\binom{2}{4}\times\left(0.6\right)^4\times\left(0.4\right)^{4-2}$

    D. $\binom{4}{2}\times\left(0.4\right)^2\times\left(0.4\right)^{4-2}$

  4. How many ways can we get $2$ successes in the $4$ trials?

  5. Calculate the probability $P\left(X=2\right)$.

Question 2

$X$ is a binomial variable with the probability mass function:

$P\left(X=k\right)=\binom{5}{k}\times\left(0.2\right)^k\times\left(0.8\right)^{5-k}$ for $k=0,1,2,3,4,5$.

  1. What is the number of trials for this distribution?

  2. What is the probability of success?

  3. Calculate the probability $P\left(X>3\right)$.

  4. What is the most likely number of successes?

  5. What is the mean of the distribution?

  6. What is the variance of the distribution?

Question 3

$E\left(X\right)=4.8$ and $\sigma\left(X\right)=\sqrt{2.88}$ for a binomial random variable $X$.

  1. Find the probability of success $p$.

  2. Find the number of trials $n$.
