16.02 Binomial distributions

Lesson

Bernoulli distribution

When an experiment can have either of two possible outcomes, usually called success and failure, it gives rise to a Bernoulli random variable. We assign the values $X=1$ and $X=0$ to a Bernoulli random variable $X$ according to whether a trial of the experiment results in a success or a failure. We also assign probabilities $p$ and $q$ to the two outcomes.

Thus, we write $P\left(X=1\right)=p$ and $P\left(X=0\right)=q=1-p$.

The expected value or mean of the Bernoulli random variable $X$ may be thought of informally as the average amount of 'success' per trial over a very large number of trials. This is just $p$, and we write $\mu_X=p$ or $E(X)=p$.

If we experiment and calculate the amount of 'success' per trial over just a few trials, we will quite likely obtain a value different from $p$. By doing this repeatedly, we obtain a spread of values centered around the mean, $p$. This spread of values is what is meant by the variance of the random variable $X$. Using the definition of variance, we write $Var(X)=E\left[(X-\mu_X)^2\right]$ and evaluate this, using $\mu_X=p$, $1-p=q$ and $p+q=1$, as

$Var(X)=p(1-\mu_X)^2+q(0-\mu_X)^2=pq^2+qp^2=pq(p+q)=pq$

Thus, a Bernoulli random variable has mean $\mu_X=p$ and variance $Var\left(X\right)=p\left(1-p\right)$.
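The mean and variance of a Bernoulli random variable can be checked numerically straight from the definitions. The following is a minimal Python sketch; the function name `bernoulli_mean_var` and the value $p=0.3$ are illustrative assumptions, not part of the lesson.

```python
def bernoulli_mean_var(p):
    """Return (mean, variance) of a Bernoulli variable, computed from the definitions."""
    q = 1 - p
    outcomes = {1: p, 0: q}  # value of X -> probability of that value
    # E(X) is the probability-weighted sum of the values.
    mean = sum(x * prob for x, prob in outcomes.items())
    # Var(X) is the probability-weighted sum of squared deviations from the mean.
    var = sum(prob * (x - mean) ** 2 for x, prob in outcomes.items())
    return mean, var


# Illustrative value of p (an assumption, chosen only for the demonstration).
mean, var = bernoulli_mean_var(0.3)
print(mean, var)  # the variance should agree with p * (1 - p) up to rounding
```

For any $p$ between $0$ and $1$, the second number printed should match $p(1-p)$ up to floating-point rounding.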

 

Binomial Distribution

We are often interested in strings of independent Bernoulli trials. The distinguishing feature of the Binomial distribution is that we are interested in the probability of observing each possible number of successes in a string of Bernoulli trials. 

In an experiment involving $n$ trials, there could be anywhere from $0$ to $n$ successes. As $p$ is the long-run proportion of successes over many trials, if the $n$ trials were to be repeated many times, we would expect the number of successes, on average, to be $np$, and this number is the mean of the binomial distribution.

The actual number of observed successes varies about this mean, giving rise to a variance $np\left(1-p\right)$, which you should compare with the variance for the Bernoulli distribution.

Suppose $r$ successes are observed, and $n-r$ failures. We can count the number of ways this outcome can occur, namely $^nC_r$ or, in equivalent notation, $\binom{n}{r}$. From the theory of combinatorics, we know that this is evaluated by $^nC_r=\frac{n!}{r!\left(n-r\right)!}$.

The numbers $^nC_r$ are the same as the coefficients that arise in the expansion of the binomial expression $\left(a+b\right)^n$. Hence the name binomial distribution.
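As a quick check of the counting formula, the coefficients for a small $n$ can be listed and compared with the factorial expression. This is a minimal Python sketch using the standard library's `math.comb`; the function names and the choice $n=4$ are illustrative assumptions.

```python
from math import comb, factorial


def binomial_coefficients(n):
    """Return the list of nCr for r = 0, 1, ..., n."""
    return [comb(n, r) for r in range(n + 1)]


def ncr_from_factorials(n, r):
    """Evaluate nCr as n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))


print(binomial_coefficients(4))  # [1, 4, 6, 4, 1]
# The two ways of computing the coefficients agree for every r.
print(all(comb(4, r) == ncr_from_factorials(4, r) for r in range(5)))  # True
```

The list for $n=4$ matches the coefficients of $a^4+4a^3b+6a^2b^2+4ab^3+b^4$.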

We can now calculate the probabilities associated with the outcomes of a binomial experiment. The probability of a particular instance of $r$ successes and $n-r$ failures must be $p^r\left(1-p\right)^{n-r}$. But, because there are $^nC_r$ ways in which this outcome can occur, we conclude that

$P\left(N=r\right)=\binom{n}{r}p^r\left(1-p\right)^{n-r}$

where $N$ is called a binomial random variable. It takes integer values from $0$ to $n$.
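The probability formula translates almost directly into code. Below is a minimal Python sketch; the function name `binomial_pmf` and the values $n=5$, $p=0.5$ are illustrative assumptions.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(N = r) for a binomial random variable with n trials and success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Illustrative parameters (assumptions): 5 trials, success probability 0.5.
probs = [binomial_pmf(r, 5, 0.5) for r in range(6)]
print(probs)
print(sum(probs))  # the probabilities over r = 0..n should add up to 1
```

Because $N$ must take exactly one of the values from $0$ to $n$, the probabilities always sum to $1$, which makes a useful sanity check on any implementation.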

 

Although it may not be strictly true, we assume for the sake of this example that the occurrence of rain on a given day over a thirty-day period is independent of the weather on the preceding and following days. Suppose that according to historical records the probability of rain on any day in April is $0.2$.

The mean number of rainy days in April is $np=30\times0.2=6$. However, in the most recent month of April, there were $10$ rainy days. The variance is $np\left(1-p\right)=30\times0.2\times0.8=4.8$, and we might wonder how unlikely it is to get a number of rainy days this far or further away from the mean.

The probability of getting exactly the mean number of rainy days is $\binom{30}{6}\times0.2^6\times0.8^{24}=0.179$ to three decimal places.

The probability of getting exactly ten days of rain is $\binom{30}{10}\times0.2^{10}\times0.8^{20}=0.035$ to three decimal places.

We could calculate the probability of observing at least $10$ days of rain by first calculating the probabilities of exactly $0$, $1$, $2$, $3$, $4$, $5$, $6$, $7$, $8$, and $9$ days of rain. The sum of these is the probability of seeing fewer than $10$ days of rain, and the number we want is one minus this amount.

Working through this calculation, we can check that the probability of observing $10$ or more rainy days would be $0.061$ to three decimal places. So, the observed event is not easily explained as a random fluctuation.
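The figures in the rain example can be reproduced with a short script. This is a minimal Python sketch under the example's own assumptions ($n=30$ independent days, $p=0.2$); the function and variable names are illustrative.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(N = r) for n independent trials with success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


n, p = 30, 0.2  # thirty days in April, probability of rain 0.2 on each day

mean = n * p
variance = n * p * (1 - p)

p_six = binomial_pmf(6, n, p)   # exactly the mean number of rainy days
p_ten = binomial_pmf(10, n, p)  # exactly ten rainy days

# P(at least 10) = 1 - P(fewer than 10) = 1 - [P(0) + P(1) + ... + P(9)]
p_at_least_ten = 1 - sum(binomial_pmf(r, n, p) for r in range(10))

print(round(mean, 3), round(variance, 3))
print(round(p_six, 3), round(p_ten, 3), round(p_at_least_ten, 3))
```

The rounded probabilities should agree with the values $0.179$, $0.035$ and $0.061$ quoted in the example.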

Practice questions

QUESTION 1

Find the value of ${}^5C_4\times\left(0.1\right)^4\times0.9+{}^5C_5\times\left(0.1\right)^5\times\left(0.9\right)^0$.

QUESTION 2

Census data show that $80\%$ of the population in a particular country have brown eyes.

A random sample of $900$ people is selected from the population.

  1. What is the mean number of people in the sample who have brown eyes?

  2. What is the standard deviation of the number of people in the sample who have brown eyes?

Now that we know a little about Bernoulli trials and the binomial distribution, let's take a more visual look at the distribution and make the most of a graphing calculator.

The graph of the binomial distribution

We'll begin by interacting with the applet below to get a feel for how different values of $n$ and $p$ affect the distribution of our probabilities for a binomial distribution.

Remember!

$n$ is the number of trials of a Bernoulli experiment (an experiment with only two outcomes, a success or a failure).

$p$ is the probability of success of each trial, and each trial is independent.

  1. Begin by setting the applet to $n=10$ and $p=0.5$.
  2. How would you describe the distribution of the graph you see? (Remember, when describing the shape of a histogram we use the phrases positively skewed, symmetrical and negatively skewed.)
  3. Keeping $p=0.5$, change the value of $n$. Does your description of the distribution stay the same? (In both cases, with $p=0.5$, we should see that the graphs are symmetrical. This makes a lot of sense! A value of $p=0.5$ indicates an equal probability of success and failure, so you'd expect symmetry.)
  4. Now set the applet to $n=10$ and slide the $p$ value to the left and to the right.

Guided questions

  1. As the probability of success decreases, what happens to the shape of the distribution?
  2. As the probability of success increases, what happens to the shape of the distribution?

Answers

  1. The distribution becomes positively skewed with a tail to the right
  2. The distribution becomes negatively skewed with a tail to the left

Now slide the $n$ and $p$ values around and confirm that those findings about the shape of the distribution hold for all scenarios.
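If the applet is not available, a rough text version of the same exploration can be produced in a few lines. This is a minimal Python sketch, not a reproduction of the applet; the function names, the bar width and the three values of $p$ are illustrative assumptions.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(N = r) for n independent trials with success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def text_histogram(n, p, width=50):
    """Return one text line per value of r, with a bar scaled to the tallest probability."""
    probs = [binomial_pmf(r, n, p) for r in range(n + 1)]
    peak = max(probs)
    lines = []
    for r, prob in enumerate(probs):
        bar = "#" * round(width * prob / peak)
        lines.append(f"{r:2d} {prob:.3f} {bar}")
    return lines


# Illustrative settings (assumptions): n = 10 with a small, a middle and a large p.
for p in (0.2, 0.5, 0.8):
    print(f"n = 10, p = {p}")
    print("\n".join(text_histogram(10, p)))
    print()
```

Reading each chart from top to bottom corresponds to reading the applet's graph from left to right: a long run of short bars at the bottom of the chart is a tail to the right, and a long run at the top is a tail to the left.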

 

Graphing the Binomial Distribution Using Your CAS Calculator

Let's look through a series of screenshots to do these problems using the TI-Nspire. You can use your own graphing calculator or an online graphing calculator.

First, through the menu we select Statistics and Distribution and select BinomialPdf to calculate $P(X=5)$ when $n=8$ and $p=0.3$.

To calculate the cumulative probability $P(1\le X\le3)$, we select BinomialCdf instead.
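The same two calculations can be cross-checked without a calculator. The following is a minimal Python sketch; `binom_pdf` and `binom_cdf` are hypothetical helper names chosen to echo the calculator commands, and they are not the TI-Nspire functions themselves.

```python
from math import comb


def binom_pdf(n, p, x):
    """P(X = x) for n trials with success probability p (hypothetical helper)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(n, p, lower, upper):
    """P(lower <= X <= upper), found by adding the individual probabilities."""
    return sum(binom_pdf(n, p, x) for x in range(lower, upper + 1))


n, p = 8, 0.3
print(round(binom_pdf(n, p, 5), 4))     # P(X = 5)
print(round(binom_cdf(n, p, 1, 3), 4))  # P(1 <= X <= 3)
```

A cumulative probability is just a sum of point probabilities over the requested range, which is why the second helper is built from the first.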

 

Problems with binomial distributions

To answer questions involving the binomial probability distribution, we need the technical information about the distribution that was presented in an earlier chapter.

Situations that may be modeled by a binomial distribution are those in which there are a number of independent trials of the same experiment and the individual outcomes are either success or failure. We are interested in the probability that there will be some number $r$ of successes among the $n$ trials of the experiment.

For example, the 'experiment' might be a survey in which the respondents can answer either yes or no. Or, it might be a chicken-hatching experiment with a batch of $n$ eggs in which individuals eventually either hatch or fail to hatch.

 

Worked example

Question 3

Continuing with the chicken-hatching example, we might know from previous observations that the probability of an egg hatching is $p$. We would expect $np$ chickens to hatch and would not be surprised to see some variation around this mean value. The variance, $np\left(1-p\right)$, and the standard deviation, which is the square root of the variance, give an indication of how much variability to expect.

If a particular batch of eggs had a hatch rate that was more than, say, one standard deviation away from the mean, we might wonder whether something unusual had happened with the process.

Suppose the usual hatch rate is $60\%$ and a batch of $40$ eggs is being incubated. The mean or expected number of chicks would be $40\times0.6=24$.

The standard deviation in this case is $\sqrt{40\times0.6\times0.4}\approx3.1$. We might be alarmed if fewer than $21$ of the eggs hatched.

The probability of hatching exactly $20$20 chicks can be found using the binomial probability formula 

$P(N=r)=\binom{n}{r}p^r\left(1-p\right)^{n-r}$

In this case, we have $P(N=20)=\binom{40}{20}\times0.6^{20}\times\left(1-0.6\right)^{40-20}$, which evaluates to approximately $0.055$. We could calculate in the same way all the probabilities from zero up to $20$ and add these up to find the probability of obtaining fewer than $21$ chicks. (This is quite tedious and is best done by machine. Use of a spreadsheet application is one solution.)

In this way, we find that the probability of $20$ or fewer chicks hatching is $0.13$, while the probability of between $21$ and $27$ hatching (that is, a number within a one-standard-deviation range about the mean) is $0.74$.
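A script is one alternative to a spreadsheet for the tedious summation. This is a minimal Python sketch using the example's values ($n=40$, $p=0.6$); the function and variable names are illustrative assumptions.

```python
from math import comb, sqrt


def binomial_pmf(r, n, p):
    """P(N = r) for n independent trials with success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def prob_between(lo, hi, n, p):
    """P(lo <= N <= hi), adding the formula term by term."""
    return sum(binomial_pmf(r, n, p) for r in range(lo, hi + 1))


n, p = 40, 0.6  # 40 eggs, usual hatch rate 60%

mean = n * p
sd = sqrt(n * p * (1 - p))

p_exactly_20 = binomial_pmf(20, n, p)
p_at_most_20 = prob_between(0, 20, n, p)
p_21_to_27 = prob_between(21, 27, n, p)

print(round(mean, 1), round(sd, 1))
print(round(p_exactly_20, 3), round(p_at_most_20, 2), round(p_21_to_27, 2))
```

The printed values can be compared with the mean of $24$, the standard deviation of about $3.1$, and the probabilities $0.055$, $0.13$ and $0.74$ quoted in the example.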

Practice questions

Question 3

Which of the following statements is true?

  A. In a binomial experiment, there are $n$ repeated, dependent trials.

  B. In a binomial experiment, there are $n$ repeated, independent trials.

  C. In a binomial experiment, there are exactly two trials.

  D. In a binomial experiment, there is exactly one trial.

Question 4

A particular nationwide numeracy test has a failure rate of $30\%$.

If you randomly selected $100$ students from across the country to do the test, how many would you expect to pass?

Question 5

A multiple choice test has $15$ questions in total. Each question has $6$ options, only one of which is correct.

A student has not studied at all for the test and will have to guess all the answers.

Calculate each of the following probabilities, rounding your answers to four decimal places:

  1. Determine the probability the student answers exactly $4$ questions correctly.

  2. Determine the probability the student answers at most $4$ questions correctly.

  3. Determine the probability the student answers between $2$ and $6$ questions inclusive correctly.

  4. Given that the student answered at least $2$ questions correctly, what is the probability they answered fewer than $6$ correctly?

 

Features of binomial distributions

When some proportion $\theta$ of a population is observed to have a particular characteristic, we tend to use our relative frequency understanding of probability to conclude that the probability of a randomly chosen individual in the population having the characteristic is $\theta$.

In an experiment or observational study that we believe it appropriate to model with a binomial probability distribution, we may already know from previous work that the probability of a success on each trial is $\theta$ and we may have planned to perform $n$ trials in the experiment. It is natural to interpret the probability $\theta$ in the relative frequency sense, as a proportion, and to conclude that the expected number of successes over the $n$ trials is $n\theta$.

This is indeed the expected value or mean of the distribution, which is also presented in the chapter on the binomial distribution and derived in a different way in the chapter on the Bernoulli distribution.

If we let $X$ be a random variable that represents the number of successes in an experiment in which there are $n$ independent trials or observations, then for the expected or average value of $X$ we write

$E\left(X\right)=n\theta$

where $\theta$ is the probability of success on each trial.
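The result $E(X)=n\theta$ can be confirmed for particular values by weighting each possible number of successes by its probability. This is a minimal Python sketch; the function names and the values $n=12$, $\theta=0.25$ are illustrative assumptions.

```python
from math import comb


def binomial_pmf(r, n, theta):
    """P(X = r) for n independent trials with success probability theta."""
    return comb(n, r) * theta**r * (1 - theta) ** (n - r)


def expected_successes(n, theta):
    """E(X) from the definition: the sum of r * P(X = r) over r = 0..n."""
    return sum(r * binomial_pmf(r, n, theta) for r in range(n + 1))


# Illustrative parameters (assumptions): 12 trials, success probability 0.25.
n, theta = 12, 0.25
print(expected_successes(n, theta))  # should agree with n * theta up to rounding
print(n * theta)
```

The agreement depends on the trials being independent with the same $\theta$ each time, since that is what the probability formula inside `binomial_pmf` assumes.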

Caution is needed, however, in proceeding from a general observation of a relative frequency to a conclusion about the probabilities of the various possible numbers of successes in an experiment, as the following example illustrates. The binomial distribution is not always applicable.

Worked example

Question 6

Suppose it has been observed that each year in the state of Victoria, $23\%$ of people in the $20$ to $55$ age range experience a mild to serious attack of the sniffles in the month of July.

Using the relative frequency idea of probability, we could argue that on average an individual in the population has a probability of $0.23$ of contracting the sniffles.

If we let $Y$ be the random variable whose values are the possible numbers of people who get the sniffles, we would be justified in saying that $E\left(Y\right)=0.23n$, where $n$ is the number of people in the population.

However, we would not be justified in using the proportion $0.23$ as the parameter in a binomial distribution, to calculate the probability that $E\left(Y\right)$ cases would actually occur or that any other particular number of cases would occur.

The binomial model is unreliable when the trials are not independent. In this case, some members of the population would be at greater risk of contracting the sniffles than others due to possible contagion in work and family environments, pre-existing health conditions or exposure capacity. That is, the probability is not the same for every observation.

Thus, in a subset of, say, $50$ people, it would be risky to predict $50\times0.23=11.5$ cases even though this is the number that would be expected as a long-term average. In the case of epidemics, a probability distribution with a different shape may be more appropriate.

Shape

The shape of a binomial distribution depends on the value of the probability parameter. In all cases, the most probable number of successes lies within one unit of the expected value. The following diagrams show the probabilities of $0$ to $25$ successes in $25$ independent trials, for three different values of the probability parameter.

The first graph, with parameter $0.15$, is said to be skewed to the right; the second graph, with parameter $0.5$, is symmetrical about the mean; and the third graph, with the largest probability parameter, is skewed to the left.
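The direction of the skew can also be measured rather than judged by eye, using the standardized third moment of the distribution: positive for a tail to the right, zero for symmetry, negative for a tail to the left. This is a minimal Python sketch; the function names are illustrative, and the value $0.85$ for the third parameter is an assumption, since the parameter used in the third graph is not stated.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(N = r) for n independent trials with success probability p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def skewness(n, p):
    """Standardized third moment E[(N - mean)^3] / sd^3, computed from the probabilities."""
    probs = [binomial_pmf(r, n, p) for r in range(n + 1)]
    mean = sum(r * prob for r, prob in enumerate(probs))
    var = sum(prob * (r - mean) ** 2 for r, prob in enumerate(probs))
    third = sum(prob * (r - mean) ** 3 for r, prob in enumerate(probs))
    return third / var**1.5


# 0.15 and 0.5 come from the text; 0.85 is an assumed stand-in for the third graph.
for p in (0.15, 0.5, 0.85):
    print(p, round(skewness(25, p), 3))
```

A positive value corresponds to 'skewed to the right' and a negative value to 'skewed to the left' in the sense used above.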

 

Practice questions

Question 7

A certain disease has a survival rate of $64\%$. Of the next $110$ people who contract the disease, how many would you expect to survive? Round your answer to the nearest whole number.

Question 8

A subject exam consists of $48$ multiple choice questions. Each question has $4$ options, of which $1$ is correct. Neville guessed the answers to all of the questions.

How many questions would he expect to get correct?
