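A binomial experiment is one where we have a fixed number of trials, exactly two possible outcomes on each trial (a 'success' and a 'failure'), the same probability of success on every trial, and trials that are independent of one another.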
The distinguishing feature of the binomial experiment is that we define one of the outcomes as the 'success' and we are interested in the probability of observing a given number of successes in a set number of trials. The possible numbers of successes, together with their respective probabilities, form a binomial distribution.
Let's suppose that thirteen cards are to be drawn randomly, one by one, from a standard deck of $52$ playing cards and not replaced in the deck. (There are $26$ 'red' cards and $26$ 'black' cards in the deck.) The number of black cards drawn in total could be any number from zero to thirteen, and a probability is to be assigned to each possible total.
Can this experiment be modelled using a binomial random variable?
At the first trial, the probability of drawing a black card is $0.5$, but as the experiment progresses the probability of drawing a black card varies depending on how many black cards have already been drawn. If the first $12$ cards were all black, for example, then only $14$ black cards remain among the $40$ cards left, so the probability that the thirteenth card is black is $\frac{14}{40}$ or $0.35$.
We see that the trials are not independent: the probability of success at each draw depends on what has happened in previous draws. We can therefore conclude that this experiment is NOT best modelled by a binomial random variable.
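To see how the probability of success shifts from draw to draw, here is a minimal Python sketch. The helper name `prob_next_black` and its parameters are illustrative choices, not part of the lesson; it simply counts the black cards and total cards left in the deck.

```python
from fractions import Fraction

def prob_next_black(black_drawn, cards_drawn, black_total=26, deck_size=52):
    """Probability that the next card is black when drawing without replacement."""
    black_left = black_total - black_drawn
    cards_left = deck_size - cards_drawn
    return Fraction(black_left, cards_left)

first_draw = prob_next_black(0, 0)         # nothing drawn yet
after_12_black = prob_next_black(12, 12)   # first 12 cards were all black
after_12_red = prob_next_black(0, 12)      # first 12 cards were all red

print(first_draw, after_12_black, after_12_red)  # 1/2 7/20 13/20
```

The three results differ, which is exactly what rules out a constant probability of success across trials.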
Let's suppose that we have to sit a multiple choice test in a subject that we have not studied, and will therefore need to guess the answer to each question without using any knowledge of the subject matter. Let's say the paper contains $12$ questions, each with $5$ possible answers.
Can the number of questions we guess correctly be modelled by a binomial random variable?
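There is a fixed number of trials ($12$ questions), each guess is either correct or incorrect, the probability of a correct guess is $\frac{1}{5}=0.2$ for every question (assuming exactly one of the five answers is correct), and the guesses are independent of one another. Therefore yes, this situation can be modelled by a binomial distribution.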
In an experiment involving $n$ trials, there could be anywhere from $0$ to $n$ successes. As $p$ is the long-run proportion of successes over many trials, if the $n$ trials were to be repeated many times, we would expect the number of successes, on average, to be $np$. This number is the mean of the binomial distribution.
The actual number of observed successes varies about this mean, giving rise to a variance of $np\left(1-p\right)$.
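As an informal check of these two formulas, the following Python sketch repeats a binomial experiment many times and compares the sample mean and variance of the number of successes with $np$ and $np(1-p)$. The values $n=12$ and $p=0.2$ are borrowed from the guessing example above ($12$ questions, $5$ options each); the function name `simulate_successes`, the number of repetitions and the random seed are illustrative assumptions.

```python
import random
from statistics import mean, pvariance

def simulate_successes(n, p, rng):
    """Run n independent trials and count how many of them succeed."""
    return sum(rng.random() < p for _ in range(n))

n, p = 12, 0.2
rng = random.Random(1)   # fixed seed so the run is repeatable
repeats = 50_000         # hypothetical number of repetitions

counts = [simulate_successes(n, p, rng) for _ in range(repeats)]

sample_mean = mean(counts)
sample_variance = pvariance(counts)

print("sample mean    :", round(sample_mean, 3), " np      =", round(n * p, 3))
print("sample variance:", round(sample_variance, 3), " np(1-p) =", round(n * p * (1 - p), 3))
```

The simulated values will not match the formulas exactly, but they should settle close to them as the number of repetitions grows.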
Suppose $k$ successes and $n-k$ failures are observed. The number of ways this outcome can occur is $^nC_k$ or, in equivalent notation, $\binom{n}{k}$.
In fact, the probability of each possible number of successes corresponds to a term of the binomial expansion, which we explored in an earlier lesson.
If we have $n$ trials, each with a probability of success $p$ and a probability of failure $q$, we can find the probability of each number of successes by examining the terms of the expansion of $(p+q)^n$. If we take $n=4$, $p=0.3$ and $q=0.7$, the expansion is that of $\left(0.3+0.7\right)^4$.
The first term of the expansion aligns with the probability of $4$ successes as follows: $\binom{4}{0}0.3^4\left(0.7\right)^0$
The second term of the expansion aligns with the probability of $3$ successes as follows: $\binom{4}{1}0.3^3\left(0.7\right)^1$
The third term of the expansion aligns with the probability of $2$ successes as follows: $\binom{4}{2}0.3^2\left(0.7\right)^2$
The fourth term of the expansion aligns with the probability of $1$ success as follows: $\binom{4}{3}0.3^1\left(0.7\right)^3$
And the fifth term of the expansion aligns with the probability of $0$ successes as follows: $\binom{4}{4}0.3^0\left(0.7\right)^4$
Notice that instead of writing $\binom{4}{1}0.3^3\left(0.7\right)^1$ for $3$ successes, we could have written $\binom{4}{3}0.3^3\left(0.7\right)^1$, since binomial coefficients are symmetric: $\binom{n}{k}=\binom{n}{n-k}$, as each row of Pascal's triangle shows. In fact, this is exactly how we write it when defining a binomial random variable, because the lower number in the coefficient then matches the number of successes.
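The five terms listed above can be evaluated with a short Python sketch. The variable names are illustrative; the only library call used is `math.comb`, which returns the binomial coefficient.

```python
from math import comb, isclose

n, p, q = 4, 0.3, 0.7

# Term j of (p + q)^n, for j = 0..n, is C(n, j) * p^(n-j) * q^j.
# That term has n - j successes, so we store it under the key n - j.
terms = {n - j: comb(n, j) * p ** (n - j) * q ** j for j in range(n + 1)}

for successes, probability in terms.items():
    print(successes, "successes:", round(probability, 4))

total = sum(terms.values())
print("total:", round(total, 4))

# Symmetry of the coefficients: C(4, 1) equals C(4, 3)
print(comb(4, 1) == comb(4, 3))  # True
```

Because $p+q=1$, the terms add to $1$, as the probabilities of a complete distribution must.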
We can now calculate the probabilities associated with the outcomes of a binomial experiment. The probability of a particular instance of $k$ successes and $n-k$ failures must be $p^k\left(1-p\right)^{n-k}$. But, because there are $^nC_k$ ways in which this outcome can occur, we conclude that
$P\left(X=k\right)=\binom{n}{k}p^k\left(1-p\right)^{n-k}$
where $X$ is called a binomial random variable. It takes integer values from $0$ to $n$.
A binomial random variable $X$, with $n$ trials and a probability of success of $p$, has probabilities given by:
$P\left(X=k\right)=\binom{n}{k}p^k\left(1-p\right)^{n-k}$
The mean or expected value of $X$ is calculated by:
$E(X)=np$
The variance of $X$ is calculated by:
$\text{Var}(X)=np(1-p)$
And thus the standard deviation of $X$ is calculated by:
$\text{StDev}(X)=\sqrt{np(1-p)}$
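The summary formulas can be cross-checked against each other in a few lines of Python. This is a sketch under assumed values ($n=12$, $p=0.2$ are chosen only for illustration), and `binomial_pmf` is a hypothetical helper name: it computes the mean and variance directly from the probabilities and compares them with $np$ and $np(1-p)$.

```python
from math import comb, sqrt, isclose

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 12, 0.2   # illustrative values

probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed directly from the distribution
mean_from_pmf = sum(k * pk for k, pk in enumerate(probabilities))
variance_from_pmf = sum((k - mean_from_pmf) ** 2 * pk for k, pk in enumerate(probabilities))

# The same quantities from the formulas
mean_formula = n * p
variance_formula = n * p * (1 - p)
st_dev_formula = sqrt(variance_formula)

print(isclose(mean_from_pmf, mean_formula))          # True
print(isclose(variance_from_pmf, variance_formula))  # True
```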
Once we have defined a situation as being best modelled by a binomial random variable, we can use the formula shown above to calculate various probabilities. We can also use technology such as our CAS calculator or graphics calculator to determine probabilities far more quickly.
Suppose a binomial random variable $X$ is such that $n=14$ and $p=0.4$.
(a) Calculate $P(X=2)$
Think: Since we are finding the probability for exactly two successes, that is, an individual probability, we can either use the formula for a binomial random variable or we can use technology.
Do: Using the formula we have the following calculation:
$P\left(X=2\right)=\binom{14}{2}0.4^2\left(0.6\right)^{14-2}\approx0.0317$
To use technology instead, we will use the Binomial PDF function found in the statistics or command section of our calculator. PDF stands for Probability Density Function; for a discrete random variable such as $X$ this is more precisely called a probability mass function, and it gives the probability of exactly $k$ successes.
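For readers without a calculator to hand, the same individual probability can be sketched in Python using only the standard library. The function name `binomial_pmf` is an illustrative choice that mirrors the formula, not a calculator command.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p_exactly_2 = binomial_pmf(2, 14, 0.4)
print(round(p_exactly_2, 4))  # 0.0317
```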
(b) Calculate $P(X<2)$
Think: Now we are considering the probability of fewer than two successes. Fewer than two successes means either one success or no successes. We thus need to add $P(X=0)$ and $P(X=1)$. Again, we can do this using the formula or we can use technology.
Do: Using the formula we will have
$P\left(X=0\right)+P\left(X=1\right)=\binom{14}{0}0.4^0\left(0.6\right)^{14-0}+\binom{14}{1}0.4^1\left(0.6\right)^{14-1}\approx0.00810$
To use technology, we would use the Binomial CDF function, where CDF means Cumulative Distribution Function. A CDF adds up the probabilities over a range of values; here we need $P(X\le1)$, so the range runs from $0$ to $1$.
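A cumulative probability of this kind can be sketched in Python by summing individual probabilities. The helper names `binomial_pmf` and `binomial_cdf` are illustrative; `binomial_cdf(k, n, p)` is assumed here to mean $P(X\le k)$, so $P(X<2)$ is obtained with $k=1$.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """Probability of at most k successes: P(X <= k)."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# P(X < 2) is the same as P(X <= 1) because X takes whole-number values only
p_less_than_2 = binomial_cdf(1, 14, 0.4)
print(round(p_less_than_2, 5))  # 0.0081
```

Note the off-by-one trap: a strict inequality such as $X<2$ has to be converted to $X\le1$ before a cumulative function of this form is used.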
$X$ is a binomial variable with the probability mass function:
$P\left(X=k\right)=\binom{4}{k}\times\left(0.4\right)^k\times\left(0.6\right)^{4-k}$ for $k=0,1,2,3,4$.
What is the number of trials for this distribution?
What is the probability of success?
Which of the following is the probability $P\left(X=2\right)$?
$\binom{4}{2}\times\left(0.6\right)^2\times\left(0.4\right)^{4-2}$
$\binom{4}{2}\times\left(0.4\right)^2\times\left(0.6\right)^{4-2}$
$\binom{2}{4}\times\left(0.6\right)^4\times\left(0.4\right)^{4-2}$
$\binom{4}{2}\times\left(0.4\right)^2\times\left(0.4\right)^{4-2}$
How many ways can we get $2$ successes in the $4$ trials?
Calculate the probability $P\left(X=2\right)$.
$X$ is a binomial variable with the probability mass function:
$P\left(X=k\right)=\binom{5}{k}\times\left(0.2\right)^k\times\left(0.8\right)^{5-k}$ for $k=0,1,2,3,4,5$.
What is the number of trials for this distribution?
What is the probability of success?
Calculate the probability $P\left(X>3\right)$.
What is the most likely number of successes?
What is the mean of the distribution?
What is the variance of the distribution?
$E\left(X\right)=4.8$ and $\sigma\left(X\right)=\sqrt{2.88}$ for a binomial random variable $X$.
Find the probability of success $p$.
Find the number of trials $n$.