NZ Level 8 (NZC) Level 3 (NCEA) [In development]

Applications of Bernoulli random variables and probabilities

Lesson

Other chapters explain the mathematical form of the binomial distribution and its connection with Bernoulli trials. This chapter should be read in conjunction with Problems with Binomial Distributions, Bernoulli Mean and Variance, and Features of Binomial Distributions.

The title of this chapter refers to *Bernoulli random variables*, but this is essentially the same as considering problems to do with *binomial random variables*. We can justify this by the fact that a binomial random variable $X$, in an experiment with $n$ independent trials, is the sum of $n$ independent, identically distributed Bernoulli random variables, say, $X=B_1+B_2+\dots+B_n$.

As we have seen, this way of looking at a binomial random variable makes it easy to develop expressions for the mean and variance of the relevant binomial distribution.

$\mu_X=E\left(X\right)=np$

where $p$ is the probability of success on a Bernoulli trial, and

$Var\left[X\right]=np\left(1-p\right)$
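The two formulas can be checked empirically. The sketch below is only an illustration: the function names, the seed, the number of repeats and the values $n=10$, $p=0.3$ are assumptions chosen for the demonstration, not part of the lesson. It builds each binomial observation as a sum of Bernoulli outcomes and compares the sample mean and variance with $np$ and $np\left(1-p\right)$.

```python
import random

def bernoulli(p, rng):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0

def binomial_sample(n, p, rng):
    """One binomial observation, formed as B_1 + B_2 + ... + B_n."""
    return sum(bernoulli(p, rng) for _ in range(n))

def sample_mean_and_variance(n, p, repeats=20000, seed=1):
    """Repeat the n-trial experiment many times and summarise the results."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    xs = [binomial_sample(n, p, rng) for _ in range(repeats)]
    mean = sum(xs) / repeats
    variance = sum((x - mean) ** 2 for x in xs) / repeats
    return mean, variance

# Illustrative values (assumed): n = 10 trials, p = 0.3.
mean, variance = sample_mean_and_variance(n=10, p=0.3)
print("sample mean    :", mean, " theory np      =", 10 * 0.3)
print("sample variance:", variance, " theory np(1-p) =", 10 * 0.3 * 0.7)
```

With this many repeats, the sample values should sit close to the theoretical ones, although they will rarely match exactly.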

We also developed the formula for the probabilities associated with each possible number of successes in a sequence of $n$ Bernoulli trials.

$Pr\left(X=r\right)=\binom{n}{r}p^r\left(1-p\right)^{n-r}$
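As a rough sketch, the probability formula translates directly into a few lines of Python. The function name `binomial_pmf` is a label chosen here for illustration, and the check uses assumed values $n=5$, $p=0.25$.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Pr(X = r): probability of exactly r successes in n Bernoulli trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Sanity check: the probabilities for r = 0, 1, ..., n should add up to 1.
n, p = 5, 0.25  # assumed example values
total = sum(binomial_pmf(r, n, p) for r in range(n + 1))
print(total)
```

Summing over every possible value of $r$ is a quick way to catch a mistyped exponent or binomial coefficient.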

When considering situations that might be modelled as binomial experiments, it is essential to identify the features of the situation that will constitute a *trial*, a *success* and the *sequence of trials*.

A six-sided die is to be tossed $21$ times and the number of times neither a $3$ nor a $4$ occurs is to be recorded. If this is to be seen as a binomial experiment, what should we mean by a *trial*, a *success* and the *sequence of trials*? What is the probability of observing the expected value of the random variable representing the number of successes?

In this situation, a *trial* is a single throw of the die. A *success* is the non-occurrence of both a $3$ and a $4$. The experiment is the sequence of $21$ tosses.

By a counting argument, four of the six equally likely faces ($1$, $2$, $5$ and $6$) are neither a $3$ nor a $4$, so the probability of a success on a single trial is $\frac{4}{6}=\frac{2}{3}$. So, the expected or mean number of successes over the whole experiment is $21\times\frac{2}{3}=14$.

Then, the probability of observing this many successes is $\binom{21}{14}\times\left(\frac{2}{3}\right)^{14}\times\left(1-\frac{2}{3}\right)^{\left(21-14\right)}\approx0.182$.
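The die calculation can be reproduced step by step. The following is a minimal sketch, with variable names chosen only for illustration. It counts the favourable faces, forms the expected number of successes, and evaluates the binomial probability with exact fractions before converting to a decimal.

```python
from fractions import Fraction
from math import comb

faces = range(1, 7)
# A success is any face that is neither a 3 nor a 4.
success_faces = [f for f in faces if f not in (3, 4)]
p = Fraction(len(success_faces), len(faces))  # 4/6, reduced automatically

n = 21                 # number of tosses in the experiment
expected = n * p       # mean number of successes, np
r = int(expected)      # here np is a whole number, so it is an attainable value

# Pr(X = r) using the binomial probability formula.
prob = comb(n, r) * p**r * (1 - p)**(n - r)
print(p, expected, round(float(prob), 3))
```

Working with `Fraction` keeps the arithmetic exact until the final rounding, which avoids accumulating rounding error in the powers.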

A student with a flair for working with spreadsheets makes a programme in which a row of the spreadsheet has several adjacent cells that are individually coloured either red or green at random whenever the sheet is re-calculated. The formulas in the cells determine that each cell will be green with probability $0.25$ and red otherwise.

The fourth row has four such adjacent cells. The fifth row has five. The sixth has six, and so on. The student is interested in the probability that there will be exactly four green cells in any given row when the sheet is re-calculated.

What are the probabilities of observing *four *green cells for rows four, five and six? What is the highest probability attainable of seeing exactly four green cells as further rows are added?

You should play with the following applet, which does the same thing as the student's spreadsheet. You could use this simulation to calculate relative frequencies over many performances of the recalculation, and compare your experimental results with the theoretical calculations provided.

Think about the event 'no cells are green'. In which row do you think this event will occur most often? Check your intuition with the help of the applet.

Now, think about the event 'exactly one cell is green'. In which row should this event occur most often?

You might continue in this way, considering the event 'exactly two green cells'.

The question below is about the event 'exactly four green cells'. (The applet will not answer the question for you in this case. So, it will be necessary to do some calculations.) Can you explain why the first three rows need not be considered in this $4$-cell question?
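If the applet is not to hand, a small simulation can play the same role. The sketch below is not the student's spreadsheet or the applet itself. The function names, the seed and the number of recalculations are assumptions made for illustration. It colours each cell green with probability $0.25$ and reports the relative frequency of a chosen number of green cells in a row of a given length.

```python
import random

def recalculate_row(n_cells, p_green, rng):
    """Colour each cell independently and return how many came up green."""
    return sum(1 for _ in range(n_cells) if rng.random() < p_green)

def relative_frequency(n_cells, k_green, recalcs=10000, p_green=0.25, seed=0):
    """Fraction of recalculations in which exactly k_green cells are green."""
    rng = random.Random(seed)  # fixed seed so results are repeatable
    hits = sum(
        recalculate_row(n_cells, p_green, rng) == k_green
        for _ in range(recalcs)
    )
    return hits / recalcs

# Compare the events 'no cells are green' and 'exactly one cell is green'
# across the first few rows (row i has i cells).
for row in range(1, 7):
    print(row, relative_frequency(row, 0), relative_frequency(row, 1))
```

Changing `k_green` lets the same function explore the events 'exactly two green cells' and 'exactly four green cells'.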


Here, each row of the spreadsheet is a separate experiment. The coloured cells in a row are the Bernoulli trials for that experiment and a success occurs when a cell is coloured green. Letting the random variable $X_i$ be the number of successes in row $i$, we calculate as follows:

$Pr\left(X_4=4\right)=\binom{4}{4}\times0.25^4\times0.75^0\approx0.0039$

$Pr\left(X_5=4\right)=\binom{5}{4}\times0.25^4\times0.75^1\approx0.0146$

$Pr\left(X_6=4\right)=\binom{6}{4}\times0.25^4\times0.75^2\approx0.0330$

In this sequence of experiments, we see that the random variable $X_{16}$ has expected value $16\times0.25=4$ and we might suspect that the probability of observing exactly four green cells increases up to $n=16$ and decreases thereafter. Calculating as before, we have

$Pr\left(X_{14}=4\right)=\binom{14}{4}\times0.25^4\times0.75^{10}\approx0.2202$

$Pr\left(X_{15}=4\right)=\binom{15}{4}\times0.25^4\times0.75^{11}\approx0.2252$

$Pr\left(X_{16}=4\right)=\binom{16}{4}\times0.25^4\times0.75^{12}\approx0.2252$

$Pr\left(X_{17}=4\right)=\binom{17}{4}\times0.25^4\times0.75^{13}\approx0.2209$

$Pr\left(X_{18}=4\right)=\binom{18}{4}\times0.25^4\times0.75^{14}\approx0.2130$

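Thus, the evidence is strong that the highest attainable probability of seeing exactly $4$ green cells is approximately $0.2252$, and this occurs when $n=15$ and again when $n=16$. The two values are exactly equal, because $\frac{Pr\left(X_{n+1}=4\right)}{Pr\left(X_n=4\right)}=\frac{0.75\left(n+1\right)}{n-3}$, which equals $1$ when $n=15$, is greater than $1$ for smaller $n$ and is less than $1$ for larger $n$.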
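Rather than checking rows one at a time, the search can be automated. The sketch below is one possible approach. The function name and the cut-off of $40$ rows are assumptions made for illustration. It evaluates the probability of exactly four green cells for each row length and reports which row lengths attain the largest value.

```python
from math import comb

def prob_exactly_four(n, p=0.25):
    """Pr(X_n = 4) for a row of n cells, each green with probability p."""
    return comb(n, 4) * p**4 * (1 - p)**(n - 4)

# Rows with fewer than four cells cannot show four green cells,
# so the scan starts at n = 4. The upper limit of 40 is an assumed cut-off.
probs = {n: prob_exactly_four(n) for n in range(4, 41)}

best = max(probs.values())
# Collect every row length whose probability matches the maximum,
# allowing for tiny floating-point differences.
best_rows = [n for n, q in probs.items() if abs(q - best) < 1e-12]

print(best_rows, round(best, 4))
```

Collecting all maximising row lengths, rather than taking the first one found, guards against overlooking a tie between neighbouring rows.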

Paul is completing a quiz that consists of $7$ multiple-choice questions and that has a pass mark of $4$. Each question has $5$ possible answers, only one of which is correct. As he was too lazy to study for the quiz, Paul randomly guesses the answer to every question.

What is the probability that he achieves full marks?

What is the probability that he gets all of them wrong?

What is the probability that he passes the quiz?

Records show that half of all households in a city have broadband. What is the probability that fewer than $45\%$ of a random sample of $100$ households have broadband?

Give your answer as a decimal correct to four decimal places.

Yuri is playing a game in which he tosses a bunch of dice into the air and he wins if exactly $3$ of the dice land on three. What is the smallest number of dice he must toss to ensure that the probability that he wins is more than $15\%$?

Find the probability that Yuri will win if he tosses $n$ dice.

Hence find the smallest number of dice he must toss to ensure that the probability that he wins is more than $15\%$.

Investigate situations that involve elements of chance:
A. calculating probabilities of independent, combined, and conditional events
B. calculating and interpreting expected values and standard deviations of discrete random variables
C. applying distributions such as the Poisson, binomial, and normal

Apply probability distributions in solving problems