NZ Level 8 (NZC) Level 3 (NCEA) [In development]
The exact distribution of the sample proportions
Lesson

 

We introduce the idea of the probability distribution of the sample proportions with an example.

Imagine a computer is programmed to randomly select four whole numbers between $1$ and $10$ inclusive. The program is allowed to select any number more than once; in combinatorial terms, the selections are made with replacement.

A success is recorded when a number drawn is prime (there are four primes available, namely $2,3,5,7$). The six non-primes ($1,4,6,8,9,10$) are deemed failures.

For example, if the computer selects the numbers $2,2,6,9$ then we record the selection as $2$ successes. Likewise, $1,6,9,10$ would be recorded as $0$ successes and $5,5,1,5$ would be recorded as $3$ successes.

This program is an example of a Bernoulli process where exactly two outcomes are possible (either a success or a failure) and the probability of a success $p$ remains fixed for each of the four selections. The selection of each number is completely independent of the selection of any other number.
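The process described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `count_successes` and `draw_sample` and the seed value are assumptions made for this example, not part of the lesson.

```python
import random

PRIMES = {2, 3, 5, 7}  # the four primes between 1 and 10


def count_successes(sample):
    """Count how many numbers in the sample are prime (successes)."""
    return sum(1 for n in sample if n in PRIMES)


def draw_sample(rng, size=4):
    """Select `size` whole numbers from 1 to 10, with replacement."""
    return [rng.randint(1, 10) for _ in range(size)]


# The three selections discussed in the lesson
print(count_successes([2, 2, 6, 9]))   # 2
print(count_successes([1, 6, 9, 10]))  # 0
print(count_successes([5, 5, 1, 5]))   # 3

# One randomly generated selection (seed chosen arbitrarily)
rng = random.Random(1)
sample = draw_sample(rng)
print(sample, count_successes(sample))
```

Because each number is drawn independently and with replacement, every call to `rng.randint(1, 10)` has the same chance of producing a prime.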

Since there are four primes between $1$ and $10$, the probability of a success is given by $p=\frac{4}{10}=0.4$ and therefore the probability of a failure $q=1-p$ is $0.6$.

Using the binomial probability formula, the probability of $x$ successes in the four numbers selected becomes:

$P(x)=\binom{4}{x}(0.4)^x(0.6)^{4-x}$, for $x=0,1,2,3,4$

 

These probabilities can be listed in a table like this:

$x$   $P(x)$
$0$   $0.1296$
$1$   $0.3456$
$2$   $0.3456$
$3$   $0.1536$
$4$   $0.0256$
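The table entries can be checked numerically. The sketch below evaluates the binomial formula for each value of $x$; the helper name `binomial_pmf` is an assumption made for this example.

```python
from math import comb

n, p = 4, 0.4  # four selections, probability of a prime is 0.4


def binomial_pmf(x, n, p):
    """P(x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


pmf = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, prob in pmf.items():
    print(x, round(prob, 4))
```

Rounded to four decimal places, the printed values agree with the table, and the five probabilities sum to $1$.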

 

 

From the table it is clear that the most likely outcome is either $1$ or $2$ successes (primes) in any set of four numbers selected.

Indeed, the mean or expected value is $np=4\times0.4=1.6$ and the variance is $npq=4\times0.4\times0.6=0.96$.
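As a check, the mean and variance can also be computed directly from the probability distribution, using the definitions $\sum x\,P(x)$ and $\sum (x-\mu)^2 P(x)$. The following is a minimal sketch; the variable names are chosen for this example only.

```python
from math import comb

n, p = 4, 0.4

# Binomial probabilities P(x) for x = 0, 1, 2, 3, 4
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# Mean and variance from their definitions
mean = sum(x * pmf[x] for x in range(n + 1))
variance = sum((x - mean) ** 2 * pmf[x] for x in range(n + 1))

print(round(mean, 4), round(variance, 4))  # 1.6 0.96
```

Both values agree with the shortcut formulas for the mean and variance of a binomial distribution.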

 

The distribution of the sample proportion

We now imagine that the computer generates a sample of four numbers.

The proportion of successes $\hat{p}$ in our sample can only be one of five possibilities. If there are no primes, then there are no successes, and our proportion becomes $\frac{0}{4}=0$. If there is exactly one prime, the proportion becomes $\frac{1}{4}$, and so on. The probabilities of these proportions are the same as the binomial probabilities shown in the table above.

We create a new table showing the binomial probabilities alongside these proportions as follows:

 

$\hat{p}$   $P(\hat{p})$
$0$   $0.1296$
$0.25$   $0.3456$
$0.5$   $0.3456$
$0.75$   $0.1536$
$1$   $0.0256$

 

This table is the exact probability distribution of the sample proportion $\hat{p}$ for this particular computer program.
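One way to build this distribution programmatically is to divide each possible number of successes by the sample size and attach the corresponding binomial probability. The sketch below uses exact fractions to avoid rounding; the name `phat_dist` is an assumption made for this example.

```python
from fractions import Fraction
from math import comb

n = 4                 # sample size
p = Fraction(4, 10)   # probability that a selected number is prime

# Map each possible sample proportion x/n to its exact probability
phat_dist = {
    Fraction(x, n): comb(n, x) * p**x * (1 - p) ** (n - x)
    for x in range(n + 1)
}

for phat, prob in phat_dist.items():
    print(float(phat), float(prob))
```

Each row of the output pairs a value of $\hat{p}$ with the binomial probability of the matching number of successes.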

 

Questions arising

We can answer a number of probability questions about the distribution of sample proportions.

For example, referring to the computer program scenario above, we might ask what is the probability that $\hat{p}$ is less than $0.6$.

From the table, the answer is given by the sum $0.1296+0.3456+0.3456=0.8208$.

As another example, to find $P(\hat{p}<0.8|\hat{p}\ge0.25)$ we use the conditional probability law as follows:

$P(\hat{p}<0.8|\hat{p}\ge0.25)$ $=$ $\frac{P((\hat{p}<0.8)\cap(\hat{p}\ge0.25))}{P(\hat{p}\ge0.25)}$
  $=$ $\frac{0.3456+0.3456+0.1536}{0.3456+0.3456+0.1536+0.0256}$
  $=$ $\frac{0.8448}{0.8704}$
  $\approx$ $0.9706$
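Both of these calculations can be reproduced by summing the probabilities of the values of $\hat{p}$ that satisfy each condition. This is a minimal sketch; the helper `prob` and the dictionary name `phat_dist` are assumptions made for this example.

```python
# Exact distribution of the sample proportion, taken from the lesson's table
phat_dist = {0.0: 0.1296, 0.25: 0.3456, 0.5: 0.3456, 0.75: 0.1536, 1.0: 0.0256}


def prob(event):
    """Sum the probabilities of all proportions for which event(phat) is true."""
    return sum(pr for phat, pr in phat_dist.items() if event(phat))


# P(p-hat < 0.6)
p_less = prob(lambda v: v < 0.6)

# P(p-hat < 0.8 | p-hat >= 0.25) = P(0.25 <= p-hat < 0.8) / P(p-hat >= 0.25)
p_cond = prob(lambda v: 0.25 <= v < 0.8) / prob(lambda v: v >= 0.25)

print(round(p_less, 4))  # 0.8208
print(round(p_cond, 4))  # 0.9706
```

Writing each event as a condition on $\hat{p}$ keeps the code close to the probability notation used above.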
     

 

  

 

 


    

 

 

Worked Examples

QUESTION 1

A dog has three puppies.

Let $M$ represent the number of male puppies in this litter.

  1. If a dog has $3$ puppies, then the number of male puppies, $M$, can be $0$, $1$, $2$ or $3$.

    What are the values of the proportion, $\hat{P}$, of male puppies in the litter associated with each outcome of $M$?

    If $M=0$: $\hat{P}=$ $\editable{}$

    If $M=1$: $\hat{P}=$ $\editable{}$

    If $M=2$: $\hat{P}=$ $\editable{}$

    If $M=3$: $\hat{P}=$ $\editable{}$

  2. Construct the probability distribution for $M$ and $\hat{P}$ below.

    $m$   $0$   $1$   $2$   $3$
    $P(M=m)$   $\frac{1}{8}$   $\editable{}$   $\editable{}$   $\editable{}$
    $\hat{p}$   $0$   $\frac{1}{3}$   $\frac{2}{3}$   $1$
    $P(\hat{P}=\hat{p})$   $\editable{}$   $\frac{3}{8}$   $\editable{}$   $\editable{}$
  3. Use your answers from part 2 to determine $P(\hat{P}>\frac{1}{2})$.

QUESTION 2

In Western Australia it has been shown that $40\%$ of all voters are in favour of daylight saving. A sample of $5$ voters is selected at random from Western Australia.

  1. What are the possible values of the sample proportion, $\hat{P}$, of individuals in the sample who are in favour of daylight saving?

    Write your answers from smallest to largest in the empty boxes below, simplifying where possible.

    $\hat{P}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$
  2. Construct a probability distribution table which summarises the sample proportion of individuals from Western Australia who favour daylight saving.

    Give your answers correct to four decimal places.

    $\hat{p}$   $0$   $\frac{1}{5}$   $\frac{2}{5}$   $\frac{3}{5}$   $\frac{4}{5}$   $1$
    $P(\hat{P}=\hat{p})$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$
  3. Determine $P(\hat{P}<\frac{3}{5})$, using the results of part 2. Round your answer to four decimal places.

QUESTION 3

Two dice are rolled and the absolute value of the difference between the numbers appearing uppermost is recorded.

  1. Complete the table below that represents the sample space.

    Die $1$ (rows), Die $2$ (columns)
        1   2   3   4   5   6
    1   $0$   $\editable{}$   $\editable{}$   $3$   $\editable{}$   $\editable{}$
    2   $1$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$
    3   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $2$   $\editable{}$
    4   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$
    5   $4$   $\editable{}$   $2$   $\editable{}$   $\editable{}$   $\editable{}$
    6   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$
  2. Let $X$ be defined as the absolute value of the difference between the two dice. Construct the probability distribution for $X$ using the table below.

    Enter the values of $x$ from left to right in ascending order, and simplify each probability.

    $x$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$
    $P(X=x)$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$   $\editable{}$
  3. What is the probability, $p$, that $X>3$?

  4. Two dice were rolled $3$ times. Their absolute difference was recorded each time.

    Let $Y$ be the number of times the absolute difference was greater than $3$. Then $Y$ can be $0$, $1$, $2$ or $3$.

    What is $\hat{P}$, the sample proportion of absolute differences greater than $3$, associated with each outcome of $Y$?

    If $Y=0$: $\hat{P}=$ $\editable{}$

    If $Y=1$: $\hat{P}=$ $\editable{}$

    If $Y=2$: $\hat{P}=$ $\editable{}$

    If $Y=3$: $\hat{P}=$ $\editable{}$

  5. Construct the probability distribution for $Y$ and $\hat{P}$ below.

    Write each probability correct to four decimal places.

    $y$   $0$   $1$   $2$   $3$
    $P(Y=y)$   $\editable{}$   $\editable{}$   $0.0694$   $\editable{}$
    $\hat{p}$   $0$   $\frac{1}{3}$   $\frac{2}{3}$   $1$
    $P(\hat{P}=\hat{p})$   $\editable{}$   $\editable{}$   $\editable{}$   $0.0046$

  6. Use the results of part 5 to determine $P(\hat{P}<1)$.

    Round your answer to four decimal places.

Outcomes

S8-2

Make inferences from surveys and experiments: A determining estimates and confidence intervals for means, proportions, and differences, recognising the relevance of the central limit theorem B using methods such as resampling or randomisation to assess

91582

Use statistical methods to make a formal inference
