10. Samples and estimation

VCE 12 Methods 2023

10.05 Applications of confidence intervals

Lesson

Before we launch into some applications of confidence intervals, let's recap some of the important calculations and interpretations from the previous section.

Calculating a confidence interval

A confidence interval is calculated in the following way:

$\left(\hat{p}-k\times\sqrt{\frac{\hat{p}\times(1-\hat{p})}{n}},\ \hat{p}+k\times\sqrt{\frac{\hat{p}\times(1-\hat{p})}{n}}\right)$

where $\hat{p}$ is the sample proportion taken from our particular sample, $n$ is the size of our sample, and $k$ is the $z$-score associated with the level of confidence we wish to achieve.

Common confidence levels and their associated $z$-scores:

A $90\%$ confidence interval has $k\approx1.645$
A $95\%$ confidence interval has $k\approx1.960$
A $99\%$ confidence interval has $k\approx2.576$
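
As a minimal sketch of this calculation in Python (the function name `confidence_interval` is our own choice, not part of the lesson), the $z$-score $k$ can be read from the standard normal distribution using the standard library's `statistics.NormalDist`:

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(p_hat, n, level=0.95):
    """Approximate confidence interval for a population proportion.

    p_hat: sample proportion, n: sample size, level: e.g. 0.95 for 95%.
    """
    # k is the z-score that leaves (1 - level) / 2 in each tail.
    k = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = k * sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - margin, p_hat + margin)


# z-scores for the common confidence levels listed above
for level in (0.90, 0.95, 0.99):
    k = NormalDist().inv_cdf(1 - (1 - level) / 2)
    print(f"{level:.0%}: k = {k:.3f}")
```

The loop should reproduce the three $z$-scores in the list above, to three decimal places.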

Calculating the margin of error

The margin of error is the distance from $\hat{p}$ to either end of the confidence interval. This means we can calculate it in a number of ways.

If we have the confidence interval $(a,b)$ we can simply calculate the margin of error as $\frac{b-a}{2}$.
If we have $\hat{p}$ and the confidence interval $(a,b)$ we can calculate the margin of error as $b-\hat{p}$ or $\hat{p}-a$.
If we don't yet have the confidence interval we can use the portion of the confidence interval calculation that gives the margin of error, which is $k\times\sqrt{\frac{\hat{p}\times(1-\hat{p})}{n}}$, where $k$ and $n$ are as noted above.
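
The three approaches should agree with one another. A short Python sketch, using hypothetical helper names and illustrative values that are not taken from the lesson ($\hat{p}=0.3$, $n=400$, $k=1.96$), shows this:

```python
from math import sqrt


def margin_from_interval(a, b):
    """Half the width of the confidence interval (a, b)."""
    return (b - a) / 2


def margin_from_endpoint(p_hat, b):
    """Distance from the sample proportion to the upper endpoint b."""
    return b - p_hat


def margin_from_formula(p_hat, n, k):
    """k standard errors, where k is the z-score for the confidence level."""
    return k * sqrt(p_hat * (1 - p_hat) / n)


# Illustrative values only (assumed, not from the lesson).
p_hat, n, k = 0.3, 400, 1.96
m = margin_from_formula(p_hat, n, k)
a, b = p_hat - m, p_hat + m

# All three lines print the same margin of error.
print(round(m, 4))
print(round(margin_from_interval(a, b), 4))
print(round(margin_from_endpoint(p_hat, b), 4))
```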

Interpreting a confidence interval and the margin of error

Recall that a confidence interval is a range of values within which we expect the true population proportion, $p$, to lie, with the stated level of confidence.

The higher the level of confidence, the larger the margin of error we must tolerate, due to the "wider net" we have created to "catch" the true proportion $p$.

The size of the margin of error is influenced by the level of confidence, the size of the sample and the value of the sample proportion.
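
Each of these three influences can be checked numerically. The sketch below varies one quantity at a time while holding the others fixed; the function name `margin_of_error` and all the values used are assumptions chosen for illustration:

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, level):
    """Margin of error for sample proportion p_hat, sample size n, confidence level."""
    k = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return k * sqrt(p_hat * (1 - p_hat) / n)


# A higher confidence level gives a larger margin of error.
for level in (0.90, 0.95, 0.99):
    print(f"level={level:.2f}: {margin_of_error(0.5, 100, level):.4f}")

# A larger sample gives a smaller margin of error.
for n in (100, 400, 1600):
    print(f"n={n}: {margin_of_error(0.5, n, 0.95):.4f}")

# A sample proportion closer to 0.5 gives a larger margin of error.
for p_hat in (0.1, 0.3, 0.5):
    print(f"p_hat={p_hat}: {margin_of_error(p_hat, 100, 0.95):.4f}")
```

Because $n$ sits under a square root, quadrupling the sample size halves the margin of error.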

Testing claims

As confidence intervals form an interval estimate of the population proportion, we can use them as a rudimentary tool for assessing claims about the population proportion.

Worked example

A coin is tossed $250$ times and lands heads $105$ times.

(a) Find the sample proportion $\hat{p}$ of tosses that landed heads.

Think: $\hat{p}=\frac{x}{n}$, where $x$ is the number of 'heads' and $n$ is the sample size.

Do:

$\hat{p}=\frac{x}{n}=\frac{105}{250}=\frac{21}{50}$

(b) Create a $95\%$ confidence interval for the proportion of heads that we expect to appear when using this coin.

Think: Use the calculator with $x=105$, $n=250$ and confidence level $=95\%$ to obtain the confidence interval for the population proportion.

Do:

$95\%$ confidence interval: $\left(0.359,0.481\right)$
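
The calculator result can be checked by hand using the confidence interval formula from the recap, taking $k\approx1.960$. Intermediate values are rounded to four decimal places:

```latex
\begin{aligned}
\hat{p} &= \frac{105}{250} = 0.42 \\
k\times\sqrt{\frac{\hat{p}\times(1-\hat{p})}{n}} &\approx 1.960\times\sqrt{\frac{0.42\times0.58}{250}} \approx 1.960\times0.0312 \approx 0.0612 \\
\left(\hat{p}-0.0612,\ \hat{p}+0.0612\right) &= \left(0.3588,\ 0.4812\right) \approx \left(0.359,\ 0.481\right)
\end{aligned}
```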

(c) Assess whether or not the coin is fair.

Think: Look to see if the expected population proportion of heads for a fair coin lies within the confidence interval.

Do: If the coin were fair, the population proportion would be $0.5$; however, this is not within our $95\%$ confidence interval. This could mean either that we have an unusual sample, since $5\%$ of such intervals created from samples will not contain the population proportion, or that the coin is biased and is less likely to show heads than a fair coin.

We can state that, at a $95\%$ confidence level, the coin does not appear to be fair.

(d) A second coin is tossed $250$ times and lands heads $127$ times. Use a $95\%$ confidence interval to assess whether or not the coin is fair.

Think: Use the calculator to create the $95\%$ confidence interval and look to see if the expected proportion of heads, $0.5$, lies within the confidence interval.

Do: Using the calculator with $x=127$, $n=250$ and confidence level $95\%$, we obtain:

$95\%$ confidence interval: $\left(0.446,0.570\right)$

If the coin were fair, the population proportion would be $0.5$, and this is within our $95\%$ confidence interval. However, the population proportion may be anywhere within this range, so we cannot be sure that the coin is not in fact biased. Instead, we can say we have insufficient evidence to refute the claim that the coin is fair at a $95\%$ confidence level.

Reflect: At a given confidence level we can refute a claim if the asserted proportion does not lie within the confidence interval. However, we cannot accept a claim that a proportion is a certain value just because that value lies within the confidence interval. We can simply state that there is insufficient evidence to refute the claim.
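
The decision rule in this reflection can be sketched in Python. The function name `assess_claim` and its wording of the two verdicts are our own; the data are the two coins from the worked example:

```python
from math import sqrt
from statistics import NormalDist


def assess_claim(x, n, claimed_p, level=0.95):
    """Check whether a claimed proportion lies inside the confidence interval."""
    p_hat = x / n
    k = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = k * sqrt(p_hat * (1 - p_hat) / n)
    lower, upper = p_hat - margin, p_hat + margin
    if lower < claimed_p < upper:
        # A value inside the interval does not prove the claim.
        verdict = "insufficient evidence to refute the claim"
    else:
        verdict = "evidence to refute the claim"
    return (round(lower, 3), round(upper, 3)), verdict


print(assess_claim(105, 250, 0.5))  # first coin
print(assess_claim(127, 250, 0.5))  # second coin
```

Note that the function never returns "the claim is true": a claimed value inside the interval only means the sample gives no grounds to reject it.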

Practice questions

Question 1

In a sample of $350$ people, it is found that only $1$ has blood type B-negative.

Let $p$ represent the proportion of the population that have blood type B-negative.

(a) Find an estimate for $p$.

(b) Find an approximate two-sided $95\%$ confidence interval for $p$.

Give your answer as an interval in the form $\left(a,b\right)$, rounding all values to four decimal places.

(c) Select the most appropriate interpretation of the confidence interval found in part (b).

A. We are $95\%$ confident that the probability that a person has blood type B-negative is contained within this interval.
B. The probability that a person has blood type B-negative is not contained within this interval.
C. The probability that a person has blood type B-negative is contained within this interval.
D. There is a $95\%$ chance that the probability that a person has blood type B-negative is contained within this interval.

(d) One measure of the validity of a confidence interval is that the product of the sample size $n$ and the population proportion $p$ is greater than $5$.

Estimate this product for the blood type sample.

(e) Given the result of part (d), select the most appropriate statement below.

A. Since $np<5$ for our estimate, we cannot be sure that the sampling distribution is approximately normal and so the confidence interval is not valid.
B. Since $np>5$ for our estimate, we know that the sampling distribution is approximately normal and so the confidence interval is valid.
B

Question 2

Question 3

The proportion of the population of the United States thought to have Celiac disease is $p$. A sample of $2000$ Americans was surveyed for the disease and a confidence interval for the population proportion was calculated as $\left(0.0089,0.0121\right)$.

(a) How many people in this sample had the disease?

(b) Use the margin of error to find the $z$-score, $z$, for this confidence interval.

Round your answer to three decimal places.

(c) What is the level of confidence for this sample?

Give your answer as a percentage and round to the nearest percent.

Question 4

A chocolate company claims that $24\%$ of their chocolate drops are blue. Quiana buys a packet to test the claim, and out of $210$ chocolate drops, $49$ were blue.

(a) State the sample proportion $\hat{p}$ of blue chocolate drops.

(b) Construct an approximate two-sided $95\%$ confidence interval for the population proportion of blue chocolate drops.

Give your answer in the form $\left(a,b\right)$, rounding each endpoint to three decimal places.

(c) Is there evidence to refute the claim?

A. No, because the confidence interval for the proportion of blue chocolate drops expected contains the claimed proportion.
B. No, because the confidence interval for the proportion of blue chocolate drops expected does not contain the claimed proportion.
C. Yes, because the confidence interval for the proportion of blue chocolate drops expected does not contain the claimed proportion.
D. Yes, because the confidence interval for the proportion of blue chocolate drops expected contains the claimed proportion.

(d) Quiana is not convinced and buys a larger bag. This time, out of $2140$ chocolate drops, $472$ were blue. Does this sample offer evidence to refute the claim at a $95\%$ confidence level?

A. Yes, because the confidence interval for the proportion of blue chocolate drops expected contains the claimed proportion.
B. No, because the confidence interval for the proportion of blue chocolate drops expected does not contain the claimed proportion.
C. Yes, because the confidence interval for the proportion of blue chocolate drops expected does not contain the claimed proportion.
D. No, because the confidence interval for the proportion of blue chocolate drops expected contains the claimed proportion.

Outcomes

U34.AoS4.4

statistical inference, including definition and distribution of sample proportions, simulations and confidence intervals:
- distinction between a population parameter and a sample statistic and the use of the sample statistic to estimate the population parameter
- simulation of random sampling, for a variety of values of $p$ and a range of sample sizes, to illustrate the distribution of $\hat{P}$ and variations in confidence intervals between samples
- concept of the sample proportion $\hat{P}=\frac{X}{n}$ as a random variable whose value varies between samples, where $X$ is a binomial random variable which is associated with the number of items that have a particular characteristic and $n$ is the sample size
- approximate normality of the distribution of $\hat{P}$ for large samples and, for such a situation, the mean $p$ (the population proportion) and standard deviation $\sqrt{\frac{p(1-p)}{n}}$
- determination and interpretation of, from a large sample, an approximate confidence interval $\left(\hat{p}-z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p}+z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$ for a population proportion, where $z$ is the appropriate quantile for the standard normal distribution, in particular the $95\%$ confidence interval as an example of such an interval where $z\approx1.96$ (the term standard error may be used but is not required).

U34.AoS4.8

the concept of confidence intervals for proportions, variation in confidence intervals between samples and confidence intervals for estimates

U34.AoS4.12

simulate repeated random sampling and interpret the results, for a variety of population proportions and a range of sample sizes, to illustrate the distribution of sample proportions and variations in confidence intervals

U34.AoS4.13

calculate sample proportions and approximate confidence intervals for population proportions
