Suppose a statistic is to be calculated from numerical data obtained from a sample drawn from a population. It could be the mean, it could be a number that gives a measure of the spread of the data, or it could be some other numerical result derived from the sample data.
It is clear that the same statistic calculated from several different samples drawn from a population could be different for each sample.
In the questions accompanying this chapter, you are asked to explore the variation that can occur for the mean in some simple cases, and also to see how the sample mean relates to the population mean.
For a population of a known size, we can calculate how many possible samples of a given size can be drawn from it. This is a problem in combinatorics. For moderate-sized populations and samples, the number of possible samples is usually very large.
If from a population of 500 it is desired to draw a sample of 5 units, there would be $\frac{500!}{495!\,5!}\approx2.55\times10^{11}$ possibilities.
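The count above can be checked directly. The following is a minimal sketch using Python's standard `math.comb`; the variable names are illustrative only and are not part of the original text.

```python
import math

population_size = 500  # size of the population in the example
sample_size = 5        # number of units drawn

# Number of distinct samples when order does not matter
# and units are drawn without replacement.
num_samples = math.comb(population_size, sample_size)

print(num_samples)           # exact integer count
print(f"{num_samples:.2e}")  # the same count in scientific notation
```

Changing `population_size` and `sample_size` shows how quickly the number of possible samples grows, even for modest values.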
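It can be shown that the mean of all possible sample means is equal to the population mean. The sample means typically cluster around the population mean, with many of them close to it and progressively fewer of them as the distance from the true mean increases. For reasonably large samples this clustering is approximately symmetric, although for small samples from a skewed population it need not be.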
If the sample size is increased, we find that the sample means tend to lie closer to the true mean. Techniques exist that enable us to choose a sample size large enough to ensure that the sample mean is within a given distance of the true mean with a specified probability.
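For a very small population, the behaviour of the sample mean can be explored by listing every possible sample. The sketch below assumes a made-up population of five values (chosen only for illustration, not taken from the text) and, for several sample sizes, reports how many samples exist, whether the mean of all sample means equals the population mean, and the largest distance of any sample mean from the population mean. The helper names `sample_means` and `mean` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population values, chosen only for illustration.
population = [1, 2, 3, 4, 10]
population_mean = Fraction(sum(population), len(population))


def sample_means(values, size):
    """Mean of every possible sample of the given size, without replacement."""
    return [Fraction(sum(s), size) for s in combinations(values, size)]


def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)


for size in (2, 3, 4):
    means = sample_means(population, size)
    centre = mean(means)
    # Largest distance of any sample mean from the population mean.
    worst = max(abs(m - population_mean) for m in means)
    print(
        f"size={size}  samples={len(means)}  "
        f"mean of sample means equals population mean: {centre == population_mean}  "
        f"largest deviation: {float(worst)}"
    )
```

Exact fractions are used instead of floating-point numbers so that the equality check is not affected by rounding. Listing every sample is only practical for tiny populations, since the number of samples grows very quickly.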
Numbers derived from a sample are routinely used for estimating the corresponding quantities in the population from which the sample was drawn. The sample mean, for example, is said to be an estimator for the population mean.
Sometimes a proportion in a sample can be used to estimate a proportion in the population and, hence, the population size.
Membership of a club in a certain locality is open to anyone over the age of eighteen years living within the local geographic region. A survey of people eligible to be members finds that 2.125% of potential members actually are members. Club records show that there are currently 312 members.
From this information, it is possible to deduce approximately the number of people over the age of eighteen who live in the local area. We take the proportion of members in the sample to be an estimator for the proportion of members in the population. If the population size is $n$, then we have $\frac{2.125}{100}=\frac{312}{n}$. From this, we see that $n\approx14682$.
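The arithmetic in the club example can be reproduced in a few lines. This is a minimal sketch; the function name `estimate_population` is hypothetical and simply rearranges the proportion equation to $n = \text{count} \div \text{proportion}$.

```python
def estimate_population(known_count, sample_proportion):
    """Estimate the population size from a known count and a sample proportion.

    Assumes the proportion seen in the sample also holds in the population.
    """
    return known_count / sample_proportion


member_proportion = 2.125 / 100  # proportion of members found by the survey
members = 312                    # members according to club records

estimated_population = estimate_population(members, member_proportion)
print(round(estimated_population))
```

The result is only an estimate: a different survey sample would give a slightly different proportion and therefore a different value of $n$.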
This example should be compared with a similar one from ecology, presented in another chapter.
The heights (in cm) of a population of 3 people are $A$, $B$ and $C$.
List all possible samples of size 2 without replacement. For example, if the first 2 are selected we can write that as $AB$.
Use commas to separate different samples.
If $A=171$, $B=153$ and $C=162$, complete the following table:
| Sample Values (cm) | Sample Mean |
|---|---|
| $171$, $153$ | $\editable{}$ cm |
| $171$, $162$ | $\editable{}$ cm |
| $153$, $162$ | $\editable{}$ cm |
What is the mean of the distribution of all possible sample means?
What is the population mean?
Is the mean of the sample means equal to the population mean?
Yes
No
The weights (in kg) of a population of 5 people are $F$, $G$, $H$, $I$ and $J$.
List all possible samples of size 1 without replacement. For example, if the first weight is selected we can write that as $F$.
Use commas to separate different samples.
If $F=98$, $G=136$, $H=116$, $I=94$ and $J=130$, complete the following table:
| Sample Values (kg) | Sample Mean |
|---|---|
| $98$ | $\editable{}$ kg |
| $136$ | $\editable{}$ kg |
| $116$ | $\editable{}$ kg |
| $94$ | $\editable{}$ kg |
| $130$ | $\editable{}$ kg |
What is the mean of the distribution of all possible sample means?
Write your answer as a decimal.
What is the population mean?
Write your answer as a decimal.
Is the mean of the sample means equal to the population mean?
Yes
No
A survey asked $144$ randomly chosen students whether they were going to attend the school play, and $18$ said yes. If $204$ tickets are sold for the play, predict the number of students who attend the school, $x$.