In statistical experiments and surveys, we need to be precise in identifying the group of subjects that are being studied.
When the Australian Bureau of Statistics conducts a census, every person in the country is part of the study. The statistical term for this complete group of people is a population.
In another kind of study, researchers may wish to gather information about every koala living in a particular region in the south-east of the country. The population in this study would be all the koalas living in the region.
In practice, it can be difficult and expensive to gather information about every subject in a population. For this reason, researchers often choose to restrict a study to a sample drawn from the population. If the sample is selected carefully, it can provide good information about the whole population without the expense of conducting a census.
A piece of numerical information about a population is referred to as a population parameter. For example, the average weight of koalas living in a particular region is a population parameter.
The same information obtained from a sample from the population is called an estimator as it is used to estimate the population parameter. The average weight of a random sample of koalas drawn from the population is an estimator for the average weight of koalas in the population as a whole.
In statistical work, certain symbols are used, by convention, for some population parameters and their corresponding estimators.
As a rule, the population parameters are not known but are assumed to exist. The estimators are numbers called statistics. These are calculated from measurements taken on the sample.
The mean of a set of numerical results for a population is given the symbol $\mu$. The population mean, $\mu$, is usually not known precisely unless it is found through a census. However, it is estimated from a sample of $n$ measurements $x_1,x_2,x_3,\ldots,x_n$ by taking the average. This estimator is given the symbol $\overline{x}$, read as '$x$-bar'.
In symbols, we write $\overline{x}=\frac{1}{n}\sum_{i=1}^n x_i$.
The standard deviation of a population is given the symbol $\sigma$ (sigma). The estimator for $\sigma$ is the sample standard deviation, signified by the letter $s$.
To calculate the statistic $s$, we use $s=\sqrt{\frac{1}{n-1}\sum_{i=1}^n\left(x_i-\overline{x}\right)^2}$. Notice that to get the estimator $s$ for $\sigma$, we usually need the estimator $\overline{x}$ for $\mu$ as part of the calculation.
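The two formulas can be sketched in a few lines of Python. The koala weights below are made-up values chosen only to keep the arithmetic simple, and the variable names are illustrative. The last line cross-checks the hand-written formula against `statistics.stdev` from the standard library, which also divides by $n-1$.

```python
import math
import statistics

# Hypothetical weights (kg) of a sample of n = 5 koalas; illustrative values only.
weights = [7.0, 8.0, 9.0, 10.0, 11.0]
n = len(weights)

# Sample mean: x_bar = (1/n) * (sum of the x_i)
x_bar = sum(weights) / n

# Sample standard deviation: s = sqrt( (1/(n-1)) * sum of (x_i - x_bar)^2 )
# Note that x_bar is needed before s can be calculated.
s = math.sqrt(sum((x - x_bar) ** 2 for x in weights) / (n - 1))

print(x_bar)         # 9.0
print(f"{s:.3f}")    # 1.581

# Cross-check against the standard library's sample standard deviation.
print(math.isclose(s, statistics.stdev(weights)))  # True
```

Here the squared deviations are $4, 1, 0, 1, 4$, which sum to $10$; dividing by $n-1=4$ gives $2.5$, and the square root of $2.5$ is about $1.581$.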
The students belonging to a school are defined to be the population for a particular study. The students belonging to one class from the school might then be considered to be a sample from the population.
It is known from the school's enrollment records that the average age of the student population at the school is $15.25$ years. However, the average age of the students in the sample class is $16.5$ years.
In this case, we would say that the sample statistic was not a good estimator for the population parameter.
Similarly, the age range in the sample is likely to be not much more than $18$ months, which is not a good estimator for the age range of the school population, which could be something like $6$ years.
This example suggests that care is needed when using samples to estimate population parameters. The sample class did not represent the school population in an unbiased way.
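A small simulation can illustrate the point. The sketch below builds an entirely hypothetical school (six year levels of $100$ students, with ages spread evenly across each year, so its numbers differ from those quoted above) and compares two samples of $25$ students: a single class taken from the oldest year level, and a random selection from the whole school. All names and values are assumptions made for illustration.

```python
import random
import statistics

# Hypothetical school: 6 year levels of 100 students each.
# Ages in each year level are spread evenly over one year (an assumption).
population = [base_age + k / 100 for base_age in range(12, 18) for k in range(100)]

# Population parameter: known here only because we constructed the whole population.
mu = statistics.mean(population)

# Sample 1: one class of 25 students, all taken from the oldest year level.
one_class = population[-100:][:25]

# Sample 2: 25 students chosen at random from the whole school.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 25)

class_mean = statistics.mean(one_class)
random_mean = statistics.mean(random_sample)

print(f"population mean age:    {mu:.2f}")
print(f"single-class mean age:  {class_mean:.2f}")
print(f"random-sample mean age: {random_mean:.2f}")

# The age range tells a similar story.
print(f"population age range:   {max(population) - min(population):.2f} years")
print(f"single-class age range: {max(one_class) - min(one_class):.2f} years")
```

In this setup the single class contains only the oldest students, so its mean sits well above the population mean and its age range is a small fraction of the school's. The random sample draws from every year level, so its mean should land much closer to $\mu$.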
For a statistical survey, the population is deemed to be all the students who attend the local high school.
Which three of the following are samples of that population?
All the seniors.
The first $30$ students to arrive at school that day.
All the students that were in the library at lunch time.
All the teachers.
Some of the teachers.
For a statistical survey, the population is deemed to be all people in a city who play in any organised sporting competition.
Which three of the following are samples of that population?
$500$ spectators chosen from a weekend sports match.
The members of $3$ teams chosen from the local hockey tournament.
$100$ people chosen at random from a local park.
All students from a local school who compete in a school sports competition.
All the active members of a local football club.
A school held a fundraiser to raise money for their annual ski trip, and on average each student in the school raised $\$228$.
If one class in the school raised an average of $\$189$ per student, which of the following is possible?
$\overline{x}=189$, $\mu=228$, $s=5.8$, $\sigma=4$
$\overline{x}=228$, $\mu=189$, $s=5.8$, $\sigma=4$
$\overline{x}=189$, $\mu=228$, $s=-5.8$, $\sigma=-4$
$\overline{x}=228$, $\mu=189$, $s=4$, $\sigma=5.8$