Businesses, organisations and governments all gather data and conduct surveys to help them make decisions about what people want. The Australian Census, which is conducted by the Australian Bureau of Statistics, is an example of a large-scale data collection. Everyone in Australia on census night must be counted in the survey, so we get a picture of the characteristics of the Australian population. In a census, every member of a population is questioned. In maths, a population does not necessarily refer to the population of a country. It just means every member of a group. It may be a school's population, a sports club's population and so on.
In a census, every member of a population is surveyed. In an unbiased sample, a representative proportion of the population is surveyed.
Surveying every member of a population is the best way to gather information when it is possible. However, it is sometimes impractical or too expensive, so it is often better to take a sample that is representative of the wider population.
The most important thing when taking a sample is that it is representative of the population. In other words, we want to try to ensure there is no bias that may affect our results. There are different ways to collect a sample. We'll go through some of them now.
An example of simple random sampling is numbers being drawn out in the lottery. Each number is chosen at random (by chance), so every number has the same probability of being chosen. In the same way, in a simple random sample each individual in the population has the same probability of being chosen.
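As a rough illustration, a simple random sample can be drawn with a few lines of Python. This is only a sketch: the population of student numbers from 1 to 1000, the function name `simple_random_sample` and the optional `seed` are assumptions made for the example, not part of the lesson.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size different members, each equally likely to be chosen."""
    rng = random.Random(seed)  # an optional seed makes the draw repeatable
    return rng.sample(population, sample_size)

# Hypothetical population: student numbers 1 to 1000.
students = list(range(1, 1001))
chosen = simple_random_sample(students, 50, seed=1)
print(len(chosen))  # 50
```

Because `sample` draws without replacement, no student can be picked twice, much like lottery balls that are not put back.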
Think of a pack of jelly beans. There are lots of different colours in the pack, aren't there? Instead of considering them as a whole group of jelly beans, we could divide them up by colour into subgroups.
Stratification is the process of dividing a group into subgroups that share a characteristic before we draw our random sample. Then we look at the size of each subgroup as a fraction of the total population. The number of items chosen from each subgroup should make up the same fraction of the sample as that subgroup makes up of the total population.
For example, say we decide to survey $50$ students to find out what types of music the students at our high school like best. It is likely that Year $7$ students have a different taste in music to Year $12$ students.
Here is a list of how many students are in each year and how we would calculate the number of students from each year we would need to survey to create a stratified sample:
School Year | Number of Students | Proportional Number for Sample |
---|---|---|
$7$ | $200$ | $\frac{200}{1000}\times50=10$ |
$8$ | $180$ | $\frac{180}{1000}\times50=9$ |
$9$ | $200$ | $\frac{200}{1000}\times50=10$ |
$10$ | $140$ | $\frac{140}{1000}\times50=7$ |
$11$ | $100$ | $\frac{100}{1000}\times50=5$ |
$12$ | $180$ | $\frac{180}{1000}\times50=9$ |
Total | $1000$ | $50$ |
For stratified sampling, no individual should fit into more than one subgroup, and no group of the total population should be excluded.
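The proportional calculation for the school example can be checked with a short Python sketch. The year sizes and the sample size of 50 come from the example; the function name `proportional_allocation` is made up for illustration. Exact fractions are used so that nothing is rounded. In this example every answer is a whole number, but with other numbers you would need to decide how to round.

```python
from fractions import Fraction

# Number of students in each school year, from the school example.
year_sizes = {7: 200, 8: 180, 9: 200, 10: 140, 11: 100, 12: 180}
sample_size = 50

def proportional_allocation(group_sizes, sample_size):
    """Give each subgroup the same fraction of the sample as it has of the population."""
    total = sum(group_sizes.values())
    return {group: Fraction(size, total) * sample_size
            for group, size in group_sizes.items()}

allocation = proportional_allocation(year_sizes, sample_size)
for year, count in allocation.items():
    print(f"Year {year}: survey {count} students")

# The subgroup counts should add back up to the full sample size.
print(sum(allocation.values()))  # 50
```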
If we use systematic sampling, we pick every $n$th item. A starting point is chosen at random from the list of the population, and then items are chosen at regular intervals. For example, we may choose every fifth name from a list or call every tenth business in the phone book.
The image to the left shows every $3$rd person being picked.
This method is easy to carry out and, with a random starting point, helps to reduce bias. However, take care that the list doesn't have any patterns or clusters that could bias the sample.
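Systematic sampling can also be sketched in Python. The list of 30 placeholder names, the function name `systematic_sample` and the `seed` are assumptions for this example only. The starting point is chosen at random from the first few items, and then every 3rd item after it is picked.

```python
import random

def systematic_sample(items, interval, seed=None):
    """Pick a random starting point, then take every interval-th item after it."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random position within the first interval
    return items[start::interval]

# Hypothetical list of 30 people, in the order they appear on a list.
names = [f"Person {i}" for i in range(1, 31)]
picked = systematic_sample(names, 3, seed=0)
print(len(picked))  # 10
print(picked)
```

If the list were sorted in a repeating pattern, for example in groups of three friends, every pick would land on the same position in each group, which is the kind of pattern that can bias this method.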
Irene is interested in which students from her school catch public transport. Select whether the following sampling methods are likely to be biased or not.
Selecting every $10$th person on the bus she catches.
Biased
Not biased
Selecting every $10$th person on the student list.
Biased
Not biased
Selecting the first $50$ students that arrive in the morning.
Biased
Not biased
Selecting by having a computer randomly choose student numbers.
Biased
Not biased
The local mayor wants to determine how people in her town feel about the new construction project. Select which type of sampling each scenario uses.
Selecting every $50$th name from an alphabetical list of residents.
Stratified sampling
Systematic sampling
Convenience sampling
Simple random sampling
Giving each resident a random number between $1$ and $10$ and then selecting everyone with the number $3$.
Stratified sampling
Systematic sampling
Convenience sampling
Simple random sampling
Selecting $10\%$ of the residents from each suburb.
Stratified sampling
Systematic sampling
Convenience sampling
Simple random sampling
A group of people is divided into four teams: Blue, Red, Green and Yellow. The table shows the number of people in each team:
Team | Number of People |
---|---|
Blue | $180$ |
Red | $390$ |
Green | $240$ |
Yellow | $330$ |
How many people are there combined in all of the teams?
What fraction of people will be selected if a stratified sample of $38$ is to be taken from the group?
For the sample to be stratified, how many people should be chosen from the blue team?
For the sample to be stratified, how many people should be chosen from the green team?