When you want to know a statistic about a population, there are two ways to find it out: a census or a sample.
For example, let's say that your school is deciding to get some new tables and chairs, and wants to make sure that a student of "average height" will find them comfortable. You've been asked to figure out the average height of students in the school. Doing a census would mean getting the height of every single person in your school, and then averaging them. Doing a sample would mean getting the height of a smaller number of people (e.g. 50 people), and using their average height as an estimate of the average height of the people in the school. If you had to choose, which would you prefer to do, a census or a sample?
Most people would prefer to do a sample, because it is so much quicker and easier. However, doing a sample can have problems of its own, if you happen to choose a bad group of people to use as your "average". For example, see if you can figure out what is wrong with using these groups of people to figure out the average height of all students in your school:
1. Measure the heights of all the people in Year 7
2. Measure the heights of all the boys
3. Measure the heights of the first 50 people who walk out of the gate once school finishes
4. Measure the heights of the members of the school basketball team
All of these samples would be easier to do than a census, but would probably not get the right answer for the average height of students in the whole school. These are called biased samples, and need to be avoided at all costs when trying to figure out a statistic. There is no point in doing something quickly if you don't get the right answer! Biased samples are a serious problem, and can appear in a surprisingly large number of scenarios.
Come up with a non-biased sample for finding the average height of students in your school, and compare your method with those of other students to find the best possible method.
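If you'd like to see bias in action, here is a minimal Python sketch. It is only an illustration: the school, the mean heights for each year group and the "basketball team" (taken here to be the 12 tallest students) are all made-up assumptions, not real data. It compares a census with two biased samples and with a simple random sample, which is one common way of reducing bias but not necessarily the best method for your school.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical school: 200 students in each of Years 7 to 12.
# The mean heights (cm) for each year are made-up numbers for illustration.
MEAN_HEIGHT_BY_YEAR = {7: 150, 8: 156, 9: 162, 10: 166, 11: 169, 12: 171}

students = [
    {"year": year, "height": random.gauss(mean, 7)}
    for year, mean in MEAN_HEIGHT_BY_YEAR.items()
    for _ in range(200)
]


def average_height(group):
    """Mean height (cm) of a list of student records."""
    return statistics.mean(s["height"] for s in group)


# Census: measure every student.
census_mean = average_height(students)

# Biased sample: only the Year 7 students.
year7_mean = average_height([s for s in students if s["year"] == 7])

# Biased sample: a stand-in for the basketball team (the 12 tallest students).
team = sorted(students, key=lambda s: s["height"], reverse=True)[:12]
team_mean = average_height(team)

# Simple random sample: every student is equally likely to be picked.
random_mean = average_height(random.sample(students, 50))

print(f"Census (everyone):       {census_mean:.1f} cm")
print(f"Year 7 only:             {year7_mean:.1f} cm")
print(f"Basketball team:         {team_mean:.1f} cm")
print(f"Random sample of 50:     {random_mean:.1f} cm")
```

In this made-up school the Year 7 sample comes out too short and the basketball sample too tall, while the random sample should usually land much closer to the census figure. Try changing the seed or the sample size and see how much the random sample moves around.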
Despite the difficulty of using samples, they are used frequently in the real world. For example, consider the Growth Charts published by the World Health Organisation. These charts show the weight or length that babies should be at a particular age. Have a look at these charts for girls and boys.
It is important that these charts are accurate, as they are used to help doctors identify babies who are unwell and need medical attention. Despite the importance of accuracy, a sample was used instead of a census.
1. Why do you think a census was not done in this case?
2. How could the following factors affect the results of the sample?
(a) Number of babies
(b) Site for recruiting babies
(c) Method of recruiting the babies.
3. Come up with your own "perfect method" for trying to get a result that is as accurate as possible for how much an "average baby" should weigh at a given age.
(a) How many babies would you recruit?
(b) Where would you recruit the babies?
(c) How would you recruit the babies?
4. Have a talk with people around you and see how your methods compare.
Don't worry if your method doesn't seem perfect. It is actually incredibly difficult to make a sample work accurately for this kind of thing, and it requires highly trained statisticians and lots of complicated maths. If you want to see just how complicated, have a look at this 336-page document which outlines the method used by the World Health Organisation.
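To get a feel for why the number of babies matters, you could try a small simulation like the Python sketch below. The population is hypothetical: the 100,000 baby weights, their average of 7.5 kg and their spread of 0.9 kg are made-up numbers, not figures from the World Health Organisation. The sketch takes many random samples of different sizes and measures how much the sample averages jump around.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

# Hypothetical population: weights (kg) of 100,000 babies of the same age.
# The mean (7.5) and spread (0.9) are made-up values for illustration.
population = [random.gauss(7.5, 0.9) for _ in range(100_000)]


def spread_of_sample_means(sample_size, repeats=200):
    """Draw many random samples and return how much their means vary."""
    means = [
        statistics.mean(random.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


# Compare small, medium and large samples.
spread = {n: spread_of_sample_means(n) for n in (10, 100, 1000)}

for n, s in spread.items():
    print(f"{n:>5} babies per sample: sample means vary by about {s:.3f} kg")
```

Larger samples give averages that vary less from one sample to the next. Keep in mind that a bigger sample only reduces this random variation. It does not fix bias caused by where or how the babies were recruited.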
Let's say the government wants to find out the average income in Australia, as well as more specific information about how many people there are in different income brackets. Should we use a sample or a census?
1. Think about the advantages and disadvantages of using a sample
2. Think about the advantages and disadvantages of using a census
Income is a very difficult thing to measure with a sample, as it is very easy to end up with a biased sample. See if you can figure out why the following samples would be biased:
1. Asking 1000 people in Edgecliff
2. Asking 1000 people in Utopia
3. Calling 1000 random people from the phonebook of state capitals
4. Calling 1000 random mobile phone numbers
5. Checking the tax returns of 1000 random workers
6. Checking the yearly bank statements of 1000 random people
Now, see if you can come up with a "perfect sample" which would have minimum bias.
In reality, the government uses both samples and the census to derive income data for Australia. In order to prepare the "Household Income and Income Distribution" publication, the Australian Bureau of Statistics uses many different sources, including three different surveys of around 10000 households each and the data from the most recent census.
The reason for doing this is to combine the benefits of a sample with those of a census. Samples are relatively cheap and easy, which means they can be conducted frequently (for example, every year or every 3 months). This keeps the information relevant and up-to-date. By contrast, a census is very expensive and time-consuming, and as a result is only conducted every 5 years. However, a census has the most accurate information, being relatively free from bias. By using the accuracy of the census information to double-check or modify the information from the samples, it is possible to obtain data which is both accurate and timely.
Let's say that you have just been elected Prime Minister and you've decided you need good statistics about your country in order to run it better. You've decided to do a census, which involves getting information directly from every single member of the population. Have a go at the questions below to think about what sort of information you might want from your census, and how you would go about running it:
If you really were running the country, what kind of information do you think you would need to know, and why? Make a list of ten questions you would ask every Australian, and note how the answers to each one would influence your decision making.
As the government, which population are you interested in when you make decisions about how to run Australia? Is it just Australian citizens? Or are permanent residents important too? What about people from other countries who are only staying in Australia temporarily? Define your criteria for inclusion or exclusion in your census, and see if other students agree or disagree with you.
Deciding where to run your census seems pretty easy, right? Australia! But if you take a look at this website, you might find that "Australia" is a bigger place than you knew. What do you do with all those territories and islands?
Now that you've figured out what you're going to ask, who you're going to ask it to and where you are going to ask it, you have to actually decide how you will ask those people. Do you call them? Make them come to a polling centre? Post it to them? Email it to them? Facebook them? Think about the pros and cons of each of the methods.
Every method has a chance of not reaching certain groups of people. If these people are similar to the rest of Australia, then this is not too much of a problem. However, if the group of people who don't answer your census are different in some way, your results will become skewed by this bias. For example, if those who don't answer are poorer, less educated, or older, you will get results which falsely make Australia look richer, more educated and younger than it actually is. For the methods above, think about how they might skew the results of the census by missing out on important groups of people.
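The effect of missing a group of people can be sketched with a short Python simulation. Everything in it is an assumption made for illustration: the 100,000 incomes are randomly generated rather than real Australian data, and the response rates (60% for people below the median income, 90% for everyone else) are invented.

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

# Hypothetical population of 100,000 yearly incomes (dollars).
# The distribution and its parameters are made up for illustration.
incomes = [random.lognormvariate(11, 0.6) for _ in range(100_000)]
median_income = statistics.median(incomes)


def responds(income):
    """Assumed behaviour: people on lower incomes answer less often."""
    chance = 0.6 if income < median_income else 0.9
    return random.random() < chance


# Only the people who respond end up in the results.
respondents = [income for income in incomes if responds(income)]

true_mean = statistics.mean(incomes)
observed_mean = statistics.mean(respondents)

print(f"People who responded:      {len(respondents)} of {len(incomes)}")
print(f"True average income:       ${true_mean:,.0f}")
print(f"Average among respondents: ${observed_mean:,.0f}")
```

Because the people who don't answer are poorer on average, the respondents' average income comes out higher than the true average. The population looks richer than it really is, even though tens of thousands of people answered.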
When you're done making your own census, compare your census with the real one:
The real census included questions on demographics, income, education, employment, religion, ancestry, and languages spoken at home. The questions were carefully written to provide minimum ambiguity, with important points emphasised in italics to make sure they are not missed. Have a look here to see an example of a real census form.
See if you can find five questions which you did not include in your list, and think about how the government might use this information.
The real census involved everyone who was in Australia on the night of the census. Not just Australian citizens, or Australian permanent residents, or even those who were actually living in Australia at the time, but everyone who happened to be in Australia at the time, including temporary migrants (for example, international students) and visitors (for example, tourists).
See if you can think of some reasons why the following groups of people were included in the census:
1. Permanent residents (non-citizens who have a right to reside in Australia permanently)
2. International students (this article may help)
3. Tourists from overseas (this website may help)
The census covered all of Australia, including Antarctic and island territories.
Why do you think that they included these areas?
The government sent out paper forms to every household in Australia, and also offered an online option, "eCensus". Delivering and collecting these paper forms for the 2016 census was a huge logistical operation, involving millions of forms being delivered by 38000 form-deliverers. In total the census operation cost 470 million dollars, including 46 million dollars spent on printing the forms alone. The total number of pages printed for the census was 29750780882. For those who are curious, see if you can figure out how tall the stack of paper would be! This and this may help you.
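If you want to check your working for the paper stack, the arithmetic can be written as a few lines of Python. This is only a sketch: the sheet thickness of 0.1 mm and the idea that each sheet carries two pages (one on each side) are assumptions, so swap in your own values. The function name `stack_height_km` is made up for this example.

```python
def stack_height_km(pages, sheet_thickness_mm=0.1, pages_per_sheet=2):
    """Height in kilometres of a stack of paper holding the given number of pages."""
    sheets = pages / pages_per_sheet          # printed on both sides by default
    height_mm = sheets * sheet_thickness_mm   # total thickness in millimetres
    return height_mm / 1_000_000              # 1 km = 1,000,000 mm


# Number of pages as quoted in the text above.
pages_printed = 29750780882

print(f"Stack height: about {stack_height_km(pages_printed):,.1f} km")
```

Try it with single-sided printing (`pages_per_sheet=1`) or a different paper thickness and see how much the answer changes.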
Despite all the effort of distributing via paper, 63% of households used eCensus, the online method of completing a census. If the online method was so popular, why didn't the government just get rid of the paper forms and make everyone do it online? See if you can think of some reasons why they still use paper forms, and discuss amongst yourselves whether you think the government will stop using paper forms in the future.
The great cost of a census means that it is only done once every five years. The complexity of collecting all the forms and entering the data into a computer also makes the process of calculating the data very slow, and as a result it took nearly a year for the 2016 census results to be released. Again, think about how advances in technology may change this situation, and how they might affect the way that future censuses are run. The 2016 census also experienced a website outage, which increased the cost of running the census by 30 million dollars and was a source of embarrassment for the ABS, IBM and the government.
One thing we can say for certain about the future of the census is that, so long as you don't leave the country, you'll end up being a part of one!