In 1936 the US magazine Literary Digest ran a poll to predict who would win the US presidential election: Alf Landon or Franklin Roosevelt. It sent out 10 million questionnaires and got 2,266,566 responses back. The results of the poll indicated a victory for Landon with 57% of the vote. But come the actual election, Roosevelt won in a landslide with 63% of the vote to Landon's 37%. How could Literary Digest get it so wrong, and by such a large margin?
Meanwhile, George Gallup conducted a much smaller poll, comprising only 50,000 people, and correctly predicted Roosevelt's victory. Not only that, he also correctly predicted the incorrect result of the Literary Digest poll itself, using a sample far smaller than theirs but chosen to match the demographics of the Digest's sample.
The moral here is that the sampling method matters far more than the sample size. There were two problems with the way the Literary Digest constructed its sample. First, it was not a random sample, so it was not representative of the population. The names were drawn from car registrations, telephone directories, country club memberships and magazine subscription lists; and at that time, during the Great Depression, people who owned cars and telephones or held such memberships were more likely to be wealthy and to vote Republican.
But the selection bias produced by this sampling frame was not as severe as the nonresponse bias. Of the 10 million questionnaires sent out, only 2,266,566 came back, a response rate of just under 23%. Those who held strong opinions, particularly those who were dissatisfied with Roosevelt's performance as president and wanted change, were more likely to respond, while those who were satisfied were less inclined to complete the questionnaire.
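The claim that the sampling method matters more than the sample size can be illustrated with a small simulation. All the numbers below are illustrative assumptions, not the actual 1936 data: we invent an electorate that is 62% for Roosevelt and suppose Landon voters are four times as likely to appear in the biased sampling frame.

```python
import random

random.seed(1)

# Hypothetical electorate of 1,000,000 voters: 62% Roosevelt (illustrative
# numbers only, not the real 1936 figures).
population = ["Roosevelt"] * 620_000 + ["Landon"] * 380_000
random.shuffle(population)

def support(sample):
    """Fraction of a sample supporting Roosevelt."""
    return sum(v == "Roosevelt" for v in sample) / len(sample)

# Biased sampling frame: suppose every Landon voter is reachable (cars,
# phones, club memberships) but only 25% of Roosevelt voters are --
# an assumed 4x over-representation of Landon supporters.
frame = [v for v in population if v == "Landon" or random.random() < 0.25]

big_biased = random.sample(frame, 100_000)       # huge but biased sample
small_random = random.sample(population, 1_000)  # small but random sample

print(f"True Roosevelt support:     {support(population):.1%}")
print(f"Biased sample of 100,000:   {support(big_biased):.1%}")
print(f"Random sample of 1,000:     {support(small_random):.1%}")
```

However the random seed is chosen, the enormous biased sample lands far from the truth, while the random sample a hundred times smaller comes within a couple of percentage points of it.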
These are just two of the many biases that sampling methods can be subject to. A closely related bias is voluntary response bias, which arises when the respondents themselves decide whether to join the sample. A good example is a TV or radio station trying to gauge public opinion by asking its viewers or listeners to call in or to take part in an online poll. Worse still are reality TV shows like Australian Idol that ask viewers to vote for their favourite contestant. It is important to keep in mind that the winner is not necessarily the contestant with the largest number of admirers among viewers, because the people who vote tend to feel strongly about particular contestants and so are not representative of all viewers. The problem is made worse by the fact that any one person can vote an unlimited number of times.
To see the problem voluntary response bias creates, consider the following example. A US television poll asked viewers "Do you support the President's economic plan?" (The president at the time was Bill Clinton.) The table below shows the results of this poll alongside the results of a properly conducted survey by a market research company.
| | Television Poll | Proper Survey |
| --- | --- | --- |
| Yes | 42% | 75% |
| No | 58% | 18% |
| Not sure | 0% | 7% |
As you can tell, there is a big difference between the two sets of numbers, and this is due to the voluntary response bias in the television poll. The respondents themselves chose to be included in the sample and, as the results show, most of them did not support the plan. Furthermore, the television poll offered no "Not sure" option, which made its results even more misleading.
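The arithmetic behind this distortion is worth sketching. The response rates below are illustrative assumptions only (they are not taken from the actual poll): we suppose the true split matches the proper survey and that opponents of the plan are five times as likely to bother calling in as supporters.

```python
# Illustrative only: suppose the true split matches the proper survey
# (75% yes, 25% no, ignoring "not sure" for simplicity), and that opponents
# call in at five times the rate of supporters.
true_yes, true_no = 0.75, 0.25
rate_yes, rate_no = 0.02, 0.10  # assumed call-in rates

observed_yes = true_yes * rate_yes / (true_yes * rate_yes + true_no * rate_no)
print(f"Observed 'yes' share: {observed_yes:.0%}")  # well below the true 75%
```

Even though supporters are a three-to-one majority, the differing call-in rates drag the observed "yes" share down to around 38%, which is how a voluntary response poll can report a minority where a majority actually exists.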
Another kind of sampling that leads to unrepresentative data is convenience sampling, in which the researcher selects the people who are easiest to recruit. A good example would be trying to gauge public opinion on an issue by asking a few of your friends for their thoughts, simply because they are easy to get hold of. Another example would be conducting a survey at a shopping centre.
A response bias occurs when the phrasing of questions leads people to give responses that do not reflect their true beliefs. For example, a Roper poll in 1993 asked "Does it seem possible or does it seem impossible to you that the Nazi extermination of the Jews never happened?" The relatively high number of "possible" responses shocked many people. But given the convoluted structure of the sentence, many respondents may not have fully understood the question and hence may not have given a response that reflected what they truly believed. This suspicion was tested when Roper repeated the poll with a revised, clearer question: "Does it seem possible to you that the Nazi extermination of the Jews never happened, or do you feel certain that it happened?" The table below shows the responses to both questions.
| Original poll | | Revised poll | |
| --- | --- | --- | --- |
| Impossible | 65% | Certain that it happened | 91% |
| Possible | 22% | Possible that it never happened | 1% |
Following a proposal by a US politician to offer driver's licenses to illegal immigrants, a CNN poll asked viewers "Would you be more or less likely to vote for a presidential candidate who supports giving driver's licenses to illegal aliens?" Given the loaded nature of the question, it is not at all surprising that 97% of respondents answered "less likely."

Explain why the following samples are biased:

1. A local council wants to know people's opinions about the mayor's performance and so decides to hold a community forum, inviting members of the public to attend and voice their opinions.
2. As part of an assignment, you are required to determine the most supported NRL team among students at your school.