When collecting data, we have to decide whether to conduct a census, in which data is collected from every individual in the population, or to take a sample. A sample is a group of people or objects taken from a larger population for measurement. Let's explore different ways to collect a sample.
An important consideration when taking a sample is ensuring that the group is representative of the entire population. In other words, we want to make sure there is no bias that may affect our results. Bias can arise when members are selected based on characteristics such as age, gender, or interests.
A sampling method is biased when some members of the population have a higher or lower chance of being selected than others. A sample collected this way is called a biased sample.
In convenience sampling, we sample a group of people or a set of objects because it is easy to do. For example, you may want to survey people about the type of public transportation they take, and stand outside a bus station to gather data. Could this lead to any biases in the data?
In simple random sampling, every person or object has the same probability of being chosen. One example is the drawing of lottery numbers: every number has an equal chance of being chosen. This is an example of unbiased sampling.
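As a minimal sketch of simple random sampling, the snippet below draws numbers the way a lottery might. The helper name `simple_random_sample`, the range of 1 to 40, and the draw size of 6 are all assumptions made for illustration; they do not come from a real lottery.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Return k distinct items; every item is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(population, k)  # sampling without replacement


# Hypothetical lottery: numbers 1 to 40, six numbers drawn.
lottery_numbers = list(range(1, 41))
draw = simple_random_sample(lottery_numbers, 6, seed=1)
print(draw)
```

Because `random.sample` chooses without replacement and treats every item alike, no number has a better chance of being drawn than any other.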
Stratification is the process of dividing a population into subgroups that share the same characteristics before we draw our random sample. We then look at the size of each subgroup as a fraction of the total population. The number of items taken from each subgroup should be in the same proportion as that subgroup's share of the total population.
No person or object should fit into more than one subgroup, and no group of the total population should be excluded.
For example, suppose we decide to sample 50 jelly beans from a collection of 1000. Here is a list of how many jelly beans we have of each color and how we would calculate the number of each color to collect to create a stratified sample:
| Color | Number of jelly beans | Proportional number for sample |
|---|---|---|
| Red | 200 | $\dfrac{200}{1000} \times 50 = 10$ |
| Yellow | 180 | $\dfrac{180}{1000} \times 50 = 9$ |
| Blue | 200 | $\dfrac{200}{1000} \times 50 = 10$ |
| Green | 140 | $\dfrac{140}{1000} \times 50 = 7$ |
| Black | 100 | $\dfrac{100}{1000} \times 50 = 5$ |
| Purple | 180 | $\dfrac{180}{1000} \times 50 = 9$ |
| Total | 1000 | 50 |
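The proportional calculation in the table can be sketched in a few lines of code. The function name `proportional_allocation` is an assumption made for illustration; the counts are the jelly bean numbers from the table.

```python
from fractions import Fraction


def proportional_allocation(strata_sizes, sample_size):
    """Give each subgroup a share of the sample equal to its share of the population."""
    total = sum(strata_sizes.values())
    allocation = {}
    for name, size in strata_sizes.items():
        share = Fraction(size * sample_size, total)  # exact arithmetic, no rounding
        # Whole-number shares become ints; other shares stay as fractions
        # so that the reader can decide how to round them.
        allocation[name] = int(share) if share.denominator == 1 else share
    return allocation


jelly_beans = {
    "red": 200, "yellow": 180, "blue": 200,
    "green": 140, "black": 100, "purple": 180,
}
allocation = proportional_allocation(jelly_beans, 50)
print(allocation)
# {'red': 10, 'yellow': 9, 'blue': 10, 'green': 7, 'black': 5, 'purple': 9}
```

In this example every share is a whole number. With other subgroup sizes the shares may not be whole, and some rounding rule would be needed so that the sample still adds up to the intended size.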
If we use systematic sampling, we pick every nth item. A starting point is chosen at random from the population, and items are then chosen at regular intervals. For example, we may choose every 5th name from an alphabetical list, or choose every 10th chocolate at a factory for quality testing.
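Here is a minimal sketch of systematic sampling, assuming the population is held in an ordered list. The function name `systematic_sample` and the list of 50 placeholder names are hypothetical.

```python
import random


def systematic_sample(items, step, seed=None):
    """Pick a random starting point in the first interval, then take every step-th item."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random index from 0 to step - 1
    return items[start::step]


# Hypothetical alphabetical list of 50 names; choose every 5th name.
names = [f"name_{i:02d}" for i in range(1, 51)]
chosen = systematic_sample(names, 5, seed=3)
print(chosen)
```

With 50 names and a step of 5, the sample always contains 10 names, whichever starting point is drawn.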
The local mayor wants to determine how people in her town feel about the new construction project. Select which type of sampling each scenario uses.
Selecting every 50th name from an alphabetical list of residents.
Giving each resident a random number between 1 and 10 and then selecting everyone with the number 3.
Selecting 10% of the residents from each suburb.
Beth is interested in which students from her school catch public transport. Select whether the following sampling methods are likely to be biased or not.
Selecting every 10th person on the bus she catches.
Selecting every 10th person on the student list.
Selecting the first 50 students that arrive in the morning.
Selecting by having a computer randomly choose student numbers.
The four sampling methods are convenience sampling, simple random sampling, stratified sampling, and systematic sampling.
To have an unbiased sampling method, we want everyone in the population to have an equal chance of being selected.