8. Probability & Statistics

Lesson

The most important thing when taking a sample is that it is representative of the population. In other words, we want to make sure there is no bias that may affect our results. There are different ways to collect a sample. We'll go through some of them now.

An example of random sampling is the numbers drawn in a lottery. Each individual is chosen at random (by chance), so every individual has the same probability of being chosen.
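The lottery idea can be sketched in a few lines of Python. This is only an illustration: the pool of 40 numbered balls, the draw of 6, and the fixed seed are assumptions made for the example, not part of the lesson.

```python
import random

# Hypothetical population: 40 lottery balls numbered 1 to 40.
population = list(range(1, 41))

# A seeded generator is used only so the draw can be repeated exactly.
rng = random.Random(2024)

# random.sample draws without replacement, and every ball is equally
# likely to end up in the sample.
draw = rng.sample(population, k=6)
print(sorted(draw))
```

Because the draw is made without replacement, no ball can appear twice, just as in a real lottery.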

Think of a pack of jelly beans. There are lots of different colors in the pack, aren't there? Instead of considering them as one whole group of jelly beans, we could divide them up by color into subgroups.

Stratification is the process of dividing a group into subgroups whose members share a characteristic before we draw our random sample. Then we look at the size of each subgroup as a fraction of the total population. Each subgroup should make up the same fraction of the sample as it does of the total population.

For example, say we decide to survey $50$ students to find out what types of music the students at our high school like best. Year $7$ students are likely to have different tastes in music from Year $12$ students, so it makes sense to stratify by year level.

Here is a list of how many students are in each year and how we would calculate the number of students from each year we would need to survey to create a stratified sample:

School Grade | Number of Students | Proportional Number for Sample |
---|---|---|
$7$ | $200$ | $\frac{200}{1000}\times 50=10$ |
$8$ | $180$ | $\frac{180}{1000}\times 50=9$ |
$9$ | $200$ | $\frac{200}{1000}\times 50=10$ |
$10$ | $140$ | $\frac{140}{1000}\times 50=7$ |
$11$ | $100$ | $\frac{100}{1000}\times 50=5$ |
$12$ | $180$ | $\frac{180}{1000}\times 50=9$ |
Total | $1000$ | $50$ |

Remember!

No individual should fit into more than one subgroup, and no group of the total population should be excluded.
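The school survey example can be reproduced with a short Python sketch. The enrollment figures and the sample size of $50$ come from the example; the student ID format and the fixed seed are hypothetical and exist only so the code has something to draw from.

```python
import random

# Number of students in each year, as in the school survey example.
students_per_year = {7: 200, 8: 180, 9: 200, 10: 140, 11: 100, 12: 180}
sample_size = 50
total = sum(students_per_year.values())  # 1000 students in the school

# Each year's share of the sample matches its share of the school.
# These figures divide evenly; with other data the rounded shares may
# not add up to the sample size and would need adjusting.
allocation = {
    year: round(count / total * sample_size)
    for year, count in students_per_year.items()
}
print(allocation)  # {7: 10, 8: 9, 9: 10, 10: 7, 11: 5, 12: 9}

# Draw a simple random sample inside each subgroup.
rng = random.Random(1)  # seeded only for repeatability
sample = []
for year, count in students_per_year.items():
    # Hypothetical student IDs such as "Y7-001"; each ID belongs to one year only.
    roster = [f"Y{year}-{i:03d}" for i in range(1, count + 1)]
    sample.extend(rng.sample(roster, allocation[year]))

print(len(sample))  # 50
```

Since every student ID sits in exactly one roster and every year is sampled, no individual is counted twice and no group is left out.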

If we use systematic sampling, we pick every $n^{\text{th}}$ item. A starting point is chosen at random from the list, and then items are chosen at regular intervals. For example, we may choose every fifth name from an alphabetical list.
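Here is a minimal sketch of systematic sampling in Python. The function name `systematic_sample`, the list of 100 placeholder residents, and the seed are all assumptions made for illustration.

```python
import random

def systematic_sample(items, step, rng=random):
    """Pick a random start among the first `step` items, then take every `step`-th item."""
    start = rng.randrange(step)  # random starting index from 0 to step - 1
    return items[start::step]

# Hypothetical alphabetical list of 100 residents.
names = [f"Resident {i:03d}" for i in range(1, 101)]

# Choose every fifth name, starting from a randomly chosen position.
chosen = systematic_sample(names, step=5, rng=random.Random(7))
print(len(chosen))  # 20
```

Only the starting point is random. Once it is fixed, the rest of the sample is fully determined by the interval.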

The local mayor wants to determine how people in her town feel about the new construction project. Select which type of sampling each scenario uses.

Selecting every $50$th name from an alphabetical list of residents.

A. Stratified sampling

B. Systematic sampling

C. Convenience sampling

D. Simple random sampling

Giving each resident a random number between $1$ and $10$ and then selecting everyone with the number $3$.

A. Stratified sampling

B. Systematic sampling

C. Convenience sampling

D. Simple random sampling

Selecting $10\%$ of the residents from each suburb.

A. Stratified sampling

B. Systematic sampling

C. Convenience sampling

D. Simple random sampling

The owner of a cinema wants to use stratified sampling to survey the people who come to the cinema.

Which two of the following methods would be considered stratified sampling?

A. Interview $10\%$ of the people who used the candy bar and $10\%$ of the people who didn't.

B. Interview every person who sees a romantic movie.

C. Interview $10\%$ of the people from each movie.

D. Interview every $10$th person who purchases a ticket.

Beth is interested in which students from her school catch public transport. Select whether the following sampling methods are likely to be biased or not.

Selecting every $10$th person on the bus she catches.

A. Biased

B. Not biased

Selecting every $10$th person on the student list.

A. Biased

B. Not biased

Selecting the first $50$ students that arrive in the morning.

A. Biased

B. Not biased

Selecting by having a computer randomly choose student numbers.

A. Biased

B. Not biased

Understand that statistics can be used to gain information about a population by examining a sample of the population; generalizations about a population from a sample are valid only if the sample is representative of that population. Understand that random sampling tends to produce representative samples and support valid inferences.

Use data from a random sample to draw inferences about a population with an unknown characteristic of interest. Generate multiple samples (or simulated samples) of the same size to gauge the variation in estimates or predictions.
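One way to gauge how much estimates vary from sample to sample is to simulate many samples of the same size. The sketch below assumes a made-up school of $1000$ students in which $350$ catch public transport; these figures, the sample size of $50$, and the $200$ repetitions are invented for illustration only.

```python
import random
import statistics

rng = random.Random(42)  # seeded only so the simulation can be repeated

# Hypothetical population: 1 = catches public transport, 0 = does not.
population = [1] * 350 + [0] * 650

def sample_proportion(pop, n, rng):
    """Estimate the population proportion from one simple random sample of size n."""
    return sum(rng.sample(pop, n)) / n

# Take 200 separate samples of 50 students and record each estimate.
estimates = [sample_proportion(population, 50, rng) for _ in range(200)]

print("smallest estimate:", min(estimates))
print("largest estimate:", max(estimates))
print("mean of estimates:", statistics.mean(estimates))
print("spread (standard deviation):", statistics.stdev(estimates))
```

Individual estimates differ from sample to sample, but they cluster around the true proportion of $0.35$. The spread of the estimates gives a sense of how far a single sample might be from the truth.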