We can compare samples from two different populations to draw inferences about those populations without having to gather data on every individual in each one.
By using the measures of central tendency of a data set (that is, the mean, median, and mode), as well as measures of spread (such as the range, interquartile range and mean absolute deviation), we can make clear comparisons and contrasts between different groups.
We can also benefit from examining the shape of the distribution of two sets of data when comparing them.
Suppose you want to know whether children's cereals available in your local grocery store have more sugar than adult cereals. You randomly select $20$ boxes of children's cereals and $20$ boxes of adult cereals and measure the percentage of each serving's weight that is sugar. Your results can be summarized in the following double box plot:
| | Sample median (%) | IQR (%) |
|---|---|---|
| Adult cereals | $11$ | $12.5$ |
| Children's cereals | $46$ | $6.5$ |
Consider the answers to the following questions:
In the exploration above we saw that the samples of the two different populations had a difference in medians that was much larger than either interquartile range: a difference of $35$ percentage points, almost three times the larger IQR of $12.5$. This supports the conclusion that there is a meaningful difference between the populations.
In general, if the difference in centers between two population samples is $2$ or more times the measure of variability, we can say that there is likely a meaningful difference between the populations. Otherwise, we do not have sufficient evidence to support a difference in the populations.
In general, we can say that there is likely a meaningful difference between two populations if
If measurements from the samples do not show either of the above, then no conclusion can be drawn.
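The rule of thumb above can be checked with a few lines of Python. This is only a sketch. The function name `centers_differ_meaningfully` is hypothetical, and comparing against the larger of the two spreads is an assumption made here to keep the check conservative; the lesson does not say which spread to use when the two samples differ.

```python
def centers_differ_meaningfully(center_a, center_b, spread_a, spread_b, factor=2.0):
    """Apply the "2 or more times the variability" rule of thumb.

    Assumption: the difference in centers is compared against the
    larger of the two spreads, which is the more cautious choice.
    Returns the ratio and whether it reaches the chosen factor.
    """
    difference = abs(center_a - center_b)
    spread = max(spread_a, spread_b)
    ratio = difference / spread
    return ratio, ratio >= factor


# Cereal example from the table: medians of 11% and 46%, IQRs of 12.5% and 6.5%.
ratio, meaningful = centers_differ_meaningfully(11, 46, 12.5, 6.5)
print(f"Difference in medians is {ratio:.1f} times the larger IQR.")
print(f"Likely a meaningful difference: {meaningful}")
```

For the cereal samples the ratio is well above $2$, which agrees with the conclusion drawn in the exploration.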
The following box-and-whisker plot shows the number of points scored by two basketball teams in each of their matches last season.
[Box plots: Team A and Team B]
What is the median score of Team A?
What is the median score of Team B?
What is the range of Team A’s scores?
What is the range of Team B’s scores?
What is the interquartile range of Team A’s scores?
What is the interquartile range of Team B’s scores?
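Each of these questions can be answered from the five values a box plot displays: the range is the maximum minus the minimum, and the interquartile range is the upper quartile minus the lower quartile. A minimal sketch follows. The class name `FiveNumberSummary` and the example values are hypothetical; they are not read from the teams' plots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FiveNumberSummary:
    """The five values shown by a box-and-whisker plot."""
    minimum: float
    q1: float
    median: float
    q3: float
    maximum: float

    @property
    def value_range(self) -> float:
        # Range: distance between the ends of the whiskers.
        return self.maximum - self.minimum

    @property
    def iqr(self) -> float:
        # Interquartile range: width of the box.
        return self.q3 - self.q1


# Hypothetical values for illustration only, not the data for Team A or Team B.
example = FiveNumberSummary(minimum=40, q1=50, median=60, q3=75, maximum=90)
print("median:", example.median)
print("range:", example.value_range)
print("IQR:", example.iqr)
```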
The boxplots summarize results from a medical study. The treatment group received an experimental drug to relieve cold symptoms, and the control group received a placebo. The boxplots show the number of days each group continued to report symptoms.
Which of the following statements are true?
[Box plots: control group and treatment group]
There is an outlier in the treatment group of $16$.
True
False
Only the control group plot is skewed to the right.
True
False
The skew is more prominent in the treatment group.
True
False
In the treatment group, cold symptoms lasted $0$ to $13$ days ($\text{range} = 13$) versus $4$ to $12$ days ($\text{range} = 8$) for the control group.
True
False
It appears that the drug had a positive effect on patient recovery.
True
False
A scientist examined $10$ crickets and $10$ katydids one night, and collected data on how many chirps they made per minute. His observations are presented in the table.
| | $1$ | $2$ | $3$ | $4$ | $5$ | $6$ | $7$ | $8$ | $9$ | $10$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Crickets | $53$ | $48$ | $53$ | $51$ | $51$ | $51$ | $47$ | $53$ | $49$ | $47$ |
| Katydids | $106$ | $106$ | $113$ | $106$ | $112$ | $111$ | $110$ | $109$ | $113$ | $110$ |
Calculate the mean number of chirps per minute made by the crickets. Leave your answer to one decimal place if needed.
Hence, calculate the mean absolute deviation (MAD) of the number of cricket chirps per minute.
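To check a hand calculation, the mean and the mean absolute deviation can be computed directly from the table. The sketch below uses Python's standard `statistics.mean`; the helper `mean_absolute_deviation` is a hypothetical name written for this example, and it takes the MAD to be the average distance of each value from the mean.

```python
from statistics import mean


def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean."""
    center = mean(values)
    return mean([abs(v - center) for v in values])


# Chirps per minute, copied from the table.
crickets = [53, 48, 53, 51, 51, 51, 47, 53, 49, 47]
katydids = [106, 106, 113, 106, 112, 111, 110, 109, 113, 110]

for name, chirps in (("Crickets", crickets), ("Katydids", katydids)):
    print(name, "mean:", round(mean(chirps), 1),
          "MAD:", round(mean_absolute_deviation(chirps), 2))
```

Running the same function on the katydid row is a useful sanity check, since the katydid mean and MAD are stated later in the question.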
Crickets are known for their ability to predict temperature. The air temperature (in degrees Fahrenheit) can be approximated using the formula $T=N+40$, where $N$ is the number of chirps per minute. What temperature is being predicted by this group of crickets?
Katydids can also be used for the same purpose, though the formula converting their chirps per minute to temperature is slightly more complicated: $T=\frac{N+161}{3}$. Calculate the temperature being predicted by the group of katydids, given that the mean and MAD of their chirps are $109.6$ and $2.28$ respectively. Round your answer to one decimal place.
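The two conversion formulas can be written as small functions and applied to each group's mean chirp rate. This is a sketch: the function names are hypothetical, and using the group mean as $N$ is an assumption about how the question intends a "group" prediction to be made.

```python
from statistics import mean


def cricket_temperature(chirps_per_minute):
    """T = N + 40, in degrees Fahrenheit."""
    return chirps_per_minute + 40


def katydid_temperature(chirps_per_minute):
    """T = (N + 161) / 3, in degrees Fahrenheit."""
    return (chirps_per_minute + 161) / 3


# Cricket chirps per minute from the table; the katydid mean is given in the question.
crickets = [53, 48, 53, 51, 51, 51, 47, 53, 49, 47]
katydid_mean = 109.6

cricket_prediction = cricket_temperature(mean(crickets))
katydid_prediction = katydid_temperature(katydid_mean)
print("Cricket prediction (°F):", round(cricket_prediction, 1))
print("Katydid prediction (°F):", round(katydid_prediction, 1))
```

Because the katydid formula divides by $3$, a spread of a given size in katydid chirps corresponds to a smaller spread in predicted temperature than the same spread in cricket chirps would.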
If the actual temperature is $90$°F, did both approximations perform well?
Yes
No
If you were going to use the observation from a single cricket or a single katydid to predict the temperature, which would be better to use according to the MAD of each group?
Katydid
Cricket