7.07 Comparing data sets

Worksheet
Comparing data sets
1

Marge grows two different types of bean plants. She records the number of beans that she picks from each plant for 10 days. Her records are shown below:

  • Plant A: 10,\, 4,\, 4,\, 5,\, 7,\, 10,\, 3,\, 3,\, 9,\, 10

  • Plant B: 8,\, 7,\, 5,\, 5,\, 9,\, 7,\, 8,\, 7,\, 5,\, 6

a

What is the mean number of beans picked per day for Plant A? Round your answer to one decimal place.

b

What is the mean number of beans picked per day for Plant B? Round your answer to one decimal place.

c

What is the range for Plant A?

d

What is the range for Plant B?

e

Which plant produces more beans on average?

f

Which plant has a more consistent yield of beans?
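
Answers to a question like this one can be checked with a short script. The following is a minimal Python sketch, not part of the worksheet; the helper name `mean_and_range` is hypothetical and the lists are copied from the records above.

```python
from statistics import mean

def mean_and_range(values):
    """Return (mean rounded to 1 d.p., range) for a list of numbers."""
    return round(mean(values), 1), max(values) - min(values)

# Daily bean counts copied from the question.
plant_a = [10, 4, 4, 5, 7, 10, 3, 3, 9, 10]
plant_b = [8, 7, 5, 5, 9, 7, 8, 7, 5, 6]

for name, data in (("Plant A", plant_a), ("Plant B", plant_b)):
    m, r = mean_and_range(data)
    print(f"{name}: mean = {m}, range = {r}")
```

A higher mean indicates more beans on average, and a smaller range suggests a more consistent yield.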

2

The residents of two blocks of townhouses were asked how many pets they own. The frequencies of the various responses are presented in the following dot plots:

a

Is pet ownership a little lower or a little higher in Block A than in Block B?

b

In Block A, how many pets do most households have?

c

In Block B, how many pets do most households have?

d

Describe the shape of the data for Block A.

e

Find the range of the number of pets in Block A.

f

Which block has more variability in the number of pets?

g

Does either set of scores have an outlier?

3

The following histograms show the season results of two soccer groups, Group A and Group B, and the number of games (frequency) in which they scored a certain number of goals:

a

Find the mode for Group A.

b

Find the mode for Group B.

c

Find the range for Group A.

d

Find the range for Group B.

e

Which group scored the lowest total number of goals during the season?

f

Which group has the most varied results?

4

Sarah and Georgio completed the same five exams during an exam block. Below are their results:

  • Sarah: 86,\, 83,\, 86,\, 88,\, 98

  • Georgio: 61,\, 83,\, 50,\, 85,\, 83

a

Find Sarah's mean score.

b

Find Georgio's mean score.

c

Find Sarah's standard deviation, correct to two decimal places.

d

Find Georgio's standard deviation, correct to two decimal places.

e

Who performed better in the five exams? Explain your answer.
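
The question does not say whether the population or the sample standard deviation is intended, so the following Python sketch (an illustration only, using the standard library `statistics` module) computes both. Which one matches the expected answer is an assumption to confirm against the course's convention.

```python
from statistics import mean, pstdev, stdev

# Exam results copied from the question.
sarah = [86, 83, 86, 88, 98]
georgio = [61, 83, 50, 85, 83]

for name, scores in (("Sarah", sarah), ("Georgio", georgio)):
    print(name)
    print("  mean:", mean(scores))
    # pstdev divides by n (population); stdev divides by n - 1 (sample).
    print("  population SD:", round(pstdev(scores), 2))
    print("  sample SD:", round(stdev(scores), 2))
```

A higher mean combined with a smaller standard deviation points to results that are both better and more consistent.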

5

The pulse rates of two groups are given below:

  • Group 1: 82,\, 85,\, 88,\, 65,\, 73,\, 89,\, 79,\, 90,\, 76,\, 68,\, 88,\, 65,\, 63,\, 62,\, 88,\, 82

  • Group 2: 75,\, 88,\, 74,\, 73,\, 80,\, 76,\, 67,\, 81,\, 71,\, 83,\, 89,\, 62,\, 63,\, 80,\, 71,\, 78

a

Find the mean pulse rate of Group 1, to two decimal places.

b

Find the mean pulse rate of Group 2, to two decimal places.

c

Find the standard deviation of Group 1, to two decimal places.

d

Find the standard deviation of Group 2, to two decimal places.

e

What is the range for Group 1?

f

What is the range for Group 2?

g

Which group has the greater spread?

6

The beaks of two groups of birds are measured, in \text{mm}, to determine whether they might be of the same species. The measurements are shown below:

  • Group 1: 33,\, 39,\, 31,\, 27,\, 22,\, 37,\, 30,\, 24,\, 24,\, 28

  • Group 2: 29,\, 44,\, 45,\, 34,\, 31,\, 44,\, 44,\, 33,\, 37,\, 34

a

Calculate the range for Group 1.

b

Calculate the range for Group 2.

c

Calculate the mean for Group 1.

d

Calculate the mean for Group 2.

e

Do you think the two groups of birds are the same species? Explain your answer.

7

The median house price in the suburb of Humbleton is \$950\,000 with a mean price of \$1\,000\,000 and the median house price in the suburb of Brockway is \$950\,000 with a mean price of \$880\,000.

Which suburb is more likely to have very expensive houses? Explain your answer.

8

The ages of employees at two competing fast food restaurants on a Saturday night are recorded. Some statistics are given in the following table:

                 | Mean | Median | Range
Berger's Burgers | 18   | 17     | 6
Fry's Fries      | 18   | 19     | 2

a

If the data for Berger's Burgers was represented using a histogram, would it be positively or negatively skewed?

b

Which restaurant has the oldest employee on the night the data is recorded?

c

Which restaurant has the most consistent ages among employees? Explain your answer.

d

Which restaurant has an older workforce? Explain your answer.

9

Two Science classes, each with 20 students, were given a 10 question True/False test. The results for each class are shown below:

a

Calculate the mean for Class 1 correct to one decimal place.

b

Calculate the mean for Class 2 correct to one decimal place.

c

Find the range for Class 1.

d

Find the range for Class 2.

e

Do you think Class 1 studied for their test? Justify your answer.

f

Do you think Class 2 studied for their test? Justify your answer.

10

Two English classes, each with 15 students, sit a 10 question multiple choice test. Their class results, out of 10, are below:

  • Class 1: 3, \,2, \,3, \,3, \,4, \,5, \,1, \,1, \,1, \,4, \,2, \,2, \,3, \,3, \,2

  • Class 2: 8, \,9, \,9, \,8, \,8, \,6, \,8, \,10, \,6, \,8, \,8, \,9, \,6, \,9, \,9

a

Calculate the following (correct to one decimal place where necessary), for Class 1:

i

The mean

ii

The median

iii

The mode

iv

The range

b

Calculate the following (correct to one decimal place where necessary), for Class 2:

i

The mean

ii

The median

iii

The mode

iv

The range

c

Which class was more likely to have studied for their test? Explain your answer.
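
The four summary statistics asked for above can be computed together. This is a minimal Python sketch under the assumption that Python 3.8 or later is available (for `statistics.multimode`); the helper name `summarise` is hypothetical.

```python
from statistics import mean, median, multimode

def summarise(scores):
    """Return the mean (1 d.p.), median, mode(s) and range of a data set."""
    return {
        "mean": round(mean(scores), 1),
        "median": median(scores),
        "mode": multimode(scores),  # a list, so tied modes are all reported
        "range": max(scores) - min(scores),
    }

# Test results copied from the question.
class_1 = [3, 2, 3, 3, 4, 5, 1, 1, 1, 4, 2, 2, 3, 3, 2]
class_2 = [8, 9, 9, 8, 8, 6, 8, 10, 6, 8, 8, 9, 6, 9, 9]

print("Class 1:", summarise(class_1))
print("Class 2:", summarise(class_2))
```

Returning the mode as a list avoids silently picking one value when two scores are equally common.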

11

The hours of sleep per night for two people over a two week period are shown below:

  • Person A: 8, \,5, \,10, \,7, \,9, \,7, \,6, \,10, \,6, \,9, \,7, \,7, \,10, \,5

  • Person B: 8, \,8, \,8, \,7, \,7.5, \,8, \,7.5, \,7, \,7, \,7, \,7.5, \,7, \,7, \,7.5

a

Calculate the following (correct to one decimal place where necessary) for Person A:

i

The mean

ii

The median

iii

The mode

iv

The range

b

Calculate the following (correct to one decimal place where necessary) for Person B:

i

The mean

ii

The median

iii

The mode

iv

The range

c

Which person is the least consistent in their sleep habits? Explain your answer.

d

Which person has the most sleep over the 14 nights? Explain your answer.

12

The salaries of men and women working the same job at the same company are given below:

  • Men: \$80\,000,\, \$80\,000,\, \$75\,000,\, \$80\,000,\, \$75\,000,\, \$70\,000,\, \$80\,000

  • Women: \$70\,000,\, \$70\,000,\, \$75\,000,\, \$70\,000,\, \$70\,000,\, \$80\,000,\, \$75\,000

a

Calculate the following for the men:

i

The mean

ii

The median

iii

The mode

iv

The range

b

Calculate the following for the women:

i

The mean

ii

The median

iii

The mode

iv

The range

c

Who seems to be getting the higher salary, the men or the women? Explain your answer.

Back to back stem and leaf plots
13

The stem and leaf plot shows the number of books read in a year by a random sample of university and high school students:

a

Interpret the lowest score for the University students.

b

Compare the medians of both groups of students.

c

For which student group(s) is the mean greater than the median?

University Students | Stem | High School Students
                  7 |  0   |
              6 6 3 |  1   | 0 0 3 5
            4 3 2 1 |  2   | 1 2 4 4 6
            9 8 8 6 |  3   | 1 8 9
                8 2 |  4   | 0 1
                    |  5   |
                    |  6   |
                  3 |  7   |

Key: 4 \vert 1 \vert 2 = 14 \text{ books and }12\text{ books}

14

The stem and leaf plot shows the batting scores of two cricket teams, A and B:

a

Find the median score of Team A.

b

Find the median score of Team B.

c

Find the range of Team A’s scores.

d

Find the range of Team B’s scores.

e

Find the interquartile range of Team A’s scores.

f

Find the interquartile range of Team B’s scores.

   Team A | Stem | Team B
    7 6 2 |  6   | 2 6 8
8 6 6 5 2 |  7   | 1 5 7
      8 4 |  8   | 1 4 7 9
          |  9   | 4 7

Key: 6 \vert 1 \vert 2 = 16 \text{ and } 12

15

The back to back stem and leaf plot shows the amount of cash (in dollars) carried by a random sample of teenage boys and girls:

a

Which group carried more cash?

b

Find the median amount of cash that the boys carried.

c

Find the median amount of cash that the girls carried.

d

Which group's distribution is roughly bell shaped?

e

Which group has more variation in the amounts of cash?

f

Were there any outliers in the boys' amounts? If so, what are the value(s)?

g

Were there any outliers in the girls' amounts? If so, what are the value(s)?

       Boys | Stem | Girls
          7 |  0   |
          1 |  1   | 1
      5 4 1 |  2   | 2 6 8
      8 5 4 |  3   | 3 4 4 6 6 8 9
9 8 2 2 2 1 |  4   | 3 4 6
    9 7 4 3 |  5   | 4
      8 5 2 |  6   |
        3 1 |  7   |

Key: 1 \vert 2 \vert 2 = \$21 \text{ and } \$22

16

The following back to back stem and leaf plot shows the length (in minutes) of a random sample of phone calls made by Sharon and Tricia:

a

Who made a 14 minute phone call?

b

Who has the higher median?

c

Is Sharon's mean greater than her median?

d

Is Tricia's mean greater than her median?

   Sharon | Stem | Tricia
        3 |  1   | 3 4
7 6 4 3 2 |  2   | 6 7 8
      9 8 |  3   | 2 4
      4 3 |  4   | 1 2
      7 6 |  5   | 6 7 8

Key: 2 \vert 2 \vert 6 = 22 \text{ and } 26

17

The back to back stem and leaf plot shows the number of pieces of paper used over several days by Charlie’s and Dylan’s students:

a

Did Charlie's students use 7 pieces of paper on any day?

b

Whose class had the higher median?

c

Is the median greater than the mean in both groups?

Charlie's students | Stem | Dylan's students
                 7 |  0   | 7
             3 2 1 |  1   | 3
                 8 |  2   | 8
             4 3 2 |  3   | 3 4
                 9 |  4   | 5 6 7
                 2 |  5   | 2 3

Key: 1 \vert 1 \vert 3 = 11 \text{ and } 13

18

The back to back stem and leaf plot shows the number of desserts ordered at Hotel A and Hotel B over several randomly chosen days:

a

Interpret the lowest score for Hotel A.

b

Which hotel's median is higher?

c

Is the mean greater than the median in both groups?

Hotel A | Stem | Hotel B
      3 |  0   |
  4 3 2 |  1   | 3 4
    7 6 |  2   | 7
    4 3 |  3   | 3 4
      6 |  4   | 6 7
      2 |  5   | 2 3 4

Key: 2 \vert 1 \vert 3 = 12 \text{ and }13

19

The weights (in kilograms) of a group of men and a group of women were recorded and presented in a back to back stem and leaf plot as shown:

                Men | Stem | Women
                    |  5   | 0 1 2 3 4 4 4 5 5 5 7
  9 8 8 7 6 6 6 5 3 |  6   | 0 2 2 3 4 7 7 8
6 4 3 2 2 1 0 0 0 0 |  7   | 0
                  0 |  8   |

Key: 3 \vert 6 \vert 0 = 63 \text{ and } 60

a

Find the mean weight of the group of men.

b

Find the mean weight of the group of women.

c

Which group is heavier overall? Explain your answer.
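
To find a mean from a back to back stem and leaf plot, each leaf is first combined with its stem to recover the original value. The Python sketch below illustrates this; the helper name `expand` is hypothetical, and the leaves are transcribed from the plot as read with the key above, so they are worth checking against the original figure.

```python
from statistics import mean

def expand(stem_to_leaves):
    """Turn {stem: [leaves]} into a flat list of values (10 * stem + leaf)."""
    return [10 * stem + leaf
            for stem, leaves in stem_to_leaves.items()
            for leaf in leaves]

# Leaves listed in increasing order for each stem (transcribed from the plot).
men = expand({6: [3, 5, 6, 6, 6, 7, 8, 8, 9],
              7: [0, 0, 0, 0, 1, 2, 2, 3, 4, 6],
              8: [0]})
women = expand({5: [0, 1, 2, 3, 4, 4, 4, 5, 5, 5, 7],
                6: [0, 2, 2, 3, 4, 7, 7, 8],
                7: [0]})

print("Men:", len(men), "values, mean", mean(men))
print("Women:", len(women), "values, mean", mean(women))
```

Printing the number of values gives a quick check that no leaf was missed when reading the plot.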

Comparing histograms and box plots
20

Construct a box plot for each of the following histograms:

a
b
c
d
e
f
21

Match the histograms on the left to the corresponding box plots on the right:

[Figure: four histograms labelled Histogram A, Histogram B, Histogram C and Histogram D, and four box plots labelled Box Plot 1 (axis from 0 to 10), Box Plot 2 (axis from 0 to 10), Box Plot 3 (axis from 1 to 9) and Box Plot 4 (axis from 0 to 10)]

22

State whether the following pairs of histograms and box plots match with respect to their shape:

a
b
c
d
e
f
23

Explain why the following pairs of histograms and box plots do not match:

a
b
Comparing parallel box plots
24

The test scores of 11 students in Drama and German are listed below.

  • Drama: \,75,\, 85,\, 62,\, 65,\, 52,\, 76,\, 89,\, 83,\, 55,\, 91,\, 77

  • German: \,82,\, 86,\, 76,\, 84,\, 64,\, 73,\, 89,\, 62,\, 54,\, 69,\, 78

Construct parallel box plots to represent both data sets.

25

The following box plots show the number of points scored by two basketball teams in each of their matches:

[Figure: box plots for Team A and Team B, each drawn on an axis from 30 to 70 marked in steps of 2]

a

What is the median score of Team A?

b

What is the median score of Team B?

c

What is the range of Team A’s scores?

d

What is the range of Team B’s scores?

e

What is the interquartile range of Team A’s scores?

f

What is the interquartile range of Team B’s scores?

26

Cooper and Marion are racing go-karts. The times (in seconds) for the 12 laps of their qualifying race are shown below:

  • Cooper: \,58.9,\, 46.5,\, 52.6,\, 66.6,\, 58.4,\, 53.1,\, 45.0,\, 52.1,\, 52.4,\, 52.7,\, 44.8,\, 51.7
  • Marion: \, 47.8,\, 54.6,\, 68.5,\, 68.0,\, 62.8,\, 57.2,\, 54.8,\, 63.4,\, 58.1,\, 64.3,\, 66.2,\, 47.1
a

Construct the five-number summary for each set.

b

Identify any outliers and use statistical calculations to justify your answer.

c

Create a parallel box plot of the two sets of times with the outlier(s) displayed separately.

d

Which racer will be in pole position for the final race, if it is given to the racer with the fastest qualifying lap time?

e

Does spinning out on a lap, causing a high outlier, impact the selection for pole position? Explain your answer.
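
The five-number summary and outlier check can be sketched in Python as follows. Two assumptions are made that the worksheet does not state: quartiles are taken as the medians of the lower and upper halves of the sorted data (excluding the middle value when the count is odd), and a value is treated as an outlier when it lies more than 1.5 × IQR below Q_1 or above Q_3. The function names are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so both halves have equal size.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return data[0], median(lower), median(data), median(upper), data[-1]

def outliers(values):
    """Return the values outside the 1.5 * IQR fences."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low or x > high]

# Lap times (in seconds) copied from the question.
cooper = [58.9, 46.5, 52.6, 66.6, 58.4, 53.1, 45.0, 52.1, 52.4, 52.7, 44.8, 51.7]
marion = [47.8, 54.6, 68.5, 68.0, 62.8, 57.2, 54.8, 63.4, 58.1, 64.3, 66.2, 47.1]

for name, laps in (("Cooper", cooper), ("Marion", marion)):
    print(name, five_number_summary(laps), "outliers:", outliers(laps))
```

Other quartile conventions (for example, interpolation-based methods used by some calculators and software) can give slightly different values for Q_1 and Q_3, so the fences may shift a little.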

27

Two friends compete in hammer throw competitions and train together over a season. They compete in 15 competitions and their final throw for each competition is shown below:

  • Tim: \,29.8,\, 37.4,\, 33.9,\, 38.8,\, 34.3,\, 36.5,\, 34.5,\, 30.0,\, 35.2,\, 38.4,\, 33.0,\, 33.2,\, 39.6,\, 35.0,\, 36.9
  • Odi: \,32.2,\, 35.4,\, 34.8,\, 33.0,\, 38.4,\, 26.0,\, 40.0,\, 37.2,\, 39.5,\, 42.4,\, 38.6,\, 42.3,\, 38.4,\, 42.8,\, 37.2
a

Complete the following table of statistics:

                                   | Tim  | Odi
Minimum                            | 29.8 |
Q_1                                | 33.2 |
Median                             | 35.0 |
Q_3                                | 37.4 |
Maximum                            | 39.6 |
Mean (1 d.p.)                      |      | 37.2
Sample standard deviation (2 d.p.) | 2.93 |
Range                              |      |
Interquartile range                |      |

b

Which competitor throws more consistently? Explain your answer.

c

Identify any outliers and use statistical calculations to justify your answer.

d

Create a parallel box plot of the two sets of data with the outlier(s) displayed separately.

e

Who is the better hammer thrower? Explain your answer.

f

When considering Odi's average throw, is it reasonable to remove the outlier before calculating the mean? Explain your answer.
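
To see how much a single extreme value affects a mean, the mean can be computed with and without it. The Python sketch below does this for Odi's throws, assuming that 26.0 is the value flagged as a low outlier; that choice is an assumption to be justified with the calculations in part (c).

```python
from statistics import mean

# Odi's throws copied from the question.
odi = [32.2, 35.4, 34.8, 33.0, 38.4, 26.0, 40.0, 37.2,
       39.5, 42.4, 38.6, 42.3, 38.4, 42.8, 37.2]

suspected_outlier = 26.0  # assumption: the value identified as an outlier
without_outlier = [x for x in odi if x != suspected_outlier]

print("Mean of all throws:", round(mean(odi), 1))
print("Mean without the suspected outlier:", round(mean(without_outlier), 1))
```

Whether removing the value is reasonable depends on the context, for example whether the throw reflects a genuine performance or an unusual event.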

28

Two groups of size twelve take a test to assess their reaction time. The participants clicked a button as soon as they heard a sound which was played at random intervals. The reaction time in milliseconds of each participant is shown below:

  • Group A: \,220,\, 210,\, 220,\, 215,\, 180,\, 185,\, 190,\, 190,\, 195,\, 190,\, 195,\, 195
  • Group B: \,210,\, 170,\, 200,\, 170,\, 190,\, 210,\, 180,\, 200,\, 180,\, 210,\, 190,\, 190
a

Complete the following table of statistics:

                                   | Group A | Group B
Minimum                            | 180     |
Q_1                                | 190     |
Median                             | 195     | 190
Q_3                                | 212.5   |
Maximum                            | 220     |
Mean (2 d.p.)                      | 198.75  |
Sample standard deviation (1 d.p.) |         | 14.7
Range                              |         |
Interquartile range                |         |

b

Which group had more consistent reaction times?

c

Construct a parallel box plot showing the reaction times of Group A and Group B.

d

What can we conclude from the value of Group B's first quartile?

e

Using the box plot and table of statistics in part (a), which group generally has the faster reaction times?

f

If Group A represents a number of 16-year-old males and Group B represents a number of 16-year-old females, state a valid conclusion from this data.

29

The following boxplots summarize results from a medical study. The treatment group received an experimental drug to relieve cold symptoms, and the control group received a placebo. The boxplots show the number of days each group continued to report symptoms:

[Figure: box plots for the control group and the treatment group, each drawn on an axis from 0 to 20 marked in steps of 2]

a

Describe the shape of the data from the control group.

b

Describe the shape of the data from the treatment group.

c

Does the drug have a positive effect on patient recovery? Explain your answer.

30

The box plots drawn below show the number of repetitions of a 70\text{ kg} bar that Weightlifter A and Weightlifter B can lift. They both record their repetitions over 30 days:

a

Which weightlifter has the more consistent results? Explain your answer.

b

Which weightlifter can do the most repetitions of the 70\text{ kg} bar? Explain your answer.

Outcomes

2.1.11

compare groups on a single numerical variable using medians, means, IQRs, ranges or standard deviations, as appropriate; interpret the differences observed in the context of the data and report the findings in a systematic and concise manner
