
4.09 Compare data sets

Worksheet
Compare data sets
1

The numbers of goals scored by Team 1 and Team 2 in a football tournament are recorded in the following table:

a

Find the total number of goals scored by both teams in Match C.

b

Find the total number of goals scored by Team 1 across all the matches.

c

Find the mean number of goals scored by Team 1.

d

Find the mean number of goals scored by Team 2.

| Match | Team 1 | Team 2 |
|---|---|---|
| A | 2 | 5 |
| B | 4 | 2 |
| C | 5 | 1 |
| D | 3 | 5 |
| E | 2 | 3 |

2

The beaks of two groups of birds are measured, in millimetres, to determine whether they might be of the same species. The measurements are shown below:

  • Group 1: \,33,\, 39,\, 31,\, 27,\, 22,\, 37,\, 30,\, 24,\, 24,\, 28

  • Group 2: 29,\, 44,\, 45,\, 34,\, 31,\, 44,\, 44,\, 33,\, 37,\, 34

a

Complete the following table:

b

Do you think the two groups of birds are the same species? Explain your answer.

| | Mean | Range |
|---|---|---|
| Group 1 | | |
| Group 2 | | |

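One way to check the table in part (a) is the short Python sketch below. It assumes the mean is the sum of the values divided by how many there are, and the range is the largest value minus the smallest. The helper name `mean_and_range` is made up for this illustration.

```python
from statistics import mean

# Beak lengths in millimetres, copied from the two lists in the question.
group_1 = [33, 39, 31, 27, 22, 37, 30, 24, 24, 28]
group_2 = [29, 44, 45, 34, 31, 44, 44, 33, 37, 34]


def mean_and_range(values):
    """Return (mean, range), where range = largest value - smallest value."""
    return mean(values), max(values) - min(values)


for label, data in (("Group 1", group_1), ("Group 2", group_2)):
    centre, spread = mean_and_range(data)
    print(f"{label}: mean = {centre}, range = {spread}")
```

Comparing the two means alongside the two ranges is one reasonable starting point for part (b); the code only reports the numbers and does not decide the answer.
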
3

Marge grows two different types of bean plants. She records the number of beans that she picks from each plant for 10 days. Her records are shown below:

  • Plant A: \,10,\, 4,\, 4,\, 5,\, 7,\, 10,\, 3,\, 3,\, 9,\, 10

  • Plant B: \,8,\, 7,\, 5,\, 5,\, 9,\, 7,\, 8,\, 7,\, 5,\, 6

a

Complete the following table:

b

Which plant produces more beans on average?

c

Which plant has a more consistent yield of beans?

| | Mean | Range |
|---|---|---|
| Plant A | | |
| Plant B | | |

4

The following table shows the scores of Student A and Student B in five separate tests:

a

Find the mean test score of each student.

b

What is the combined mean test score of the two students?

c

What was the highest score achieved and which student obtained that score?

d

What was the lowest score achieved and which student obtained that score?

e

Find the sample standard deviation of each student's scores. Round your answers to one decimal place.

f

Which student had more consistent test scores?

| Test | Student A | Student B |
|---|---|---|
| 1 | 74 | 79 |
| 2 | 89 | 97 |
| 3 | 79 | 93 |
| 4 | 99 | 87 |
| 5 | 86 | 71 |

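The sample standard deviation in part (e) can be checked with Python's standard `statistics` module, as in the sketch below. `statistics.stdev` divides by n - 1, which is the sample version the question asks for. The list names are chosen here for illustration.

```python
from statistics import mean, stdev

# Test scores read from the table, in test order 1 to 5.
student_a = [74, 89, 79, 99, 86]
student_b = [79, 97, 93, 87, 71]

for label, scores in (("Student A", student_a), ("Student B", student_b)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{label}: mean = {mean(scores)}, sample sd = {round(stdev(scores), 1)}")
```

A smaller standard deviation means the scores sit closer to the mean, which is the usual way to judge consistency in part (f).
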
5

The salaries of men and women working the same job at the same company are given below:

  • Men: \$80\,000,\, \$80\,000,\, \$75\,000,\, \$80\,000,\, \$75\,000,\, \$70\,000,\, \$80\,000

  • Women: \$70\,000,\, \$70\,000,\, \$75\,000,\, \$70\,000,\, \$70\,000,\, \$80\,000,\, \$75\,000

a

Do the data sets have repeated values?

b

Do the data sets have outliers?

c

Which measure of centre is most appropriate to determine which gender has the higher salary?

d

Find the measure of centre from part (c) for the salaries of the men.

e

Find the measure of centre from part (c) for the salaries of the women.

f

Who seems to be getting the higher salary, the men or the women?

6

Two English classes, each with 15 students, sit a 10 question multiple choice test. Their class results, out of 10, are below:

  • Class 1: \,3,\, 2,\, 3,\, 3,\, 4,\, 5,\, 1,\, 1,\, 1,\, 4,\, 2,\, 2,\, 3,\, 3,\, 2

  • Class 2: \,8, \,9, \,9, \,8, \,8, \,6,\, 8,\, 10,\, 6,\, 8,\, 8,\, 9,\, 6,\, 9,\, 9

a

Find the mode for class 1.

b

Find the modes for class 2.

c

Which class was more likely to have studied for their test?

7

The hours of sleep per night for two people over a two week period are shown below:

  • Person A: 8,\, 5,\, 10,\, 7,\, 9,\, 7,\, 6,\, 10,\, 6,\, 9,\, 7,\, 7,\, 10,\, 5

  • Person B: 8,\, 8,\, 8,\, 7,\, 7.5,\, 8,\, 7.5,\, 7,\, 7,\, 7,\, 7.5,\, 7,\, 7,\, 7.5

a

Find the range for Person A.

b

Find the range for Person B.

c

Which person is the least consistent in their sleep habits?

8

The residents of two blocks of townhouses were asked how many pets they own. The frequencies of the responses are presented in the following dot plots:

a

Is the pet ownership a little lower or higher in Block A than Block B?

b

In Block A, how many pets do most households have?

c

In Block B, how many pets do most households have?

d

Describe the shape of the data for Block A.

e

Find the range of the number of pets in Block A.

f

Which block has more variability in the number of pets?

g

Does either data set have an outlier?

9

Consider the following graph:

a

Which city has the highest average daily sunshine hours?

b

Which city has the month with the least average sunshine hours?

c

Which city has the greatest variation in sunshine hours over the year?

10

Consider the following side-by-side column graphs which compare the perceived and actual number of immigrants for 9 countries:

a

Which country has the greatest difference between actual and perceived number of immigrants?

b

Which country has the smallest difference between actual and perceived number of immigrants?

c

Which data set (perceived or actual) has the biggest range?

11

Consider the column graph that shows the records of a stationery shop on the number of pens and notebooks sold in one week:

a

Find the number of pens sold on Tuesday.

b

Find the number of notebooks sold on Friday.

c

Find the percentage of pens sold on Thursday correct to two decimal places.

d

Find the total number of pens and notebooks sold during the week.

e

Find the average number of pens and notebooks sold per day.

f

Which is the better selling product?

12

Consider the column graph that shows the number of blood donations per month in a given year:

a

Which state had the most donations?

b

Both states show a period with a lower number of donations due to cold and flu symptoms preventing donors being eligible. Which period is this?

A

Autumn months

B

Summer months

C

Spring months

D

Winter months

c

Which month had the highest monthly donations?

d

If the total donated over a year across all states is 763\,542 units and 2\% is used for trauma and road accidents, how many units of blood are required in a year for these incidents?

13

Consider the column graph that shows the distribution of blood types in Australia, Egypt and the world, as of 2019:

a

Which blood type is most common?

b

Which blood type is the rarest?

c

For which blood types is Australia's proportion significantly lower than that of the general world population?

d

Consider the blood type distributions given below:

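| | O+ | A+ | B+ | AB+ | O- | A- | B- | AB- |
|---|---|---|---|---|---|---|---|---|
| Australia | 40\% | 31\% | 8\% | 2\% | 9\% | 7\% | 2\% | 1\% |
| Egypt | 52\% | 24\% | 12.4\% | 3.8\% | 5\% | 2\% | 0.6\% | 0.2\% |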

Find the largest difference between Egypt and Australia in the percentage of people with a particular blood type.

e

If, at the time that this data was collected, Australia had a population of 24\,642\,693 and Egypt had a population of 95\,220\,838, find the difference in the number of people with the blood type in part (d). Round your answer to the nearest whole number.

14

Consider the back-to-back column graph below which compares the ages of male and female students in a primary school:

a

Which distribution is positively skewed?

b
Which distribution has the highest mode?
c

Which distribution has the highest mean?

15

Consider the back-to-back histogram below which compares the unemployment rates of Texas and California over a 30-year period:

a
Which state had the highest range of unemployment rates?
b

Which state had rates most strongly skewed in a positive direction?

c

Which state had the highest modal rate?

d

Which state has a more symmetrical distribution of unemployment rates?

16

Consider the back-to-back histogram below which compares the grades scored by females and males on a mathematics examination:

a

Which gender had the biggest range in scores?

b

Which gender was bi-modal?

c

Which gender had more students?

d

Which gender had more symmetrical results?

e

Which gender was more strongly skewed positively?

17

Consider the back-to-back histograms which compare the heights of female and male standard poodles:

a

Which data set has the highest range?

b

Which data set has the highest mean?

c

Which set is more positively skewed?

18

Consider the two histograms below which show the grade distributions in two university courses:

a

Which distribution has a clear mode?

b

Which distribution would be described as “uniform”?

c

Which distribution has the highest range?

d

Compare the mean and median of Section 1.

e

Which distribution would have the highest standard deviation?

19
Consider the back-to-back histograms below which compare the populations of males and females in Australia in 2017:
a
Is the data for both genders skewed negatively or positively?
b

State the modal class for males.

c

State the modal class for females.

d

Which gender has the highest numbers over 80 years old?

20

Consider the following statistics on road accident fatalities over a 10-year period in Australia:

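| | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 |
|---|---|---|---|---|---|---|---|---|---|---|
| Male | 1081 | 982 | 920 | 931 | 852 | 819 | 866 | 956 | 898 | 845 |
| Female | 407 | 370 | 355 | 369 | 334 | 331 | 338 | 337 | 325 | 294 |
| Total | 1488 | 1352 | 1275 | 1300 | 1186 | 1150 | 1204 | 1293 | 1223 | 1139 |
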
a

Find the mean number of fatalities per year for males and the mean number for females.

b

Find the range of fatalities per year for males and the range for females.

c

In 2018, what percentage of road fatalities were male, and what percentage were female? Round your answers to one decimal place.

d

In 2018, 15.1\% of deaths were males in the age group 17 - 25. How many fatalities were there in this age group? Round your answer to the nearest whole number.

e

In 2016, 31\% of fatalities involved speeding. How many fatalities involved speeding? Round your answer to the nearest whole number.

f

In 2016, 19\% of fatalities involved alcohol. How many fatalities involved alcohol? Round your answer to the nearest whole number.
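Parts (a) to (c) combine three routine calculations: a mean, a range, and a share of a total expressed as a percentage. The Python sketch below shows one way to check them from the table. The helper names `value_range` and `percentage_share` are invented for this example.

```python
from statistics import mean

# Yearly road fatalities read from the table, 2009 to 2018.
years = list(range(2009, 2019))
male = [1081, 982, 920, 931, 852, 819, 866, 956, 898, 845]
female = [407, 370, 355, 369, 334, 331, 338, 337, 325, 294]


def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


def percentage_share(part, whole):
    """Share of the whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("Mean per year:", mean(male), "(male),", mean(female), "(female)")
print("Range:", value_range(male), "(male),", value_range(female), "(female)")

# Percentage split for 2018, using the male and female counts for that year.
i = years.index(2018)
total_2018 = male[i] + female[i]
print("2018 male share:", percentage_share(male[i], total_2018), "%")
print("2018 female share:", percentage_share(female[i], total_2018), "%")
```

The same `percentage_share` idea, run in reverse (percentage times total divided by 100), covers parts (d) to (f).
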

21

Consider the following statistics on vehicle theft across each state in 2016 and 2017:

| State | 2016 | 2017 | 2016 thefts per 10\,000 people | 2017 thefts per 10\,000 people |
|---|---|---|---|---|
| NSW | 11\,909 | 12\,216 | 16.3 | 16.7 |
| VIC | 19\,572 | 15\,332 | 34.7 | 27.2 |
| QLD | 10\,117 | 11\,125 | 22.0 | 24.2 |
| WA | 8682 | 7643 | 36.7 | 32.3 |
| SA | 3423 | 2942 | 20.6 | 17.7 |
| TAS | 1198 | 1298 | 23.4 | 25.4 |
| ACT | 939 | 1321 | 25.6 | 36.0 |
| NT | 1087 | 981 | 47.0 | 42.4 |
| Total: | 56\,927 | 52\,858 | | |

a

Find the mean number of vehicles stolen per state in 2016 and 2017.

b

What percentage did thefts decrease by from 2016 to 2017? Round your answer to one decimal place.

c

Which state saw the highest increase in vehicle theft?

d

What state represents the highest number of vehicle thefts per capita?

e

In 2017, cars represented 80.58\% of vehicles stolen, and 4 in 5 were recovered. How many cars were recovered in 2017? Round your answer to the nearest whole number.

f

In 2017, motorcycles represented 15.21\% of vehicles stolen, and 53\% were not recovered. How many motorcycles were recovered in 2017? Round your answer to the nearest whole number.
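For parts (a) and (b), the totals row of the table is enough. The sketch below assumes the usual definition of percentage change, 100 × (new − old) ÷ old, so a decrease comes out negative. The function names are hypothetical and only used here.

```python
# Totals read from the last row of the table; the table lists 8 states and territories.
total_2016 = 56_927
total_2017 = 52_858
number_of_states = 8


def mean_per_state(total, count):
    """Mean = total number of thefts divided by the number of states."""
    return total / count


def percentage_change(old, new):
    """Percentage change from old to new; negative values are decreases."""
    return 100 * (new - old) / old


print("Mean per state 2016:", mean_per_state(total_2016, number_of_states))
print("Mean per state 2017:", mean_per_state(total_2017, number_of_states))
print("Change 2016 to 2017:", round(percentage_change(total_2016, total_2017), 1), "%")
```
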

22

The following table shows the number of units of blood donated per week over 10 weeks in two states:

| VIC | 3986 | 3738 | 3949 | 3909 | 4130 | 4079 | 3894 | 4079 | 3711 | 3871 |
| SA | 1355 | 1213 | 1275 | 1397 | 1181 | 1252 | 1372 | 1247 | 1501 | 1175 |

a

What was the average number of units donated per week in Victoria and South Australia?

b

Using the average units donated per week for each state, what is an estimate for the number of units collected in Victoria and South Australia over a year? Assume that there are 52 weeks in a year.

c

A rough estimate of the units of blood currently required annually per state is 3\% of the population. If the population of Victoria is 5\,640\,900 and the population of South Australia is 1\,659\,800, will each state meet the required number of blood donations?

d

One study suggests that demand for blood is expected to increase by 25\% in four years’ time due to factors such as prolonged ageing. How many units of blood annually will Victoria require in four years’ time?

e

How many more units of blood donated per week would Victoria require on average to sustain this demand in four years’ time? Assume that there are 52 weeks in a year. Round your answer to the nearest whole number.
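Parts (a) to (c) chain together a weekly mean, a scaling to 52 weeks, and a comparison with 3% of the population. A minimal Python sketch of that chain follows. The function names `annual_estimate` and `annual_requirement` are assumptions made for this illustration.

```python
from statistics import mean

WEEKS_PER_YEAR = 52  # as stated in the question

# Weekly units donated over the 10 weeks, read from the table.
vic = [3986, 3738, 3949, 3909, 4130, 4079, 3894, 4079, 3711, 3871]
sa = [1355, 1213, 1275, 1397, 1181, 1252, 1372, 1247, 1501, 1175]
population = {"VIC": 5_640_900, "SA": 1_659_800}


def annual_estimate(weekly_units):
    """Estimate a year's donations as the weekly mean times 52."""
    return mean(weekly_units) * WEEKS_PER_YEAR


def annual_requirement(people, percent=3):
    """Units required per year, taken as a percentage of the population."""
    return people * percent / 100


for state, weekly in (("VIC", vic), ("SA", sa)):
    collected = annual_estimate(weekly)
    required = annual_requirement(population[state])
    print(f"{state}: collected about {collected:.1f}, required {required:.0f}, "
          f"meets requirement: {collected >= required}")
```

For parts (d) and (e), the required figure can be scaled by 1.25 and the shortfall divided by 52 in the same style.
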

Back-to-back stem plots and parallel box plots
23

The test scores of twelve students in Music and French are listed below:

  • Music: \, 79,\, 59,\, 74,\, 94,\, 51,\, 71,\, 93,\, 84,\, 69,\, 61,\, 86,\, 86

  • French: \, 62,\, 71,\, 64,\, 82,\, 83,\, 99,\, 87,\, 89,\, 66,\, 73,\, 59,\, 76

Display the data in a back-to-back stem plot.

24

The weights (in kilograms) of two groups, A and B, were recorded in a stem plot as shown:

a

Find the mean weight of Group A.

b

Find the mean weight of Group B.

c

Which group contains individuals that are generally heavier?

d

Calculate the sample standard deviation for Group A correct to one decimal place.

e

Calculate the sample standard deviation for Group B correct to one decimal place.

f

Which group had more consistent weights?

| Group A | Stem | Group B |
|---:|:---:|:---|
| | 5 | 0 1 1 2 3 |
| 7 6 5 3 0 | 6 | 0 0 2 3 |
| 2 2 2 1 0 | 7 | 0 |

Key: 1|8|3 = 81 \text{ and } 83
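A back-to-back stem plot has to be turned back into plain numbers before any statistics can be calculated. The sketch below shows one way to do that in Python: each value is 10 × stem + leaf, with Group A's leaves on the left and Group B's on the right. The row layout and the `decode` helper are assumptions made for this example, based on reading the plot as stems 5, 6 and 7.

```python
from statistics import mean, stdev

# Each row: (Group A leaves, stem, Group B leaves), read from the plot.
rows = [
    ([], 5, [0, 1, 1, 2, 3]),
    ([7, 6, 5, 3, 0], 6, [0, 0, 2, 3]),
    ([2, 2, 2, 1, 0], 7, [0]),
]


def decode(plot_rows):
    """Rebuild both data sets from a back-to-back stem plot (value = 10 * stem + leaf)."""
    left, right = [], []
    for left_leaves, stem, right_leaves in plot_rows:
        left += [10 * stem + leaf for leaf in left_leaves]
        right += [10 * stem + leaf for leaf in right_leaves]
    return sorted(left), sorted(right)


group_a, group_b = decode(rows)
print("Group A:", group_a, "mean:", mean(group_a), "sd:", round(stdev(group_a), 1))
print("Group B:", group_b, "mean:", mean(group_b), "sd:", round(stdev(group_b), 1))
```

The same decoding step applies to every back-to-back stem plot in this section; only the rows change.
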

25

The Cancer Council surveyed 60 random people, asking them approximately how many hours they spent in the sun in the last month. The respondents were split into two groups, tourists and local residents, and the results are shown below:

a

What is the median number of hours that each group spent in the sun?

b

If the two groups were combined, what would be the median number of hours spent in the sun?

c

If the two groups were combined, what would be the range of responses?

| Tourists | Stem | Locals |
|---:|:---:|:---|
| 9 8 7 7 6 5 4 4 3 1 1 | 1 | 2 4 4 5 5 5 6 8 9 |
| 9 5 1 0 | 2 | 2 3 5 9 |
| 9 7 6 1 | 3 | 0 2 5 |
| 7 6 6 5 4 3 1 | 4 | 0 0 2 3 6 6 7 8 |
| 9 6 3 0 | 5 | 2 2 3 5 8 9 |

\text{Key: }6 \vert 1 \vert 2 = 12 \text{ and } 16

26

The following stem-and-leaf plot shows the length (in minutes) of a random sample of phone calls made by Sharon and Tricia:

a

Who made a 14-minute phone call?

b

Who has the higher median?

c

Is Sharon's mean greater than her median?

d

Is Tricia's mean greater than her median?

| Sharon | Stem | Tricia |
|---:|:---:|:---|
| 3 | 1 | 3 4 |
| 7 6 4 3 2 | 2 | 6 7 8 |
| 9 8 | 3 | 2 4 |
| 4 3 | 4 | 1 2 |
| 7 6 | 5 | 6 7 8 |

Key: 2 \vert 2 \vert 6 = 22 \text{ and } 26

27

The back-to-back stem-and-leaf plot shows the number of desserts ordered at Hotel A and Hotel B over several randomly chosen days:

a

Interpret the lowest score for Hotel A.

b

Which hotel's median is higher?

c

Is the mean greater than the median in both groups?

| Hotel A | Stem | Hotel B |
|---:|:---:|:---|
| 3 | 0 | |
| 4 3 2 | 1 | 3 4 |
| 7 6 | 2 | 7 |
| 4 3 | 3 | 3 4 |
| 6 | 4 | 6 7 |
| 2 | 5 | 2 3 4 |

Key: 2 \vert 1 \vert 3 = 12 \text{ and }13

28

The stem-and-leaf plot shows the batting scores of two cricket teams, A and B:

a

What is the highest score in Team A?

b

What is the highest score in Team B?

c

Find the mean score of Team A.

| A | Stem | B |
|---:|:---:|:---|
| 5 2 | 3 | 2 3 5 7 9 |
| 9 8 5 4 2 1 | 4 | 2 9 |
| 8 2 | 5 | 3 6 |
| | 6 | 4 |

Key: 6 | 1 | 2 = 12 \text{ and } 16

29

The data below shows the results of a survey conducted on the price of concert tickets locally and the price of the same concerts at an international venue:

a

What was the most expensive ticket price at the international venue?

b

What was the median ticket price at the international venue?

c

What percentage of local ticket prices were cheaper than the international median?

d

At the international venue, what percentage of tickets cost between \$90 and \$110 (inclusive)?

e

At the local venue, what percentage of tickets cost between \$90 and \$100 (inclusive)?

| Local | Stem | International |
|---:|:---:|:---|
| 7 5 2 2 | 6 | 0 5 |
| 9 6 5 4 0 | 7 | 2 3 8 8 |
| 9 6 5 3 0 | 8 | 2 3 7 8 |
| 8 7 4 3 1 | 9 | 0 1 6 7 9 |
| 5 | 10 | 0 2 3 5 8 |

Key: 6|1|2 = \$16 \text{ and }\$12

30

The stem-and-leaf plot shows the test scores of two Year 11 classes, A and B:

a

Find the highest score in Class A.

b

Find the highest score in Class B.

c

Find the mean score of Class A, to two decimal places.

d

Find the mean score of Class B, to two decimal places.

e

Calculate the overall mean of all of the Year 11 students, to two decimal places.

| Class A | Stem | Class B |
|---:|:---:|:---|
| 8 3 0 | 6 | 2 4 6 |
| 9 7 6 3 1 | 7 | 3 5 8 |
| 8 2 | 8 | 1 3 6 8 |
| | 9 | 2 5 |

Key: 6 | 1 | 2 = 12 \text{ and } 16

31

The stem-and-leaf plot shows the batting scores of two cricket teams, A and B:

a

Find the median score of Team A and the median for Team B.

b

Find the range of scores for Team A and the range for Team B.

c

Find the interquartile range for Team A and also for Team B.

d

Find the sample standard deviation for Team A and also for Team B, to two decimal places.

e

Which team had more varied scores?

| Team A | Stem | Team B |
|---:|:---:|:---|
| 7 6 2 | 6 | 2 6 8 |
| 8 6 5 2 2 | 7 | 1 5 7 |
| 8 4 | 8 | 1 4 7 9 |
| | 9 | 4 7 |

Key: 6|1|2 = 12 \text{ and } 16

32

The number of vehicles sold by two companies each week from a dealership over three months was recorded in the back-to-back stem plot:

a

Find the five number summary for the weekly number of vehicles sold over these three months by Company A.

b

Find the five number summary for the weekly number of vehicles sold over these three months by Company B.

c

Draw parallel box plots for this data.

| Company A | Stem | Company B |
|---:|:---:|:---|
| 5 0 | 0 | 3 9 |
| 8 7 4 1 1 0 | 1 | 0 2 2 2 3 7 |
| 9 2 1 0 | 2 | 0 1 1 7 |
| 9 | 3 | 1 |

Key: 2 \vert 1 \vert 0 = 12 \text{ and }10

33

The data below shows the results of a survey conducted on the price of concert tickets locally and the price of the same concert at an international venue:

a

Find the five number summary for the price of concert tickets at local venues.

b

Find the five number summary for the price of concert tickets at international venues.

c

Draw parallel box plots for this data.

| Local | Stem | International |
|---:|:---:|:---|
| 7 6 3 0 | 6 | 1 8 |
| 8 6 4 3 2 | 7 | 3 5 5 9 |
| 9 6 5 1 1 | 8 | 1 5 7 9 |
| 8 7 5 2 0 | 9 | 1 3 4 6 8 |
| 1 | 10 | 1 2 4 7 8 |

Key: 2 \vert 6 \vert 0 = 62 \text{ and }60

34

The batting scores of two cricket teams, A and B, are recorded in the back-to-back stem plot below:

a

Find the five number summary for the batting scores of Team A.

b

Find the five number summary for the batting scores of Team B.

c

Draw parallel box plots for this data.

| Team A | Stem | Team B |
|---:|:---:|:---|
| 9 5 | 3 | 2 3 6 6 8 |
| 8 8 5 5 4 1 | 4 | 2 9 |
| 9 5 | 5 | 0 8 |
| | 6 | 2 |

Key: 2 \vert 3 \vert 0 = 32 \text{ and }30

35

The back-to-back stem plot shows the test scores of two classes, A and B:

Draw parallel box plots for this data.

| Class A | Stem | Class B |
|---:|:---:|:---|
| 5 5 0 | 6 | 1 5 9 |
| 9 8 4 3 2 | 7 | 1 4 7 |
| 5 5 | 8 | 0 4 7 9 |
| | 9 | 2 7 |

Key: 2 \vert 7 \vert 0 = 72 \text{ and }70

36

The back-to-back stem plot shows the number of pieces of paper used over several days by two classes, A and B:

a

Did Class A's students use 7 pieces of paper on any day?

b

Is Class B's median higher than Class A’s median?

c

Is the median greater than the mean in both groups?

d

Draw parallel box plots for this data.

e

Which class used more paper?

| Class A | Stem | Class B |
|---:|:---:|:---|
| 7 | 0 | 7 |
| 3 | 1 | 1 2 3 |
| 8 | 2 | 8 |
| 4 3 | 3 | 2 3 4 |
| 7 6 5 | 4 | 9 |
| 3 2 | 5 | 2 |

Key: 2 \vert 1 \vert 0 = 12 \text{ and }10

37

Ten participants had their pulse measured before and after exercise with results shown in the stem-and-leaf plot:

a

Complete the following table:

| | Mode(s) | Mean | Range |
|---|---|---|---|
| Before | | | |
| After | | | |

b

Draw parallel box plots for this data.

c

Hence, what can you conclude about exercise and pulse rate before and after exercise from the given data?

| Pulse Rate Before Exercise | Stem | Pulse Rate After Exercise |
|---:|:---:|:---|
| 5 5 0 | 5 | |
| 9 9 7 4 | 6 | |
| 4 3 | 7 | |
| 0 | 8 | 4 |
| | 9 | 5 7 8 |
| | 10 | 3 |
| | 11 | 3 5 5 |
| | 12 | 0 1 |

Key: 6|1|2 = 12 \text{ and } 16

38

The data below represents how long each student in two different classes could hold their breath for, measured to the nearest second:

  • Mrs Nguyen's class: \, 55,\, 59,\, 61,\, 66,\, 71,\, 75,\, 80,\, 89,\, 91,\, 95,\, 101,\, 103,\, 103,\, 109,\, 111

  • Miss Humphreys's class: \, 51,\, 66,\, 67,\, 68,\, 77,\, 78,\, 79,\, 81,\, 83,\, 85,\, 85,\, 86,\, 92,\, 101,\, 110

a

Display the data in a back-to-back stem plot.

b

Who is the teacher of the student who can hold their breath the longest?

c

If you want to determine which class, in general, has the stronger breath hold capacity, which measure would be most appropriate to use?

d

If you want to determine which class, in general, has the more consistent breath hold capacity, which measure would be most appropriate to use?

39

Two friends have been growing sunflowers. They have measured the height of their sunflowers to the nearest centimetre, with their results shown below:

  • Tricia: \, 39,\, 18,\, 14,\, 44,\, 37,\, 18,\, 23,\, 28

  • Quentin: \, 49,\, 25,\, 42,\, 5,\, 47,\, 12,\, 15,\, 8,\, 35,\, 22,\, 28,\, 6,\, 21

a

Display the data in a back-to-back stem plot.

b

Find the median height of Tricia's sunflowers.

c

Find the median height of Quentin's sunflowers.

d

Find the mean height of Tricia's sunflowers.

e

Find the mean height of Quentin's sunflowers. Round your answer to two decimal places.

f

Which friend generally grows taller plants?

40

The box plots drawn below show the number of repetitions of a 70\text{ kg} bar that Weightlifter A and Weightlifter B can lift. They both record their repetitions over 30 days:

a

Which weightlifter has the more consistent results? Explain your answer.

b

Which weightlifter can do the most repetitions of the 70\text{ kg} bar? Explain your answer.

41

The test scores of 11 students in Drama and German are listed below.

  • Drama: \,75,\, 85,\, 62,\, 65,\, 52,\, 76,\, 89,\, 83,\, 55,\, 91,\, 77

  • German: \,82,\, 86,\, 76,\, 84,\, 64,\, 73,\, 89,\, 62,\, 54,\, 69,\, 78

a

Construct parallel box plots to represent both data sets.

b
Which test had more consistent results? Explain your answer.
42

Two friends compete in a game which involves some luck in dice rolls and some skill in the strategy played. They play 10 games and their scores for each game are shown below:

  • Caitlin: \,37,\, 52,\, 47,\, 43,\, 55,\, 56,\, 10,\, 46,\, 59,\, 36
  • Ben: \,38,\, 62,\, 12,\, 45,\, 72,\, 21,\, 125,\, 26,\, 62,\, 50
a

Complete the following table:

| | Caitlin | Ben |
|---|---|---|
| Minimum | 10 | |
| Q_1 | 37 | |
| Median | 46.5 | 47.5 |
| Q_3 | 55 | |
| Maximum | 59 | |
| Mean | 44.1 | |
| Sample standard deviation (1 d.p.) | | 32.3 |
| Range | | |
| Interquartile range | | |

b

Whose scores are less consistent? Justify your answer.

c

If a person has wildly inconsistent scores, what might this suggest about the player's strategy?

d

If a person has very consistent scores, what might this suggest about the player's strategy?

e

What comparison can be made from the value of Ben's third quartile?
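The quartiles already filled in for Caitlin are consistent with one common school convention: sort the data, split it into a lower half and an upper half (leaving out the middle value when the count is odd), and take the median of each half. The Python sketch below implements that convention. It is an assumption rather than the only definition of quartiles, and the function name `five_number_summary` is chosen here for illustration. It also assumes at least two values.

```python
from statistics import median

caitlin = [37, 52, 47, 43, 55, 56, 10, 46, 59, 36]
ben = [38, 62, 12, 45, 72, 21, 125, 26, 62, 50]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of the two halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the middle
    upper = data[-half:]   # values above the middle (middle value excluded if n is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]


for name, scores in (("Caitlin", caitlin), ("Ben", ben)):
    low, q1, med, q3, high = five_number_summary(scores)
    print(f"{name}: min={low}, Q1={q1}, median={med}, Q3={q3}, max={high}, "
          f"range={high - low}, IQR={q3 - q1}")
```

Other software may use different quartile rules and give slightly different values of Q_1 and Q_3, so results should be compared against the convention taught in class.
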

43

Two groups of size twelve take a test to assess their reaction time. The participants clicked a button as soon as they heard a sound which was played at random intervals. The reaction time, in milliseconds, of each participant is shown below:

  • Group A: \,220,\, 210,\, 220,\, 215,\, 180,\, 185,\, 190,\, 190,\, 195,\, 190,\, 195,\, 195
  • Group B: \,210,\, 170,\, 200,\, 170,\, 190,\, 210,\, 180,\, 200,\, 180,\, 210,\, 190,\, 190
a

Complete the following table of statistics:

| | Group A | Group B |
|---|---|---|
| Minimum | 180 | |
| Q_1 | 190 | |
| Median | 195 | 190 |
| Q_3 | 212.5 | |
| Maximum | 220 | |
| Mean (2 d.p.) | 198.75 | |
| Sample standard deviation (1 d.p.) | | 14.7 |
| Range | | |
| Interquartile range | | |

b

Which group had more consistent reaction times?

c

Construct a parallel box plot, showing the reaction times of Group A and Group B.

d

What can we conclude from the value of Group B's first quartile?

e

Using the box plot and table of statistics in part (a), which group generally has the faster reaction times?

f

If Group A represents a number of 16-year-old males, and Group B represents a number of 16-year-old females, state a valid conclusion from this data.


Outcomes

3.3.2.3

compare parallel box plots and back-to-back stem plots for different datasets [complex]

3.3.2.4

compare the characteristics of the shape of histograms using symmetry, skewness and bimodality, where applicable [complex]
