1.01 Analysing and classifying data

Worksheet
Types of data
1

State whether the following are examples of univariate data or bivariate data:

a

A school collects data on the weight of each student in two classes in order to compare the classes.

b

A scientist collects data on iron levels in soil and growth of a type of weed in order to investigate the relationship between them.

c

A school collects data on the shoe size of each student in the school.

d

A hockey club collects data on time travelled to get to training from members of three teams in order to compare them.

e

A political group collects data on the taxable income and the addresses of the local electoral constituents in order to investigate if there is an association.

f

A psychologist collects data on the number of days in daycare and NAPLAN results in grade three in order to investigate the relationship between them.

g

John wishes to investigate whether there is an association between the amount of natural sunlight in a classroom and students' exam results.

h

Emma wishes to investigate whether players in the A hockey team have higher average test results than players in the B hockey team.

2

Classify the following examples of data as either:

  • Numerical discrete

  • Numerical continuous

  • Categorical nominal

  • Categorical ordinal

a

The number of people at an athletics carnival

b

The time spent playing games each day

c

Length of pencils in mm

d

Year of birth

e

Time taken to get to school in minutes

f

Favourite movies

g

Weight of dogs in kg

h

Number of siblings

i

Number of births

j

Driving licence status (learner, red P, etc.)

k

Hair colour (black, red, blonde, etc.)

l

Hourly rate of pay

m

Country of birth

Mean, mode, median and range
3

Find the mode of the following scores:

2, 2, 6, 7, 7, 7, 7, 11, 11, 11, 13, 13, 16, 16

4

A rating system of 1 to 3 was used in a survey to determine the usefulness of a new feature. The ten scores shown below are known to have a mode of 1.

3, 2, 3, 2, 1, 3, 1, 1, 2, x

Find the missing score, x.

5

Find the median of 7, 4, 6, 3.

6

A set of 69 scores is arranged in ascending order. In what position does the median score lie?

7

In a set of 152 scores, between which two scores does the median lie?

8

The adjacent stem and leaf plot shows the prices, in dollars, of concert tickets locally and internationally:

a

What was the most expensive ticket price at the international venue?

b

What was the median ticket price at the international venue? Round your answer to two decimal places.

c

What percentage of local ticket prices were cheaper than the international median?

d

At the international venue, what percentage of tickets cost between \$90 and \$110?

e

At the local venue, what percentage of tickets cost between \$90 and \$100?

International Prices | Stem | Local Prices
8 2 | 6 | 1 3 7 9
8 8 3 | 7 | 3 4 4 7 9
7 5 2 | 8 | 1 2 3 5 8
7 6 5 1 | 9 | 1 3 4 5 8
8 5 4 1 0 | 10 | 4

Key: 2 | 6 | 0 = 62 and 60

9

Find the mean of the following sets of scores:

a

22.4, 25.4, 19.1, 24.3, 7.4

b

-14, 0, -2, -18, -8, 0, -15, -1

10

The following five numbers have a mean of 11:

11, 13, 9, 13, 9

If a new number is added that is smaller than 9, will the mean be higher or lower?

11

Find the range of the following sets of scores:

a

10, 7, 2, 14, 13, 15, 11, 4

b

15, -2, -8, 8, 15, 6, -16, 15

12

Durations of calls (in minutes) made in a household were recorded as follows:

5, 11, 3, 9, 5, 14, 5, 14, 14, 3, 7, 7, 7, 5, 3, 3, 7, 14, 5

a

What was the total number of calls made?

b

What was the longest duration of a call?

c

What was the shortest duration of a call?

d

What was the mean duration of a call? Round your answer to two decimal places.

e

What was the modal duration?

f

What was the median duration?

13

A real estate agent wanted to determine a typical house price in a certain area. He gathered the selling prices of some houses (in dollars):

317\,000, 320\,000, 347\,000, 360\,000, 378\,000, 395\,000, 438\,000, 461\,000, 479\,000, 499\,000

a

Calculate the mean house price.

b

What percentage of the house prices exceed the mean?

c

Determine the median house price.

d

What percentage of house prices exceed the median?

14

Susanah has been growing watermelons. The weights of the watermelons (in kilograms) are: 15, 6, 5, 2, 4, 4, 5

a

Calculate the median weight of the watermelons.

b

Calculate the mean weight. Round your answer to two decimal places.

c

Which measure of centre is a more accurate description of the centre of this data set?

15

The median house price in Humbleton is \$950\,000 with a mean price of \$1\,000\,000, and the median house price in Brockway is \$950\,000 with a mean price of \$880\,000.

Which town is more likely to have some very expensive houses? Explain your answer.

16

The selling prices of recently sold houses are given:

\$467\,000, \$413\,000, \$410\,000, \$456\,000, \$487\,000, \$929\,000

a

What is the mean selling price, rounded to the nearest thousand dollars?

b

Which of the prices raised the mean so that it is not reflective of most of the prices?

c

Recalculate the mean selling price excluding this outlier.

17

A group of students had a range in marks of 14 and the lowest score was 9. What was the highest score in the group?

18

Marge grows two different types of bean plants. She records the number of beans that she picks from each plant for 10 days. Her records are as follows:

  • Plant A: 4, 4, 5, 7, 10, 3, 3, 9, 10

  • Plant B: 8, 7, 5, 5, 9, 7, 8, 7, 5, 6

a

What is the mean number of beans picked per day for Plant A? Round your answer to one decimal place.

b

What is the mean number of beans picked per day for Plant B? Round your answer to one decimal place.

c

What is the range for Plant A?

d

What is the range for Plant B?

e

Which plant produces more beans on average?

f

Which plant has a more consistent yield of beans?

19

The beaks of two groups of birds are measured, in millimetres, to determine whether they might be of the same species:

Group 1 | 40 | 41 | 32 | 49 | 34 | 34 | 43 | 47 | 37 | 38
Group 2 | 55 | 54 | 44 | 44 | 54 | 54 | 43 | 47 | 41 | 39

a

Calculate the range for Group 1.

b

Calculate the range for Group 2.

c

Calculate the mean for Group 1. Round your answer to one decimal place.

d

Calculate the mean for Group 2. Round your answer to one decimal place.

e

How can we tell that the two groups of birds are most likely different species?

20

Two English classes, each with 15 students, sit a ten-question multiple-choice test. Their class results, out of 10, are below:

Class 1 | 3 | 3 | 3 | 1 | 5 | 1 | 2 | 4 | 2 | 4 | 2 | 3 | 3 | 2 | 1
Class 2 | 8 | 7 | 10 | 8 | 8 | 6 | 9 | 8 | 7 | 9 | 9 | 8 | 10 | 8 | 9

a

Calculate the mean, median, mode and range for Class 1. Round your answers to one decimal place if necessary.

b

Calculate the mean, median, mode and range for Class 2. Round your answers to one decimal place if necessary.

c

Which class was more likely to have studied effectively for their test?

d

Which statistical calculations support your answer?

21

Ten participants had their pulse measured before and after exercise, with the results shown in the adjacent stem and leaf plot:

a

Calculate the modal pulse rate after exercise.

b

How many modes are there for the pulse rate before exercise?

c

Calculate the range of pulse rates before exercise.

d

Calculate the range of pulse rates after exercise.

e

Calculate the mean pulse rate before exercise.

f

Calculate the mean pulse rate after exercise.

g

What can you conclude about the range and mean of pulse rates before and after exercise?

Pulse rate before exercise | Stem | Pulse rate after exercise
0 5 5 | 5 |
4 7 9 9 | 6 |
3 4 | 7 |
0 | 8 | 4
 | 9 | 5 7 8
 | 10 | 3
 | 11 | 3 5 5
 | 12 | 0 1

Key: 2 | 6 | 0 = 62 and 60

22

Consider the frequency distribution table below:

a

Complete the table.

b

Calculate the mean, correct to two decimal places.

c

Calculate the mode.

d

Calculate the range.

e

How many scores are less than the mode?

Score (x) | Frequency (f) | fx
4 | 11 |
5 | | 35
 | 16 |
14 | |
Total | 43 | 365

Five point summary and box plots
23

Consider the following set of scores:

13, 15, 5, 16, 7, 20, 12

a

Calculate the median.

b

Calculate the range.

c

Calculate the first quartile.

d

Calculate the third quartile.

e

Calculate the interquartile range.

24

Consider the following set of scores:

-3, -3, 1, 9, 9, 6, -9

a

Calculate the median.

b

Calculate the first quartile.

c

Calculate the third quartile.

d

Calculate the interquartile range.

25

There is a test to measure the Emotional Quotient (EQ) of an individual. Below are the EQ results for 21 people, listed in ascending order:

92, 94, 100, 103, 103, 105, 105, 109, 110, 113, 114, 114, 116, 118, 118, 119, 120, 125, 125, 126, 130

a

Find the median.

b

Find Q_1.

c

Find Q_3.

26

Consider the following set of scores:

10, 11, 12, 13, 15, 17, 19, 20

Within what range do the middle 50\% of scores lie?

27

In competition, a diver must complete 8 rounds of dives. Her scores for the first 7 rounds are given below:

7.3, 7.4, 7.7, 8.4, 8.7, 8.9, 9.4

Determine her score in the 8th round if the upper quartile of all 8 scores is 8.85.

28

Consider the dot plot below:

a

Determine the first quartile.

b

Determine the third quartile.

c

Calculate the interquartile range.

d

Calculate the range.

29

Consider the following set of scores displayed in the bar chart:

a

Create a cumulative frequency table for this data, with column titles: x, f, fx, and cf.

b

Calculate the median score.

c

Calculate the first quartile.

d

Calculate the third quartile.

e

Calculate the interquartile range.

30

A set of data has a five-number summary as shown in the table:

a

Calculate the interquartile range.

A fence is a value 1.5 \times IQR above the UQ or below the LQ.

b

Calculate the value of the lower fence.

c

Calculate the value of the upper fence.

Minimum | 5
Lower quartile | 6
Median | 12
Upper quartile | 17
Maximum | 28

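The fence definition above can be sketched in Python. This is only an illustrative sketch: the function name `fences` and the quartile values used are assumptions chosen for the example, not values from the table in this question.

```python
def fences(q1, q3):
    """Return (lower_fence, upper_fence) using the 1.5 x IQR rule."""
    iqr = q3 - q1                  # interquartile range
    lower_fence = q1 - 1.5 * iqr   # 1.5 x IQR below the lower quartile
    upper_fence = q3 + 1.5 * iqr   # 1.5 x IQR above the upper quartile
    return lower_fence, upper_fence


# Hypothetical quartiles (assumed for illustration only).
lower, upper = fences(10, 20)
print(lower, upper)  # -5.0 35.0
```

A score below the lower fence or above the upper fence would then be treated as an outlier.
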
31

A group of Year 12 students were asked how many hours they spend on Hashtagram per day. The results are given below:

1.9, 1.1, 2.4, 2.3, 2.1, 1.2, 1.3, 1.6, 1.5, 1.8

a

Determine the five-number summary for this data set.

b

Another girl, Naylaa, spends 3.6 hours using Hashtagram. If her score was added to this group, would it be considered an outlier?

32

For the box plot shown, find each of the following:

a

The lowest score.

b

The highest score.

c

The range.

d

The median.

e

The interquartile range.

[Box plot; axis from 0 to 20 in steps of 2]

33

For the box plot shown below, find the interquartile range.

[Box plot; axis from 0 to 90 in steps of 10]

34

Create a box plot to represent the data in the given table:

Minimum | 5
Q1 | 20
Median | 40
Q3 | 55
Maximum | 70

35

Two groups of people, athletes and non-athletes, had their resting heart rate measured. The results are displayed in the following pair of box plots.

a

Calculate the median heart rate of athletes.

b

Calculate the median heart rate of the non-athletes.

c

According to the median, which group has lower heart rates?

d

Calculate the interquartile range of the athletes' heart rates.

e

Calculate the interquartile range of the non-athletes' heart rates.

f

According to the interquartile range, which group has more consistent heart rate measures?

[Pair of box plots: Athletes and Non-athletes, each on an axis from 40 to 90 in steps of 10]

36

Consider the box plot shown:

a

Determine the percentage of scores that lie between the following:

i

7 and 15 inclusive

ii

1 and 7 inclusive

iii

9 and 19 inclusive

iv

7 and 19 inclusive

v

1 and 15 inclusive

b

In which quartile is the data the least spread out?

[Box plot; axis labelled Scores, from 0 to 20 in steps of 5]

37

The glass windows for an airplane are cut to a certain thickness, but machine production means there is some variation. The thickness of each pane of glass produced is measured (in millimetres) and the results are shown in the following dot plot:

a

Determine the median thickness. Round your answer to two decimal places.

b

Determine the interquartile range.

c

Construct a box plot to represent the data.

d

What percentage of thicknesses were between 10.8 mm and 11.2 mm inclusive? Round your answer to two decimal places.

e

According to the box plot, in which quartile are the results the most spread out?

f

State whether the following can be determined from a box plot:

i

The mode thickness

ii

The frequency of each thickness

iii

The median thickness

iv

The spread of thicknesses

Standard deviation
38

Use the statistics mode on the calculator to determine the standard deviation of the following sets of scores. Round your answers to two decimal places.

a

-17, 2, -6, 9, -17, -9, 3, 8, 5

b

8, 20, 16, 9, 9, 15, 5, 17, 19, 6

39

The mean income of people in Finland is \$45\,000. This is the same as the mean income of people in Canada. The standard deviation of incomes in Finland is greater than the standard deviation of incomes in Canada. In which country is there likely to be the greatest difference between the incomes of the rich and poor?

40

The table shows the number of goals scored by a football team in each game of the year:

a

In how many games were 0 goals scored?

b

Determine the median number of goals scored. Round your answer to one decimal place.

c

Calculate the mean number of goals scored each game. Round your answer to two decimal places.

d

Use your calculator to find the standard deviation. Round your answer to two decimal places.

Score (x) | Frequency (f)
0 | 3
1 | 1
2 | 5
3 | 1
4 | 5
5 | 5

41

Consider the histogram below:

a

Find the range of the data set.

b

Find the mean of the data set. Round your answer to two decimal places.

c

Find the population standard deviation. Round your answer to two decimal places.

42

Calculate the standard deviation for the following data represented by the frequency histogram. Round your answer to two decimal places.

43

The scores of five diving attempts by a professional diver are recorded below:

5.6, 6.6, 6.3, 5.9, 6.4

a

Calculate the standard deviation of the scores. Round your answer to two decimal places.

b

On the sixth dive, the diver scores 8.8. What effect will this score have on the mean and standard deviation?

44

Meteorologists predicted a huge variation in temperatures throughout the month of April. The temperatures each day for the first two weeks of April were recorded as follows:

16, 18, 20.5, 21, 21, 21, 21.5, 22, 22, 24, 24, 25, 26, 27

a

State the range of the temperatures.

b

Calculate the interquartile range of the temperatures.

c

Use your calculator to determine the standard deviation. Round your answer to one decimal place.

d

Would the standard deviation or the interquartile range be the best measure of spread to support the prediction of a huge variation in temperatures?

Grouped data
45

Consider the following table:

a

Use the midpoint of each class interval to estimate the mean. Round your answer to one decimal place.

b

Which is the modal group of scores?

Score | Frequency
1 - 5 | 20
6 - 10 | 15
11 - 15 | 8
16 - 20 | 4
21 - 25 | 3
26 - 30 | 2

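The midpoint method referred to in this question can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `grouped_mean` and the two class intervals in the example are hypothetical and are not taken from the table above.

```python
def grouped_mean(classes):
    """Estimate the mean of grouped data.

    `classes` is a list of (lower, upper, frequency) triples; each class
    is represented by its midpoint (lower + upper) / 2.
    """
    total_f = sum(f for _, _, f in classes)
    total_fx = sum((lower + upper) / 2 * f for lower, upper, f in classes)
    return total_fx / total_f


# Hypothetical grouped data (assumed for illustration only):
# class 1 - 5 with frequency 2, class 6 - 10 with frequency 3.
example = [(1, 5, 2), (6, 10, 3)]
print(grouped_mean(example))  # 6.0
```

The result is only an estimate, because every score in a class is treated as if it sat at the class midpoint.
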
46

Consider the following table:

a

Use the midpoint of each class interval to estimate the mean. Round your answer to one decimal place.

b

Which is the modal group of scores?

Score (x) | Frequency
0 \leq x < 20 | 4
20 \leq x < 40 | 15
40 \leq x < 60 | 23
60 \leq x < 80 | 73
80 \leq x < 100 | 45

47

Consider the following table:

a

Complete the table.

b

Calculate an estimate for the mean. Round your answer to two decimal places.

c

Calculate an estimate for the standard deviation. Round your answer to two decimal places.

d

If we used the original ungrouped data to calculate standard deviation, would the ungrouped data have a higher or lower standard deviation?

Class | Class centre | f | fx
1 - 9 | | 8 |
10 - 18 | | 6 |
19 - 27 | | 4 |
28 - 36 | | 6 |
37 - 45 | | 8 |
Total | | |

Shape of data
48

State whether the data in each graph is positively skewed, negatively skewed or symmetrical:

a
b
c
d
e
Stem | Leaf
1 | 6 7 7
2 | 2 2 2 2 3 3 3
3 | 3 3 3 6 6 6 7 7 7 7 7
4 | 4 4 4 4 4 4
5 | 7 7

Key: 1 | 6 = 16

f
49

The following stem and leaf plot displays the ages of people who entered through the gates of a concert in the first 5 seconds:

a

Calculate the median age.

b

What is the difference between the lowest age and the median?

c

What is the difference between the highest age and the median?

d

Calculate the mean age. Round your answer to two decimal places.

e

Is the data positively or negatively skewed?

Stem | Leaf (Age)
1 | 0 0 1 1 2 2 4 7 9
2 | 2 2 5 6 7
3 | 1 4 8
4 | 3
5 | 4

Key: 1 | 6 = 16

50

VO_{2} Max is a measure of how efficiently your body uses oxygen during exercise. The more physically fit you are, the higher your VO_{2} Max. A group of people had their VO_{2} Max measured. The results are given below:

21, 21, 23, 25, 26, 27, 28, 29, 29, 29, 30, 30, 32, 38, 38, 42, 43, 44, 48, 50, 76

a

Determine the median VO_{2} Max.

b

Determine the upper quartile value.

c

Determine the lower quartile value.

d

Consider the box plot for this data set. Are the results positively or negatively skewed?

[Box plot; axis from 20 to 80 in steps of 5]

e

Calculate 1.5 \times IQR, where IQR is the interquartile range. Round your answer to two decimal places.

f

An outlier is a score that is more than 1.5 \times IQR above the upper quartile or more than 1.5 \times IQR below the lower quartile. State the outlier.

Outcomes

3.1.1

review the statistical investigation process: identify a problem; pose a statistical question; collect or obtain data; analyse data; interpret and communicate results
