7.06 Describe distributions

Worksheet
Symmetry and skew
1

The table shows the number of crime novels in a bookshop in different price ranges, with prices rounded to the nearest \$5:

a

Plot this data in a histogram.

b

Describe the shape of the distribution of the data.

Price of crime novel | Frequency
\$5 | 5
\$10 | 10
\$15 | 17
\$20 | 8
\$25 | 17
\$30 | 10
\$35 | 5

2

Describe the shape of the data in the following graphs:

a
b
c
d
Stem | Leaf
1 | 6 7 7
2 | 2 2 2 2 3 3 3
3 | 3 3 3 6 6 6 7 7 7 7 7
4 | 4 4 4 4 4 4
5 | 7 7

Key: 2 \vert 3 = 23

e
f
g
h
i
3

If a set of data is strongly positively skewed and the median is 70, what can we conclude about the mean?

Modality, clustering, outliers
4

Consider the stem and leaf plot below:

a

Are there any outliers? If so, state the value.

b

Is there any clustering of data? If so, in what interval?

c

What is the mode?

d

Describe the shape of the data.

Stem | Leaf
0 | 5
1 | 7 8
2 | 0 8
3 | 1 3 3 7 8 9
4 | 1 3 5 8 8 8
5 |
6 |
7 |
8 |
9 | 2

Key: 2 \vert 3 = 23

5

The number of hours worked per week by a group of people is represented in the following stem and leaf plot:

a

Are there any outliers? If so, state the value.

b

Is there any clustering of data? If so, in what interval?

c

State the mode(s).

Stem | Leaf
0 | 2
1 |
2 | 0 3 6 6
3 | 1 4 5 6 6 7
4 | 0 4 6 7 9
5 | 0

Key: 2 \vert 3 = 23

6

The shoe sizes of all the students in a class were measured and the data was presented in a bar graph.

a

Are there any outliers? If so, state the value.

b

Is there any clustering of data? If so, in what interval?

c

What is the modal shoe size?

d

Describe the shape of the distribution.

7

Consider the dot plot below:

a

Are there any outliers?

b

Is there any clustering of data?

c

State the modal score(s).

d

Describe the shape of the distribution of the data.

8

Consider the data shown in the histogram:

a

Are there any outliers? If so, what is the value?

b

Is there any clustering of data? If so, in what interval?

c

What is the mode?

d

Describe the shape of the distribution of the data.

9

Temperatures were recorded over a period of time and presented as a dot plot:

a

Are there any outliers?

b

Is there any clustering of data? If so, in what interval?

c

What is the modal temperature?

d

Describe the shape of the distribution of the data.

10

Consider the histogram given:

a

Describe the shape of the distribution.

b

Determine the lower quartile score and the upper quartile score.

c

Hence, calculate the interquartile range.

d

Using the interquartile range, determine whether there are any outliers in the data set.

11

Consider the given dot plot:

a

Describe the shape of the distribution.

b

Determine the lower quartile score and the upper quartile score.

c

Hence, calculate the interquartile range.

d

Using the interquartile range, determine whether there are any outliers in the data set. If there are, find the value of the outlier(s).

12

The stem and leaf plot below shows the ages of the people who entered through the gates of a concert in the first 5 seconds:

a

What was the median age?

b

What was the difference between the lowest age and the median?

c

What was the difference between the highest age and the median?

d

What was the mean age? Round your answer to two decimal places.

e

Is the data positively or negatively skewed?

Stem | Leaf
1 | 0 1 2 2 3 3 4 4 4 8 8 8
2 | 1 7
3 | 4 5 5
4 | 0
5 | 4

Key: 1 \vert 2 = 12 years old

13

Consider the histogram representing students' heights in centimetres:

a

Does the histogram most likely represent grouped data or individual scores?

b

Estimate the value of the mean to one decimal place.
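
When only grouped data is available, one common way to estimate the mean is to treat every score in a class as if it sat at the class centre. The sketch below illustrates that idea in Python. The class boundaries and frequencies are invented for illustration and are not read from the histogram in this question; the function name `estimate_grouped_mean` is also hypothetical.

```python
# Hypothetical grouped data: (lower bound, upper bound, frequency).
# These classes and counts are invented purely for illustration.
classes = [
    (150, 155, 3),
    (155, 160, 7),
    (160, 165, 12),
    (165, 170, 6),
    (170, 175, 2),
]


def estimate_grouped_mean(groups):
    """Estimate the mean by treating each score as its class centre."""
    total_frequency = sum(freq for _, _, freq in groups)
    # Class centre = midpoint of the class boundaries.
    weighted_sum = sum(((low + high) / 2) * freq for low, high, freq in groups)
    return weighted_sum / total_frequency


estimated_mean = estimate_grouped_mean(classes)
print(round(estimated_mean, 1))
```

The result is only an estimate, because the individual scores inside each class are unknown.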

14

Estimate the value of the mean of the following data set:

15

Consider the following set of scores:

67,\, 55,\, 74,\, 59,\, 58,\, 60,\, 62,\, 74,\, 59,\, 62,\, 66,\, 68

a

Complete the grouped frequency table.

b

What is the modal class?

Class interval | Frequency
51-55 |
56-60 |
61-65 |
66-70 |
71-75 |
Total |

16

Consider the following set of scores:

42,\, 42,\, 49,\, 49,\, 49,\, 55,\, 55,\, 55,\, 55,\, 55,\, 58,\, 58,\, 64

a

Complete the grouped frequency table.

b

What is the modal class?

Class interval | Frequency
41-45 |
46-50 |
51-55 |
56-60 |
61-65 |
Total |

17

Consider the following set of scores:

12,\, 18,\, 18,\, 23,\, 23,\, 23,\, 28,\, 28,\, 34,\, 34,\, 34,\, 34

a

Complete the grouped frequency table.

b

What is the modal class?

Class interval | Frequency
11-15 |
16-20 |
21-25 |
26-30 |
31-35 |
Total |

18

The masses (rounded to the nearest kg) of a group of students are listed below:

65,\, 62,\, 70,\, 60,\, 64,\, 70,\, 64,\, 72,\, 72,\, 62,\, 72,\, 65,\, 69,\, 71,\, 66,\, 69,\, 66,\, 66,\, 72,\, 60,\, 65,\, 69,\, 63,\, 64,\, 66,\, 70

a

Complete the following frequency table:

Class interval (kg) | Class centre | Frequency
60-64 | |
65-69 | |
70-74 | |
Total: | |

b

Which is the modal class?
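
As a rough illustration of how a grouped frequency table can be built and its modal class read off, the Python sketch below tallies scores into equal-width classes. The scores, the class width and the starting value are all invented for this example and are not taken from the worksheet; `class_of` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical scores, invented for illustration only.
scores = [3, 7, 8, 12, 12, 14, 15, 18, 21, 22]
class_width = 5   # assumed classes: 1-5, 6-10, 11-15, ...
first_lower = 1   # assumed lower limit of the first class


def class_of(score):
    """Return the (lower, upper) class interval containing an integer score."""
    index = (score - first_lower) // class_width
    lower = first_lower + index * class_width
    return (lower, lower + class_width - 1)


table = Counter(class_of(s) for s in scores)

# The modal class is the class with the highest frequency (there may be ties).
highest = max(table.values())
modal_classes = sorted(c for c, freq in table.items() if freq == highest)

for (lower, upper), freq in sorted(table.items()):
    centre = (lower + upper) / 2
    print(f"{lower}-{upper} | centre {centre} | frequency {freq}")
print("Total |", sum(table.values()))
print("Modal class(es):", modal_classes)
```

Returning a list of modal classes, rather than a single class, keeps the sketch usable for bi-modal and multi-modal data.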

19

The mercury levels in 38 fishing lakes were tested and recorded in the histogram below:

a

Is the distribution uni-modal, bi-modal, or multi-modal?

b

State the modal class.

c

Complete the frequency distribution table:

Score | Class centre | f
90 \leq x < 100 | |
100 \leq x < 110 | |
110 \leq x < 120 | |
120 \leq x < 130 | |
130 \leq x < 140 | |

20

The percentage of faulty computer chips in each of 42 batches was recorded in the histogram given:

a

Is the distribution uni-modal, bi-modal, or multi-modal?

b

State the modal classes.

21

The temperature in a classroom at 1pm every day was measured and recorded in the histogram below:

a

Is the distribution uni-modal, bi-modal, or multi-modal?

b

State the modal classes.

c

Complete the frequency distribution table:

Score | Class centre | f
0 \leq x < 1 | |
1 \leq x < 2 | |
2 \leq x < 3 | |
3 \leq x < 4 | |
4 \leq x < 5 | |
5 \leq x < 6 | |

22

The number of peanuts in each of a sample of mixed nut packets was recorded in the following stem plot:

a

Complete the frequency distribution table given below:

Score | Class centre | f
40-49 | |
50-59 | |
60-69 | |
70-79 | |
80-89 | |
90-99 | |
100-109 | |
110-119 | |

Stem | Leaf
4 | 3 6 8
5 | 1 2 2
6 | 6 7 7 8 8
7 | 0 0 3 3 4 5 9
8 | 1 1 1 4 6 8 8 9
9 | 0 2 4 5 6 9
10 | 1 2 3 5 5 6 7 8
11 | 0 4 5 7

Key: 2 \vert 3 = 23

b

Is the distribution uni-modal, bi-modal, or multi-modal?

c

State the modal classes.

23

The reaction time of drivers was tested and recorded in the dot plot below:

a

Complete a frequency distribution table for the individual data values.

b

Is the distribution uni-modal, bi-modal, or multi-modal?

c

How many modes are there?

24

Identify any outliers in each of the following data sets:

a
73,\, 77,\, 81,\, 86,\, 131
b
7,\, 25,\, 28,\, 35,\, 42
c
69,\, 79,\, 86,\, 72,\, 86,\, 77,\, 73,\, 82,\, 81,\, 76,\, 83,\, 47,\, 87,\, 70,\, 80,\, 85
d
58,\, 63,\, 58,\, 59,\, 64,\, 68,\, 68,\, 30,\, 73,\, 25,\, 72,\, 61,\, 65,\, 69,\, 75,\, 72
e
Stem | Leaf
1 | 2
2 | 7 7 9 9
3 | 1 3 3 3 3 5 8
4 | 4 4 5

Key: 5 \vert 2 = 52 hours

f
25

The table shows the average temperature (\degree \text{C}) in a particular city over several years. Identify the year(s) in which the temperature is an outlier.

Year | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011
Temperature (°C) | 31.7 | 26.5 | 22.6 | 22.5 | 24.2 | 23.0 | 24.1 | 21.1 | 23.3 | 26.0

26

Use your CAS calculator to identify the outlier in each of the following data sets:

a

35, 48, 46, 29, 40, 31, 42, 27, 22, 20, 42, 28, 86

b

31, 49, 33, 39, 45, 50, 34, 20, 30, 31, 26, 31, 4

27

For each of the following data sets, calculate:

i

The interquartile range

ii

The value of the lower fence

iii

The value of the upper fence

a
Minimum | 5
Q1 | 6
Median | 12
Q3 | 17
Maximum | 28
b
[Graph not shown; its axis is marked from 2 to 18 in steps of 2]

28

For each of the following sets of data:

i

Construct the five-number summary.

ii

Calculate the interquartile range.

iii

Calculate the value of the lower fence.

iv

Calculate the value of the upper fence.

v

Would the value -5 be considered an outlier?

vi

Would the value 16 be considered an outlier?

a

9,\, 5,\, 3,\, 2,\, 6,\, 1

b

3,\, 10,\, 9,\, 2,\, 7,\, 5,\, 6

c

12,\, 5,\, 11,\, 1,\, 9,\, 8,\, 5,\, 6

29

For each of the following sets of data:

i

Construct the five-number summary.

ii

Would the value -3 be considered an outlier?

iii

Would the value 15 be considered an outlier?

a

1,\, 4,\, 8,\, 10,\, 6,\, 2,\, 5

b

9,\, 4,\, 6,\, 11,\, 10,\, 8,\, 10

30

For each of the data sets below:

i

Construct the five-number summary.

ii

Calculate the value of the lower fence.

iii

Calculate the value of the upper fence.

iv

Identify any outliers.

v

Create a box plot of the data with the outlier(s) displayed separately.

a
6.8,\, 4.0,\, 3.5,\, 5.1,\, 2.4,\, 1.6,\, 3.9,\, 3.5,\, 3.1,\, 3.6,\, 7.6,\, 3.7,\, 4.0,\, 5.1,\, 3.6,\, 3.8,\, 3.6,\, 6.7
b
10,\, 15,\, 12,\, 26,\, 18,\, 15,\, 11,\, 38,\, 25,\, 12,\, 19,\, 17,\, 16,\, 17,\, 11,\, 36,\, 9,\, 2,\, 21,\, 18,\, 16
c
82,\, 87,\, 92,\, 76,\, 80,\, 85,\, 71,\, 84,\, 61,\, 79,\, 81,\, 81,\, 86,\, 97,\, 101,\, 80,\, 71,\, 76,\, 78,\, 86,\, 84
31

Consider the data sets below:

  • Set A: \, 14,\, 18,\, 21,\, 19,\, 12,\, 16,\, 22,\, 20,\, 19,\, 13,\, 21,\, 20,\, 16,\, 7,\, 18,\, 20,\, 11,\, 19,\, 17,\, 24

  • Set B: \, 17,\, 9,\, 15,\, 24,\, 14,\, 13,\, 16,\, 10,\, 21,\, 14,\, 15,\, 17,\, 16,\, 13,\, 9,\, 19,\, 14,\, 18,\, 15,\, 12

a

Construct the five-number summary for each set.

b

Identify any outliers and use statistical calculations to justify your answer.

c

Create a parallel box plot of the data sets with the outlier(s) displayed separately.

32

The data point 5 is below the lower fence and is considered an outlier. The interquartile range is 12.

Find the smallest integer value the lower quartile can be.

33

The data point 37 is above the upper fence and is considered an outlier. The interquartile range is 10.

Find the largest integer value the upper quartile can be.

34

A group in a study take a test to assess their reaction time. The participants clicked a button as soon as they heard a sound which was played at random intervals. The reaction time, in milliseconds, of each participant is shown below:

220,\, 280,\, 210,\, 220,\, 215,\, 180,\, 185,\, 190,\, 190,\, 195,\, 150,\, 190,\, 195,\, 195

a

Construct the five-number summary.

b

Identify any outliers and use statistical calculations to justify your answer.

c

Create a box plot of the data with the outlier displayed separately.

d

Give a possible explanation for the outliers present.

35

\text{VO}_{2} Max is a measure of how efficiently your body uses oxygen during exercise. The more physically fit you are, the higher your \text{VO}_{2} Max.

Here are some people’s results when their \text{VO}_{2} Max was measured:

46,\, 27,\, 32,\, 46,\, 30,\, 25,\, 41,\, 24,\, 26,\, 29,\, 21,\, 21,\, 26,\, 47,\, 21,\, 30,\, 41,\, 26,\, 28,\, 26,\, 76

a

Sort the values into ascending order.

b

Determine the median \text{VO}_{2} Max.

c

Determine the upper quartile value.

d

Determine the lower quartile value.

e

Calculate 1.5 \times IQR, where IQR is the interquartile range.

f

Identify any outliers using upper and lower fences.

g

Create a box plot of the data with the outlier displayed separately.

h

An average untrained healthy person has a \text{VO}_{2} Max between 30 and 40.

Using the box plot, what level of exercise is likely to describe the majority of people in this group?

Effects of outliers
36

The number of three-pointers scored in a basketball game is shown in the dot plot below. The mode is 2. If the outlier is removed, what is the new mode?

37

Consider the given dot plot. The current median is 3. If the outlier is removed, what is the new range?

38

Consider the given stem plot:

If the outlier is removed, what is the new mean? Round your answer to two decimal places.

Stem | Leaf
3 | 4 4 9
4 | 6 6 8 9
5 | 1 4
6 |
7 |
8 | 4

Key: 2 \vert 3 = 23

39

Consider the given frequency table. If the outlier is removed, what is the new mode?

Weight in kilograms | Frequency
14 | 1
15 | 0
16 | 0
17 | 3
18 | 6
19 | 4
20 | 2

40

The glass windows for an airplane are rolled to a certain thickness, but machine production means there is some variation. The thickness of each pane of glass produced is measured (in millimetres), and the dot plot shows the results.

a

The current median is 11.15. If the outlier is removed, what is the new median?

b

The current mean is 11.1. If the outlier is removed, what is the new mean? Round your answer to two decimal places.

41

For each of the following sets of data:

i

Find the mean, median, mode, and range. Round your answers to two decimal places where necessary.

ii

Identify the outlier.

iii

Remove the outlier from the set and recalculate the values found in part (i).

iv

Describe how each of the four statistics changed after removing the outlier.

a
53, \, 46,\, 25,\, 50,\, 30,\, 30,\, 40,\, 30,\, 47,\, 109
b
4.7,\, 2.8,\, 1.9,\, 0.9,\, 0.9,\, 2.2,\, 2.2,\, 1.2,\, 1.5,\, 0.9
c
4700,\, 4700,\, 4700,\, 4500,\, 5300,\, 4900,\, 5200,\, 4800,\, 1500,\, 5100
42

If an outlier is removed from a data set, describe the effect this has on the:

a

Mode

b

Range

c

Mean

d

Median
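
One way to explore these effects is to compute the four statistics with and without a suspected outlier and compare them. The Python sketch below does this for an invented data set; the values, the choice of 60 as the outlier, and the helper name `summarise` are assumptions made for illustration, not part of the worksheet.

```python
from statistics import mean, median, multimode


def summarise(data):
    """Return the mean (2 d.p.), median, mode(s) and range of a data set."""
    return {
        "mean": round(mean(data), 2),
        "median": median(data),
        "modes": multimode(data),
        "range": max(data) - min(data),
    }


# Hypothetical data; 60 is assumed to be the outlier.
data = [12, 14, 14, 15, 17, 18, 60]
trimmed = [x for x in data if x != 60]

before = summarise(data)
after = summarise(trimmed)

for name in before:
    print(f"{name}: {before[name]} -> {after[name]}")
```

Comparing the two summaries side by side shows which statistics are sensitive to an extreme value in this particular example; other data sets may behave differently, especially for the mode.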

43

The selling prices of recently sold houses are:

\$467\,000, \$413\,000, \$410\,000, \$456\,000, \$487\,000, \$929\,000

a

Find the mean selling price, to the nearest thousand dollars.

b

Which of the selling prices raises the mean so that it is not reflective of most of the prices?

c

Recalculate the mean selling price excluding this outlier.

44

Consider the given stem and leaf plot. If the outlier is removed, find the new range.

Stem | Leaf
2 | 5
3 |
4 | 9 9
5 | 0 0 4 5 7
6 | 2 6

Key: 1 \vert 2 = 12

45

Consider the following frequency table:

If the outlier is removed, what is the new mean? Round your answer to two decimal places if needed.

Weight in kilograms | Frequency
12 | 2
13 | 5
14 | 1
15 | 2
16 | 0
17 | 0
18 | 1

46

The selling price of recently sold houses is given below:

\$760\,000,\, \$650\,000,\, \$810\,000,\, \$780\,000,\, \$760\,000,\, \$590\,000,\, \$1\,360\,000

a

Find the mean selling price. Round your answer to the nearest thousand dollars.

b

Find the median selling price.

c

Recalculate the mean selling price excluding the outlier.

d

Recalculate the median selling price excluding the outlier.

e

Which measure of centre best identifies the typical selling price of recently sold houses? Explain your answer.

47

The selling prices of artworks sold at an auction are given below:

\$18\,000,\, \$11\,000,\, \$17\,000,\, \$20\,000,\, \$18\,000,\, \$16\,000,\, \$15\,000,\, \$218\,000

a

Find the mean selling price to the nearest hundred dollars.

b

Find the median selling price.

c

Recalculate the mean selling price excluding the outlier. Round your answer to the nearest hundred dollars.

d

Recalculate the median selling price excluding the outlier.

e

Which measure of centre best identifies the typical selling price of recently sold artwork? Explain your answer.

48

The weights, in kilograms, of fish caught in a "weigh and release" fishing competition are given below:

12.5,\, 15.1,\, 13,\, 14.2,\, 14.5,\, 14.9,\, 12.5,\, 14.3,\, 1.5

a

Find the mean weight.

b

Find the median weight.

c

Recalculate the mean weight excluding the outlier.

d

Recalculate the median weight excluding the outlier.

e

Which measure of centre best identifies the typical fish weight? Explain your answer.


Outcomes

2.1.4

with the aid of an appropriate graphical display (chosen from dot plot, stem plot, bar chart or histogram), describe the distribution of a numerical data set in terms of modality (uni or multimodal), shape (symmetric versus positively or negatively skewed), location and spread and outliers, and interpret this information in the context of the data
