
6.05 Comparing data distributions

Lesson

Concept summary

When we look at a representation of data in the form of a density curve, there are certain characteristics that can help us understand the corresponding data set.

Center

The center of a data set describes the entire data set with a single number, such as the mean or median. On a density curve, we can estimate it by identifying the middle point of the data.
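As a minimal sketch of finding a center numerically, the Python snippet below computes the mean and median of a small list. The scores are hypothetical values invented for illustration; they do not come from the lesson.

```python
from statistics import mean, median

# Hypothetical test scores (percentages), invented for illustration.
scores = [55, 60, 62, 65, 68, 70, 72, 75, 85]

center_mean = mean(scores)      # arithmetic average of the scores
center_median = median(scores)  # middle value once the scores are sorted

print(center_mean, center_median)
```

For this list the mean and median agree, which is what we would expect when the data is roughly symmetric.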

Spread

The spread of a data set describes how varied or similar the data values are. The spread of a density curve can be estimated from the width of the curve along the x-axis.
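One way to put a number on spread is the range or the standard deviation. The sketch below compares two hypothetical data sets with the same center; the helper `value_range` and both lists are assumptions made for illustration.

```python
from statistics import pstdev

# Two hypothetical data sets with the same center (70) but different spreads.
narrow = [68, 69, 70, 71, 72]
wide = [50, 60, 70, 80, 90]

def value_range(data):
    """Width of the data: largest value minus smallest value."""
    return max(data) - min(data)

print(value_range(narrow), value_range(wide))  # range of each data set
print(pstdev(narrow), pstdev(wide))            # population standard deviations
```

The wider data set has both the larger range and the larger standard deviation, matching a density curve that stretches further along the x-axis.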

Shape

The shape of a data set describes how the data is distributed within the set. It can be determined by looking at the outline of the curve.

Identifying the shape of a density curve can help us understand the corresponding data set. Here are some examples of how we describe the shape of density curves:

Symmetric

The data set is distributed around the center with similar frequency on the left and right.

Left skew

The majority of the data points have higher values, with a tail of fewer data points at lower values.

Right skew

The majority of the data points have lower values, with a tail of fewer data points at higher values.

Uniform

The data set is evenly distributed across all values.
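The shapes described above can be roughly checked from raw data by comparing the mean with the median. The sketch below uses that rule of thumb; the function `describe_skew`, its `tolerance` parameter, and the sample lists are all hypothetical. This comparison is only a rough guide, and it cannot tell a uniform distribution apart from any other symmetric one.

```python
from statistics import mean, median

def describe_skew(data, tolerance=0.0):
    """Rule of thumb: a mean above the median suggests a right skew,
    a mean below the median suggests a left skew."""
    difference = mean(data) - median(data)
    if difference > tolerance:
        return "right skew"
    if difference < -tolerance:
        return "left skew"
    return "roughly symmetric"

# Hypothetical data sets, invented for illustration.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 2, 3, 3, 3, 4, 10]  # a few high values pull the mean up
left_tail = [1, 7, 8, 8, 8, 9, 9, 10]   # a few low values pull the mean down

for data in (symmetric, right_tail, left_tail):
    print(describe_skew(data))
```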


Worked examples

Example 1

The following curves show the average math test results for two different classes. Curves 1 and 2 show the results for class 1 and 2 respectively.

a

State the similarities and differences between the following pair of density curves.

Approach

To compare and contrast the two curves, we can look to the shapes, centers, and spreads of each curve.

Solution

The shape of curve 1 is skewed right and the shape of curve 2 is skewed left, so the shapes are both skewed, but in opposite directions.

The mean of curve 1 will be above the peak because of the right skew, so it will be around 65%. The mean of curve 2 will be below the peak because of the left skew, so it will be around 85%. Their centers are quite different.
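To see why a skew pulls the mean away from the peak, the sketch below compares the mean with the mode (the most common value, standing in for the peak) of two hypothetical score lists. These lists are invented for illustration and are not the data behind curves 1 and 2.

```python
from statistics import mean, mode

# Hypothetical scores, invented for illustration.
right_skewed = [55, 60, 60, 60, 65, 70, 80, 90]    # tail of high scores
left_skewed = [55, 70, 80, 90, 90, 90, 95, 100]    # tail of low scores

# In a right-skewed set, the high tail pulls the mean above the peak.
print(mode(right_skewed), mean(right_skewed))

# In a left-skewed set, the low tail pulls the mean below the peak.
print(mode(left_skewed), mean(left_skewed))
```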

The spread of curve 1 goes from about 40% to 90%, and the spread of curve 2 goes from about 55% to 100%. Therefore, the curves cover different ranges of percentages, but their spreads are about the same size.
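The comparison of the two spreads can be checked with a quick calculation. This sketch uses the approximate endpoints read from the curves above; the helper name `width` is an assumption made for illustration.

```python
# Approximate endpoints (in percent) read from each curve.
curve_1 = (40, 90)
curve_2 = (55, 100)

def width(interval):
    """Size of the spread: upper endpoint minus lower endpoint."""
    low, high = interval
    return high - low

print(width(curve_1))  # spread of curve 1 in percentage points
print(width(curve_2))  # spread of curve 2 in percentage points
```

The two widths differ by only 5 percentage points, which supports describing the spreads as about the same size.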

b

Interpret the test results of class 1 and class 2.

Approach

We can use the findings from part (a) in order to draw conclusions about the test results.

Solution

On average, class 1 has lower test results than class 2. Both classes have a fairly large spread of results, so scores vary considerably within each class.

Example 2

Sketch a density curve that matches the following description:

  • Is skewed right
  • Has a secondary small peak in the middle of the range
  • Has a large spread

Approach

We need to ensure that we meet each of the provided criteria. It may be helpful to sketch each criterion in the order in which it is presented.

Solution

Here is a possible answer:

Reflection

There are many possible solutions to this problem. As long as the 3 criteria are met, then we have a valid solution.

The question asks for a density curve, but other data displays such as stem plots, dot plots, or histograms could also show such a data set. A box plot would not show the secondary smaller peak, but it could show the other two criteria.

Outcomes

M3.N.Q.A.1

Use units as a way to understand real-world problems.*

M3.N.Q.A.1.D

Choose an appropriate level of accuracy when reporting quantities.

M3.S.ID.A.2

Use statistics appropriate to the shape of the data distribution to compare center (mean, median, and/or mode) and spread (range, standard deviation) of two or more different data sets.*

M3.MP2

Reason abstractly and quantitatively.

M3.MP3

Construct viable arguments and critique the reasoning of others.

M3.MP6

Attend to precision.

M3.MP8

Look for and express regularity in repeated reasoning.
