9.06 Changing data values and measures of center

We have previously learned about various measures of center:

  • mean - also called average, is the sum of values divided by the number of values.
  • median - is the middle value when the values are sorted.
  • mode - the value that occurs most often.

Exploration

Slide the blue data point left and right to observe how the mean, median, and mode change.

Click the 'Remove additional data' checkbox to remove the blue data point or to add it back.

Click the 'New data set' button to test your observations with multiple sets of data.

  1. What happens to the measures of center when the blue point is removed?
  2. What happens to the measures of center when the blue point is added back?
  3. What happens to the measures of center when the blue point is changed?
  4. Repeat with new data sets to see how the measures of center change as the blue point is added, removed, or changed. Do your previous observations continue to be true?

Recall we can calculate the mean by finding the 'average' of the data set: \text{Mean}=\dfrac{\text{sum of values}}{\text{number of values}}

Every data value in the set is part of the sum, so adding, removing, or changing a value can change the numerator significantly, depending on what that value is. The denominator, meanwhile, only increases or decreases by 1 (or stays the same if we have only changed an existing value). This is why the mean is so easily affected by changing the data.

To find the median, we list all the numbers in order from smallest to largest and find the middle value. Adding, removing, or changing a value in a data set can often change its median. However, this change is usually small: because the data values are ordered numerically, changing a single value only shifts the middle position to a nearby value in the ordered list.

The mode is the value with the highest frequency (the one that appears most often). When we add, remove, or change a value in a data set, it may affect the mode by causing a new number to become the mode, or the mode may remain the same.
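
As a minimal sketch of the three effects described above, the following Python snippet uses the standard library's `statistics` module on a small data set. The values 2, 3, 3, 4, 5 and the added value 25 are made up for illustration and do not come from the lesson, and the helper name `measures_of_center` is hypothetical.

```python
from statistics import mean, median, multimode


def measures_of_center(values):
    """Return (mean, median, list of modes) for a list of numbers."""
    return mean(values), median(values), sorted(multimode(values))


# Illustrative data only (not from the lesson).
data = [2, 3, 3, 4, 5]
before = measures_of_center(data)

# Add one value that is far away from the rest of the data.
after = measures_of_center(data + [25])

print("before:", before)
print("after: ", after)
```

In this sketch the mean moves from 3.4 to 7, the median only moves from 3 to 3.5, and the mode stays at 3.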

Examples

Example 1

Consider the data:

Scores: \{39,\,39,\,39,\,39,\,39,\,39,\,40,\,40,\,40,\,40,\,41,\,41,\,42,\,42,\,43,\,43,\,43,\,43\}

a

Find the total number of scores.

Worked Solution
Create a strategy

Count the scores by adding the frequency of each score.

Apply the idea
\begin{aligned}
\text{Total number of scores} &= 6 + 4 + 2 + 2 + 4 && \text{Add the frequencies of each score} \\
&= 18 && \text{Evaluate the addition}
\end{aligned}

There are 18 total scores.

b

Find the sum of the scores.

Worked Solution
Create a strategy

Add all the scores together.

Apply the idea
\begin{aligned}
\text{Sum of the scores} &= 39\cdot 6+40\cdot 4+41\cdot 2+42\cdot 2+43\cdot 4 && \text{Multiply each score by its frequency} \\
&= 234 + 160 + 82 + 84 + 172 && \text{Evaluate the multiplication} \\
&= 732 && \text{Evaluate the addition}
\end{aligned}

The sum of all the scores is 732.

c

Find the mean, median, and mode of the scores, correct to two decimal places.

Worked Solution
Create a strategy

For the mean, we can use the formula: \text{Mean}=\dfrac{\text{Sum of scores}}{\text{Number of scores}}

The mode is the most repeated score.

The median is the middle score.

Apply the idea
\begin{aligned}
\text{Mean} &= \dfrac{732}{18} && \text{Divide the sum of the scores by the total number of scores} \\
&= 40.67 && \text{Evaluate to two decimal places}
\end{aligned}

Mode:

The mode is the score with the highest frequency which is 39.

Median:

There are 18 scores, so the median is the average of the 9th and 10th scores. The 9th and 10th scores are both 40, so the median is 40.

d

A new score of 10 is added. Find the new mean, median, and mode of the scores, correct to two decimal places.

Worked Solution
Create a strategy

For the mean, we can use the formula: \text{Mean}=\dfrac{\text{Sum of scores}}{\text{Number of scores}}

The mode is the most repeated score.

The median is the middle score.

Apply the idea
\begin{aligned}
\text{Mean} &= \dfrac{732 + 10}{18 + 1} && \text{Add the new score to the sum and increase the number of scores by 1} \\
&= \dfrac{742}{19} && \text{Divide the sum of the scores by the total number of scores} \\
&= 39.05 && \text{Evaluate to two decimal places}
\end{aligned}

Mode:

The mode is the score with the highest frequency which is still 39.

Median:

There are 19 scores, so the median is the 10th score. The 10th score is 40, so the median is 40.

Reflect and check

Notice that the mode and median did not change. When a value much smaller or much larger than the rest of the data is added, it is more likely to have a greater impact on the mean, because the mean is the balance point of the data and every value contributes to it.
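
The results of Example 1 can be checked with a short Python sketch. It assumes the standard library's `statistics` module; the variable names are chosen here for illustration, and the list of scores is rebuilt from each score and its frequency.

```python
from statistics import mean, median, mode

# Example 1 data, built from each score and its frequency.
frequencies = {39: 6, 40: 4, 41: 2, 42: 2, 43: 4}
scores = [score for score, count in frequencies.items() for _ in range(count)]

# Parts a to c: number of scores, sum, and the measures of center.
original = (round(mean(scores), 2), median(scores), mode(scores))

# Part d: a new score of 10 is added.
with_new_score = scores + [10]
updated = (
    round(mean(with_new_score), 2),
    median(with_new_score),
    mode(with_new_score),
)

print(len(scores), sum(scores))
print("original (mean, median, mode):", original)
print("updated  (mean, median, mode):", updated)
```

Only the first entry of each tuple, the mean, differs between the two results.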

Example 2

A data set consists of five numbers 11,\, 13,\, 9,\, 13,\, 9.

a

The data set has a current mean of 11. If the data set changes to 11,\, 15,\, 9,\, 13,\, 9 will the mean be higher, lower, or remain the same?

Worked Solution
Create a strategy

Identify what has changed in the data set. One of the values of 13 has been increased to 15. Think about what increasing the sum of the data will do to the mean.

Apply the idea

Remember the mean is calculated by dividing the sum of all values by the number of values in the set.

Increasing a data value will increase the sum (numerator) but the number of data values (denominator) will stay the same. When we divide a larger numerator by the same denominator, the result we get will be larger than the original.

The mean will increase.

Reflect and check
\begin{aligned}
\text{Mean} &= \dfrac{11 + 15 + 9 + 13 + 9}{5} && \text{Divide the sum of the values by the total number of values} \\
&= \dfrac{57}{5} && \text{Evaluate the addition} \\
&= 11.4 && \text{Evaluate the division}
\end{aligned}

The mean will be higher because the balance point is pulled up towards the changed, larger data value.

b

The data set has a current median of 11. If a new number is added that is larger than 13, will the median be higher, lower, or remain the same?

Worked Solution
Create a strategy

The median is the middle value in a data set. So adding a value could affect the location of the middle of the data set.

Apply the idea

If we arrange the original data set in ascending order, the data set looks like this: 9,\,9,\,11,\,13,\,13.

The median of this ordered set, which is the middle value when all the numbers are listed from smallest to largest, is 11.

If we add a number larger than 13, it is added to the right side of the ordered list of data and would look like: 9,\,9,\,11,\,13,\,13,\,⬚.

We can see the new median will fall between 11 and 13.

Without calculating the value of the median we can say that it will increase.

Reflect and check

To find the middle, organize the data from smallest to largest. 9,\,9,\,11,\,13,\,13,\,⬚

There are now 6 data values, so the middle falls between the 3rd and 4th values. To find the middle, average 11 and 13.

\begin{aligned}
\text{Average} &= \dfrac{11+13}{2} && \text{Add the two middle values and divide by 2} \\
&= \dfrac{24}{2} && \text{Evaluate the addition} \\
&= 12 && \text{Evaluate the division}
\end{aligned}

c

The current data set has two modes of 9 and 13. If the data set changes to 11,\, 9,\, 13,\, 9 will the modes remain the same?

Worked Solution
Create a strategy

Remember the mode represents the data value occurring with the highest frequency.

Apply the idea

In the original data set, the values of 9 and 13 each occurred twice. The new data set removed one of the values of 13, so it now only occurs once and is no longer a mode. This means 9 is the only mode of the new data set.
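
As a quick check of all three parts of Example 2, here is a minimal Python sketch using the standard library's `statistics` module. The value 14 in part b is an arbitrary choice of "a number larger than 13" made for this sketch, and the variable names are illustrative.

```python
from statistics import mean, median, multimode

original = [11, 13, 9, 13, 9]

# Part a: one value of 13 is increased to 15.
changed = [11, 15, 9, 13, 9]
print("mean:", mean(original), "->", mean(changed))

# Part b: a value larger than 13 is added (14 is an arbitrary choice).
added = original + [14]
print("median:", median(original), "->", median(added))

# Part c: one value of 13 is removed.
removed = [11, 9, 13, 9]
print("modes:", sorted(multimode(original)), "->", sorted(multimode(removed)))
```

`multimode` is used instead of `mode` because the original data set has two modes.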

Example 3

25 students took an assessment. Their scores are shown below. 58,\,60,\,60,\,60,\,61,\,62,\,62,\,63,\,63,\,63,\,64,\,64,\,64,\,64,\,64,\,65,\,65,\,65,\,66,\,66,\,66,\,67,\,68,\,70,\,70

a

A teacher calculated the mean of 25 students’ scores to be 64. A student who later completed the assessment got a score of 55. Find the new mean of the class, correct to two decimal places.

Worked Solution
Create a strategy

We can use the formula: \text{Mean}=\dfrac{\text{Sum of all scores}}{\text{Total number of scores}}

Apply the idea
\begin{aligned}
\text{New mean} &= \dfrac{25 \cdot 64 + 55}{25 + 1} && \text{Substitute the values} \\
&= \dfrac{1655}{26} && \text{Evaluate the addition and multiplication} \\
&= 63.65 && \text{Evaluate to two decimal places}
\end{aligned}

The new mean of the class is 63.65 which is 0.35 lower than the original mean of 64. This makes sense because the new score of 55 being added is a value far away from the previous mean, causing the mean score to drop.

Reflect and check

Notice the sum of all the scores was not recalculated because the old mean was already calculated. The old mean can be used in calculating the new mean as long as none of those previous data values change. The new score needs to be included in the new sum and the total number of scores needs to be increased by 1 to show that there are now a total of 26 scores in the data set.
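
The shortcut described here can be sketched as a small Python function. The function name `updated_mean` and its parameter names are hypothetical, chosen only for this illustration.

```python
def updated_mean(old_mean, old_count, new_value):
    """Mean after adding one value, using only the old mean and count."""
    old_sum = old_mean * old_count  # recover the old sum from the old mean
    return (old_sum + new_value) / (old_count + 1)


# Example 3a: 25 scores with a mean of 64, then a new score of 55.
new_mean = updated_mean(old_mean=64, old_count=25, new_value=55)
print(round(new_mean, 2))
```

This only works when the previous values are unchanged, which is why the old mean and the old count are enough to recover the old sum.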

b

Find the median of the class before and after the final student took the assessment. Did the median change?

Worked Solution
Create a strategy

The median is the middle value of a data set when ordered from least to greatest. So adding a value could affect the location of the middle of the data set.

Apply the idea

There were originally 25 students who took the assessment, making the median the 13th score.

The median of the original data set, before the final student took the assessment, is 64.

We can write out the new list of scores with the final student's score included. 55,\,58,\,60,\,60,\,60,\,61,\,62,\,62,\,63,\,63,\,63,\,64,\,64,\,64,\,64,\,64,\,65,\,65,\,65,\,66,\,66,\,66,\,67,\,68,\,70,\,70

Adding another score to the data set makes 26 scores, so the median falls between the 13th and 14th scores.

The 13th and 14th scores are both 64, making the median 64.

In this case the median did not change because the middle value remained 64.

c

Find the mode of the class before and after the final student took the assessment. Did the mode change?

Worked Solution
Create a strategy

The mode represents the data value with the highest frequency. So we will need to count out the common scores and find the score that occurs most frequently.

Apply the idea

The new score of 55 occurs only once, so it does not change which score has the highest frequency.

The scores of 60,\,63,\,65,\, and 66 have a frequency of 3.

The score of 64 has a frequency of 5 making it the score with the highest frequency.

Therefore, the mode of the class before and after the final student took the assessment is 64.
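
The median and mode in parts b and c can be confirmed with a short Python sketch, again assuming the standard library's `statistics` module and illustrative variable names.

```python
from statistics import median, mode

# The 25 original scores from Example 3.
scores = [58, 60, 60, 60, 61, 62, 62, 63, 63, 63, 64, 64, 64, 64, 64,
          65, 65, 65, 66, 66, 66, 67, 68, 70, 70]

# The final student's score of 55 is added.
with_final = sorted(scores + [55])

print(len(scores), median(scores), mode(scores))
print(len(with_final), median(with_final), mode(with_final))
```

Both lines report a median of 64 and a mode of 64, even though the number of scores changes from 25 to 26.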

Idea summary

Adding, removing, or changing a data point can affect the measures of center, which include the mean, median, and mode.

Mean:

  • The most sensitive to changing the data set
  • Increases if an existing value is increased, a value smaller than the mean is removed, or if a value that is larger than the mean is added to the set
  • Decreases if an existing value is decreased, if a value larger than the mean is removed, or if a value that is smaller than the mean is added to the set
  • Stays the same if a value equal to the mean is added or removed
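
The points above follow from a short calculation. As a sketch, write \bar{x} for the current mean of n values, a for a value that is added or removed, and b for the value that replaces a when a value is changed (these symbols are introduced here and are not part of the lesson):

```latex
\begin{aligned}
\text{Add } a: \quad & \dfrac{n\bar{x} + a}{n + 1} - \bar{x} = \dfrac{a - \bar{x}}{n + 1} \\
\text{Remove } a: \quad & \dfrac{n\bar{x} - a}{n - 1} - \bar{x} = \dfrac{\bar{x} - a}{n - 1} \\
\text{Change } a \text{ to } b: \quad & \dfrac{n\bar{x} - a + b}{n} - \bar{x} = \dfrac{b - a}{n}
\end{aligned}
```

Each right-hand side is the change in the mean, so its sign shows whether the mean increases, decreases, or stays the same. For instance, adding a score of 55 to 25 scores with a mean of 64 changes the mean by \dfrac{55 - 64}{26} \approx -0.35.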

The median and mode are usually not significantly impacted by adding, removing, or changing a single value in the data set.

Outcomes

6.PS.2

The student will represent the mean as a balance point and determine the effect on statistical measures when a data point is added, removed, or changed.

6.PS.2b

Determine the effect on measures of center when a single value of a data set is added, removed, or changed.
