In statistics, we usually expect data to follow some kind of trend, with most values falling within a "normal" range. This is why we look at measures of central tendency, such as the mean, median, and mode.
An outlier is a data value that is very different from the rest of the data, so it sits well above or well below the average. For example, if four people in a group of five are between 120 \text{ cm} and 130 \text{ cm} tall, whereas Jim is 165 \text{ cm} tall, Jim's height is an outlier because he is much taller than everyone else in the group.
The dot plot shows the temperature (\degree \text{C}) in a town over a period of several weeks. Identify the temperature that is an outlier.
Outliers can skew our data and change the shape of its distribution. This can be a problem, especially for small data sets, because the mean, median, and range might no longer represent the situation properly. We can counteract this by removing outliers. Removing an outlier will have the following effects:
Removing a really low outlier | Removing a really high outlier |
---|---|
The range will decrease. | The range will decrease. |
The median might increase. | The median might decrease. |
The mean will increase. | The mean will decrease. |
The mode will not change. | The mode will not change. |
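The effects in the table can be checked on a small example. The sketch below uses Python's standard `statistics` module and a made-up data set with one really low outlier, 2. The values and the names `scores`, `trimmed`, and `summary` are illustrative assumptions, not part of the lesson.

```python
import statistics

def summary(values):
    """Return the mean, median, mode and range of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
    }

# Hypothetical scores: 2 is far below every other value.
scores = [2, 40, 42, 42, 45, 47, 50]
trimmed = [x for x in scores if x != 2]  # the same scores with the outlier removed

before = summary(scores)
after = summary(trimmed)

# Print each statistic before and after removing the low outlier.
for name in before:
    print(f"{name}: {before[name]} -> {after[name]}")
```

For this data, the range falls from 48 to 10, the median rises from 42 to 43.5, the mean rises, and the mode stays at 42, which matches the left-hand column of the table.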
Consider the following set of data: 53,\,46,\,25,\,50,\,30,\,30,\,40,\,30,\,47,\,109
Find the mean, median, mode, and range.
Which data value is an outlier?
Find the mean, median, mode, and range after removing the outlier.
Let A be the original data set and B be the data set without the outlier.
Complete the table using the symbols >, <, and = to compare the statistics before and after removing the outlier.
Statistic | With outlier | Comparison | Without outlier |
---|---|---|---|
Mean: | A | ⬚ | B |
Median: | A | ⬚ | B |
Mode: | A | ⬚ | B |
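One way to check the answers to this exercise is to compute each statistic for both data sets and compare them. This is a minimal sketch using Python's standard `statistics` module. It assumes the outlier is 109, the value far above the rest, and the names `data_a`, `data_b`, `compare`, and `rows` are introduced here for illustration only.

```python
import statistics

# Set A: the original data. Set B: the same data with the assumed outlier (109) removed.
data_a = [53, 46, 25, 50, 30, 30, 40, 30, 47, 109]
data_b = [x for x in data_a if x != 109]

def compare(a, b):
    """Return the symbol that makes 'a ? b' a true statement."""
    if a > b:
        return ">"
    if a < b:
        return "<"
    return "="

measures = {
    "Mean": statistics.mean,
    "Median": statistics.median,
    "Mode": statistics.mode,
    "Range": lambda values: max(values) - min(values),
}

# For each statistic, store (value for A, comparison symbol, value for B).
rows = {}
for name, measure in measures.items():
    a, b = measure(data_a), measure(data_b)
    rows[name] = (a, compare(a, b), b)
    print(f"{name}: {a} {compare(a, b)} {b}")
```

With 109 removed, the mean drops from 46 to 39, the median from 43 to 40, and the range from 84 to 28, while the mode stays at 30. Because 109 is a really high outlier, these are the decreases we would expect.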
Removing outliers will have the following effects on the summary statistics:
Removing a really low outlier | Removing a really high outlier |
---|---|
The range will decrease | The range will decrease |
The median might increase | The median might decrease |
The mean will increase | The mean will decrease |
The mode will not change | The mode will not change |