A Pareto chart is used to identify the most significant factors in a set of categorical data. The chart combines a column graph and a line graph, and has two vertical axes, one for each graph type.
Below is a Pareto chart showing some of the common reasons for failing a driving test.
We can see from the chart above that the most common reasons for failing a driving test are:
If we draw a line from the $80\%$ mark on the right vertical axis to the line graph, and then continue that line down to the horizontal axis, the most important factors appear on the left of the line. In this case, it is the first two factors (represented by the first two points on the line graph) that contribute to the majority of driving test failures. One could argue that the third factor, 'inappropriate speed', is also significant, due to its closeness to the $80\%$ mark.
Pareto charts are based on the Pareto principle, which says that around $80\%$ of problems in a process tend to come from only $20\%$ of factors. While these percentages are only a guide, they are common enough to be called the $80$/$20$ rule.
Process improvement teams often use Pareto charts to determine which factors in a process are causing the most problems, so they can focus their efforts on those. This is an important part of quality control, often used to improve customer service or reduce the number of defects in a product.
A hotel collected data on customer complaints over the course of a month and organised the data into a frequency table:
Type of complaint | Number of complaints |
---|---|
Reservation wait time | $33$ |
Room cleanliness | $19$ |
Room service time | $11$ |
Staff attitude | $9$ |
Noise level | $5$ |
Other | $3$ |
The hotel wants to display this data in a Pareto chart and present it to staff so that the most significant complaints can be addressed.
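The first step is to add two additional columns to the frequency table: one for cumulative frequency and the other for cumulative percentage. The cumulative frequency in each row is the sum of the frequency in that row and all the rows above it. For example, in the second row, the cumulative frequency is $33+19=52$.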
The cumulative percentage is found by dividing each cumulative frequency by the total number of complaints, then multiplying by $100$. For example, in the first row, the cumulative percentage is $\frac{33}{80}\times100=41.3\%$ to $1$ decimal place.
Type of complaint | Number of complaints | Cumulative frequency | Cumulative percentage |
---|---|---|---|
Reservation wait time | $33$ | $33$ | $41.3\%$ |
Room cleanliness | $19$ | $52$ | $65.0\%$ |
Room service time | $11$ | $63$ | $78.8\%$ |
Staff attitude | $9$ | $72$ | $90.0\%$ |
Noise level | $5$ | $77$ | $96.3\%$ |
Other | $3$ | $80$ | $100.0\%$ |
TOTAL | $80$ |
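The two extra columns can also be computed programmatically. The following is a minimal Python sketch, not part of the hotel's original working; the function name `pareto_table` and the list `complaints` are illustrative names chosen here, and the counts are copied from the frequency table above.

```python
from itertools import accumulate

# Complaint counts from the hotel's frequency table, already sorted
# from most to least frequent (as a Pareto chart requires).
complaints = [
    ("Reservation wait time", 33),
    ("Room cleanliness", 19),
    ("Room service time", 11),
    ("Staff attitude", 9),
    ("Noise level", 5),
    ("Other", 3),
]


def pareto_table(rows):
    """Return (label, count, cumulative frequency, cumulative %) tuples.

    Hypothetical helper for illustration only.
    """
    total = sum(count for _, count in rows)
    running = accumulate(count for _, count in rows)  # running totals
    return [
        (label, count, cum, 100 * cum / total)
        for (label, count), cum in zip(rows, running)
    ]


for label, count, cum, pct in pareto_table(complaints):
    print(f"{label:<22} {count:>3} {cum:>3} {pct:7.2f}%")
```

The sketch prints the cumulative percentage to two decimal places, so values such as 41.25 appear unrounded; the table above rounds them to one decimal place.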
The Pareto chart is named after Vilfredo Pareto (1848-1923), an Italian engineer, economist and political scientist. He came up with the Pareto principle (or $80$/$20$ rule) after observing that $80\%$ of the wealth and land in Italy was owned by $20\%$ of the population. His $80$/$20$ rule holds approximately in many other situations. For example:
The following table was used in a vehicle service centre to determine the main causes of engine overheating.
A Pareto chart is to be constructed from this information.
Cause | Frequency | Cumulative frequency | Cumulative percentage (%) |
---|---|---|---|
Damaged radiator core | $31$ | $31$ | $44$ |
Faulty fans | $20$ | $51$ | $72$ |
Faulty thermostat | $8$ | $59$ | $83$ |
Loose fan belt | $5$ | $64$ | $90$ |
Damaged radiator fins | $4$ | $68$ | $96$ |
Coolant leakage | $3$ | $71$ | $100$ |
Total | $71$ |
Which column in the table is used to create the vertical bars in the column graph?
Frequency
Cumulative frequency
Cumulative percentage
Which column in the table is used to create the line graph?
Frequency
Cumulative frequency
Cumulative percentage
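Returning to the overheating table, the $80\%$ cut-off described earlier (a line drawn across from the right axis to the line graph, then down to the horizontal axis) can be imitated in code. This is a rough Python sketch under stated assumptions: the helper name `factors_left_of_cutoff` is hypothetical, and a factor is treated as lying to the left of the line when its cumulative percentage is at or below the cut-off.

```python
from itertools import accumulate

# Causes and frequencies copied from the overheating table,
# sorted from most to least frequent.
causes = [
    ("Damaged radiator core", 31),
    ("Faulty fans", 20),
    ("Faulty thermostat", 8),
    ("Loose fan belt", 5),
    ("Damaged radiator fins", 4),
    ("Coolant leakage", 3),
]


def factors_left_of_cutoff(rows, cutoff=80.0):
    """Return labels whose cumulative percentage is at or below `cutoff`.

    Hypothetical helper; assumes `rows` is sorted in descending
    order of frequency, as in a Pareto chart.
    """
    total = sum(count for _, count in rows)
    running = accumulate(count for _, count in rows)  # running totals
    return [
        label
        for (label, _), cum in zip(rows, running)
        if 100 * cum / total <= cutoff
    ]


print(factors_left_of_cutoff(causes))
```

With the default cut-off this picks out the damaged radiator core and faulty fans. The faulty thermostat sits just past the mark at about $83\%$, so, as with the third factor in the driving test example, it could reasonably be argued to be significant as well.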
At Pareto's Burritos, the owners regularly ask their customers if and why they are not happy with their burritos.
They created a chart for last month's feedback.
How many customers in total expressed dissatisfaction last month? You can assume that the bars are in line with the labels on the left-hand $y$-axis, or exactly halfway between two labels.
Using the bar section of the Pareto chart, find the percentage of customer complaints made up by the three most frequent complaints.
Round your answer to the nearest percentage.
Pareto wants to significantly improve customer satisfaction in the next month. What single change would improve customer satisfaction the most?
Increasing the speed of service.
Lowering the price of burritos.
Adding more guacamole.
Using fresher ingredients.
What percentage of customer complaints would be resolved by reviewing how their chefs make their burritos? You can assume that the bars are in line with the labels on the left-hand $y$-axis, or exactly halfway between labels.
Round your answer to the nearest percentage.
Bill caught the train and noted what activity each person in his carriage (excluding himself) was doing between the next two stops. The Pareto chart shows the results.
How many other people were in the carriage? You can assume that each bar is either in line with a tick on the left-hand $y$-axis, or exactly halfway between ticks.
Using the bar section of the Pareto chart, find the percentage of people in the carriage (excluding Bill) who were doing one of the three most common activities. You can assume that each bar is either in line with a tick on the left-hand $y$-axis, or exactly halfway between ticks.
Round your answer to the nearest percentage.