Data Analysis

Lesson

It is important to be able to compare data sets because it helps us draw conclusions or make judgments about the data. For example, say Jim got $\frac{5}{10}$ in a geography test and $\frac{6}{10}$ in a history test. Which test did he do better in? Just based on those marks, it makes sense to say he did better in history.

But what if everyone else in his class got $\frac{4}{10}$ in geography and $\frac{8}{10}$ in history? Jim would then have the highest score in the class in geography and the lowest score in the class in history. Does it really make sense to say he did better in history?
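The comparison in Jim's example can be written out as a short calculation. This is only a sketch: the variable names are made up for illustration, and the marks are the ones quoted above.

```python
# Marks out of 10, taken from the example above.
jim = {"geography": 5, "history": 6}
classmates = {"geography": 4, "history": 8}

# A positive gap means Jim scored above his classmates; a negative gap means below.
gap = {subject: jim[subject] - classmates[subject] for subject in jim}

for subject, difference in gap.items():
    print(f"{subject}: {difference:+d} marks relative to classmates")
```

Looking at the raw marks alone favours history, but the gap to the rest of the class favours geography.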

By using the measures of central tendency of a data set (that is, the mean, median and mode), as well as measures of spread, such as the range, we can make comparisons between different groups.
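As a rough illustration, these four measures can be calculated with Python's standard `statistics` module. The list of scores below is hypothetical (it does not come from this lesson) and is only there to show the calculations.

```python
import statistics

# Hypothetical test scores out of 10, made up for this illustration.
scores = [4, 5, 6, 8, 8, 8, 10]

mean = statistics.mean(scores)            # the sum of the scores divided by how many there are
median = statistics.median(scores)        # the middle score once the scores are sorted
mode = statistics.mode(scores)            # the score that appears most often
score_range = max(scores) - min(scores)   # the highest score minus the lowest score

print(mean, median, mode, score_range)
```

Calculating the same four numbers for a second group would let us compare the two groups side by side.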

Let's look at an example of data that compares different variables.

Study the bar graph below, which shows the changes in tourism rates in different cities during $2011$ and $2012$, then answer the following questions.

A) Which city had the highest percentage of tourism in $2011$?

Think: The blue columns represent the $2011$ tourism rates.

Do: Paris has the tallest blue column ($80\%$), so Paris had the highest percentage of tourism in $2011$.

B) Which city had the lowest percentage of tourism in $2012$?

Think: The red columns represent the $2012$ tourism rates.

Do: Rome has the shortest red column (approximately $25\%$), so Rome had the lowest tourism rate in $2012$.

C) Which city had the highest percentage of tourism in a single year?

Think: The taller the column, the higher the tourism rate.

Do: New York had the highest percentage of tourism in a single year ($90\%$ in $2012$).

D) Which city/cities had the lowest percentage of tourism in a single year?

Think: The shorter the column, the lower the tourism rate.

Do: Tokyo and Rome had the lowest tourism rates (Tokyo had a rate of $25\%$ in $2011$ and Rome had the same rate in $2012$).

E) How much higher is Paris's percentage of tourism in $2011$ than that of London in $2012$?

Think: We need to find the difference between the two percentages.

Do:

In $2011$, Paris's rate was $80\%$.

In $2012$, London's rate was also $80\%$.

Paris's $2011$ rate is $0$ percentage points higher than London's $2012$ rate. In other words, they're exactly the same!

F) By how much is Paris's maximum percentage of tourism over the $2$ years higher than that of Istanbul?

Think: What are the highest rates for both cities?

Do:

Paris's highest rate was $80\%$.

Istanbul's highest rate was $60\%$.

$80-60=20$

Paris's highest tourism rate was $20$ percentage points higher than that of Istanbul.
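The subtractions in parts E and F can be checked with a few lines of code. This is a minimal sketch: `point_difference` is a helper name invented here, and the rates are the ones read off the graph in the worked answers above.

```python
# Tourism rates (in percent) quoted in the worked answers above.
paris_2011 = 80
london_2012 = 80
paris_max = 80
istanbul_max = 60

def point_difference(a, b):
    """Return how many percentage points rate a is above rate b."""
    return a - b

print(point_difference(paris_2011, london_2012))  # part E
print(point_difference(paris_max, istanbul_max))  # part F
```

Because both values are already percentages, subtracting them gives a difference in percentage points.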

We can also use data to predict, or make an educated guess about what will happen in the future. To make a prediction, we need to look at the way the data is trending.

Four friends collected some data about each other.

Name | Arm Span | Height | Month of Birth |
---|---|---|---|
Homer | $138$ cm | $141$ cm | January |
Ben | $129$ cm | $127$ cm | November |
Sally | $138$ cm | $138$ cm | February |
Sophia | $128$ cm | $133$ cm | November |

a) Construct a column graph depicting the students' arm spans.

Think: How do we convert the information from the table into a column graph?

Do:

b) Construct a column graph depicting the students' heights.

Think: This is just like the previous question, except we need to look at the height column this time.

Do:
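One rough way to sketch the graphs for parts a and b without drawing tools is to print text bars. The snippet below is only an approximation of a column graph: the bars run sideways, `text_bars` is a hypothetical helper written for this example, and one `#` stands for every $5$ cm (rounded down).

```python
# Measurements in cm, copied from the table of the four friends.
arm_span = {"Homer": 138, "Ben": 129, "Sally": 138, "Sophia": 128}
height = {"Homer": 141, "Ben": 127, "Sally": 138, "Sophia": 133}

def text_bars(data, unit=5):
    """Return one text bar per person, using one '#' for every `unit` cm (rounded down)."""
    lines = []
    for name, value in data.items():
        bar = "#" * (value // unit)
        lines.append(f"{name:<7}{bar} {value}cm")
    return lines

print("Arm span")
print("\n".join(text_bars(arm_span)))
print("Height")
print("\n".join(text_bars(height)))
```

On paper, each person would get one vertical column, with the measurement in cm on the vertical axis and the names along the horizontal axis.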

c) Between which two of the criteria is there a relationship?

A) Arm span and height B) Height and birth month C) Birth month and arm span

Think: What do our graphs show that will help us work out which two criteria are related?

Do: The arm span graph and the height graph show a similar pattern: each person's height is close to their arm span. So the answer is A) Arm span and height.

d) Which of the following statements is true?

A) Height is always greater than arm span B) Height is always less than arm span C) Height is always about the same as the arm span

Think: Which statement is true based on the relationship we found in part c?

Do: C) Height is always about the same as the arm span
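The answer to part d can be checked by subtracting each person's arm span from their height. This is a small sketch that uses the measurements from the table; the variable names are chosen only for this example.

```python
# Measurements in cm, copied from the table of the four friends.
arm_span = {"Homer": 138, "Ben": 129, "Sally": 138, "Sophia": 128}
height = {"Homer": 141, "Ben": 127, "Sally": 138, "Sophia": 133}

# Positive: height is greater than arm span. Negative: height is less than arm span.
differences = {name: height[name] - arm_span[name] for name in height}

# The biggest gap between height and arm span, ignoring the sign.
largest_gap = max(abs(d) for d in differences.values())

print(differences)
print(largest_gap)
```

Height is greater than arm span for Homer and Sophia, less for Ben, and equal for Sally, so options A and B both fail. The largest gap is only $5$ cm, which fits option C.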

e) Which of the following statements is true?

A) The later someone is born in a year, the shorter they will be. B) The month has no bearing on how tall someone is. C) If someone is born before June they will always be taller.

Think: Would the month someone was born actually affect how tall they are?

Do: The month you are born doesn't affect your height. So, we'd select B) The month has no bearing on how tall someone is.

At a zoo, the most popular attractions are the elephants and the zebras. Over $3$ months, the number of visitors (in hundreds) to each attraction was recorded.

Month | Zebras | Elephants |
---|---|---|
May | $20$ | $43$ |
June | $23$ | $38$ |
July | $25$ | $34$ |

a) Which of the following is a reasonable prediction of the number (in hundreds) of zebra visitors in August if the pattern were to continue?

A) $30$ B) $24$ C) $27$

b) Which of the following is a reasonable prediction of the number (in hundreds) of elephant visitors in August if the pattern were to continue?

A) $31$ B) $37$ C) $25$

c) Create a column graph to represent the number of zebra visitors each month.

d) Create a column graph to represent the number of elephant visitors each month.

e) If the trend continues, in what month will the two animals have roughly the same number of visitors?
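One possible way to reason about parts a, b and e is to assume that each animal's visitor numbers keep changing by their average monthly change. The sketch below works under that assumption only; the helper names are invented for this example, and a different reading of the pattern could give slightly different predictions.

```python
# Visitors in hundreds for May, June and July, copied from the table above.
zebras = [20, 23, 25]
elephants = [43, 38, 34]

def average_change(values):
    """Average month-to-month change across the recorded months."""
    return (values[-1] - values[0]) / (len(values) - 1)

def predict_next(values):
    """Assumption: next month changes by the average monthly change."""
    return values[-1] + average_change(values)

def closest_option(prediction, options):
    """Pick the answer option nearest to the prediction."""
    return min(options, key=lambda option: abs(option - prediction))

zebra_august = predict_next(zebras)
elephant_august = predict_next(elephants)
zebra_choice = closest_option(zebra_august, [30, 24, 27])
elephant_choice = closest_option(elephant_august, [31, 37, 25])

# Months after July at which the two straight-line trends would meet.
months_after_july = (elephants[-1] - zebras[-1]) / (
    average_change(zebras) - average_change(elephants)
)

print(zebra_choice, elephant_choice, round(months_after_july, 2))
```

Under this assumption, the zebra numbers rise by about $2.5$ hundred a month and the elephant numbers fall by about $4.5$ hundred a month, so the two trends would meet a little over one month after July, that is, between August and September.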