9.05 Associations in bivariate data

Lesson

Concept summary

When looking at bivariate data, it can often appear that the two variables are correlated.

Correlation

A relationship between two variables.

It is important to be able to distinguish between causal relationships (when changes in one variable cause changes in the other variable) and non-causal relationships.

Causation

A relationship between two events where one event causes the other.

To do so, we make sure to look at all aspects of an association between two variables.

Association

A way to describe the form, direction or strength of the relationship between the two variables in a bivariate data set.

For categorical data, we can describe an association as positive or negative, as well as whether the association is strong or weak (or if there is no association).

There are more ways to describe an association for numerical data, since we can measure and perform calculations on this type of data. Descriptions can include whether a relationship is linear or nonlinear, whether a relationship is positive or negative (whether one variable is increasing or decreasing compared to the other), as well as whether an association is strong or weak (or if there is no association).
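As a rough illustration of measuring direction and strength for numerical data, the sketch below computes Pearson's correlation coefficient, one common measure that this lesson does not name explicitly. The function name `pearson_r` and the data set (hours studied and test scores) are invented for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, not taken from the lesson.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 78]

r = pearson_r(hours, scores)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.2f}, direction: {direction}")
```

A value of r near 1 or -1 suggests a strong linear association, and a value near 0 suggests a weak one or none. Even a strong value describes only the association; it says nothing about whether one variable causes the other.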

Worked examples

Example 1

Determine whether the following statement is true or false:

"There is a causal relationship between the number of cigarettes a person smokes and their life expectancy."

Approach

It is generally understood that smoking cigarettes can cause diseases such as cancer, which are known to reduce life expectancy.

Solution

True

Reflection

The evidence of a causal relationship usually comes from generally accepted truths or verified research studies. A causal relationship is not confirmed by finding an association between variables.

Example 2

Mayra surveyed some people on their way into Adventure Island Water Park. She asked the participants whether they arrived in a carpool or drove alone to the park, and whether they preferred the Lazy River or Raging Rapids.

| | Lazy River | Raging Rapids | Total |
|---|---|---|---|
| Carpooled to water park | 25 | 16 | 41 |
| Drove alone to water park | 25 | 20 | 45 |
| Total | 50 | 36 | 86 |

Describe the association, if any, between how people arrived at the water park and their preferred attraction. Justify your response.

Approach

The ratio of values in each column of the table can indicate the type of association (either positive or negative) and the strength of the association (weak, strong, or no association).

Solution

Among people who prefer the Lazy River, the ratio of joint frequencies for carpooling to driving alone is 25:25 = 1:1. Among people who prefer Raging Rapids, the ratio is 16:20 = 4:5. Carpoolers therefore make up a slightly larger share of the Lazy River group than of the Raging Rapids group, indicating a weak positive association between carpooling to the water park and having a preference for the Lazy River.
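Another way to check this conclusion is to compare, within each transportation group, the proportion of people who prefer each attraction. The sketch below does this with the joint frequencies from the table. The helper name `row_proportions` is an assumption made for illustration.

```python
# Joint frequencies from Mayra's survey (rows: transportation, columns: attraction).
table = {
    "Carpooled": {"Lazy River": 25, "Raging Rapids": 16},
    "Drove alone": {"Lazy River": 25, "Raging Rapids": 20},
}

def row_proportions(table):
    """Divide each joint frequency by its row total (the marginal frequency)."""
    result = {}
    for row, counts in table.items():
        total = sum(counts.values())
        result[row] = {col: n / total for col, n in counts.items()}
    return result

props = row_proportions(table)
for row, shares in props.items():
    for col, share in shares.items():
        print(f"{row}: {share:.1%} prefer {col}")
```

About 61% of carpoolers prefer the Lazy River, compared with about 56% of people who drove alone. The gap is small, which is consistent with describing the association as weak.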

Outcomes

MA.912.AR.2.2

Write a linear two-variable equation to represent the relationship between two quantities from a graph, a written description or a table of values within a mathematical or real-world context.

MA.912.DP.1.3

Explain the difference between correlation and causation in the contexts of both numerical and categorical data.

MA.912.DP.3.1

Construct a two-way frequency table summarizing bivariate categorical data. Interpret joint and marginal frequencies and determine possible associations in terms of a real-world context.
