 11.06 Bivariate data

Lesson

Bivariate data is the name for numerical data consisting of pairs of values. We generate these pairs to find out whether there is a simple relation between the numbers in each pair.

For example, we may conduct an experiment on a group of people where each person’s bone density is measured against their age. Their age is the input quantity and this could be any value. Their bone density is the level of response that is recorded against their age.

Then each person’s age and bone density make a pair of values in the bivariate data set.

Univariate data consists of only one numerical variable. A data set that records just the heights of people, or just the number of cats people own, is univariate because there is only one variable. Even comparing the heights from two different classes gives univariate data, because it is the same variable measured for two different groups of people.

The paired values in a bivariate data set are called the independent variable and the dependent variable. In the context above, the independent variable is the person’s age and the dependent variable is their bone density. The dependent variable is the one we expect to change in response to the independent variable. We could then check whether age is a good predictor of bone density. In other words, we could determine whether bone density depends on a person’s age.

Displaying bivariate data

A single data point in a bivariate data set is written in the form $\left(x,y\right)$, with the first number $x$ being the independent variable and the second number $y$ being the dependent variable. We display bivariate data graphically by plotting the data points with the value of the independent variable on the horizontal axis and the value of the dependent variable on the vertical axis. This is known as a scatterplot.

Worked example

Example 1

Scientists want to see how quickly a plant grows under controlled conditions. They measure the height of the plant over 10 days and record the data in the table below.

Day $1$ $2$ $3$ $4$ $5$ $6$ $7$ $8$ $9$ $10$
Height (cm) $1.55$ $2.32$ $3.32$ $4.51$ $5.75$ $6.91$ $7.86$ $8.58$ $9.09$ $9.43$

Think: We are interested in what happens to the height as the number of days of growth increases. In other words, the height depends on the day. So the day is the independent variable and the height is the dependent variable.

We can write these data points as ordered pairs: $\left(1,1.55\right),\left(2,2.32\right),\dots$
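As a rough illustration of this pairing step, the short Python sketch below builds the ordered pairs from the two rows of the table. The names `days`, `heights_cm` and `data_points` are chosen here for illustration only and are not part of the lesson.

```python
# Values copied from the table in Example 1.
days = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
heights_cm = [1.55, 2.32, 3.32, 4.51, 5.75, 6.91, 7.86, 8.58, 9.09, 9.43]

# zip pairs the first day with the first height, the second day with
# the second height, and so on. Day is the independent variable, so it
# comes first in each pair.
data_points = list(zip(days, heights_cm))

for day, height in data_points:
    print(f"({day}, {height})")
```

Each printed pair has the day first and the height second, matching the $\left(x,y\right)$ form used for bivariate data points.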

Do: Writing the data points as ordered pairs doubles as writing them as coordinates on a scatterplot. To make a scatterplot we plot each of the data points on a number plane.

For example, to plot the first data point, $\left(1,1.55\right)$, we plot the point where $x=1$ and $y=1.55$. We do this for every data point and we have our finished scatterplot.

Reflect: By creating a scatterplot using the ordered pairs, we can more easily see the relationship between the number of days of growth and the height of the plant. Looking at the scatterplot, the data points move from the bottom-left to the top-right. That is, each day, the plant is taller than it was the previous day. So we know that the height of the plant increases as the number of days passed increases (at least for the days we have seen so far).
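The observation that the plant is taller each day can also be checked with a short Python sketch. It draws a very rough text version of the scatterplot and tests whether every height is greater than the one before it. The function name `text_scatterplot` and the grid size are assumptions made for this illustration, and the sketch assumes whole-number $x$ values and at least two different $y$ values.

```python
# Ordered pairs (day, height in cm) from the table in Example 1.
data_points = [
    (1, 1.55), (2, 2.32), (3, 3.32), (4, 4.51), (5, 5.75),
    (6, 6.91), (7, 7.86), (8, 8.58), (9, 9.09), (10, 9.43),
]


def text_scatterplot(points, rows=10):
    """Return a rough text scatterplot.

    x runs left to right (one character per whole-number x value)
    and y runs from the bottom line up to the top line.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * (x_max - x_min + 1) for _ in range(rows)]
    for x, y in points:
        # Scale y to a row number from 0 (lowest) to rows - 1 (highest).
        row = round((y - y_min) / (y_max - y_min) * (rows - 1))
        # The first line of text is the top of the plot, so flip the row.
        grid[rows - 1 - row][x - x_min] = "*"
    return "\n".join("".join(line).rstrip() for line in grid)


print(text_scatterplot(data_points))

# Compare each height with the one from the previous day.
heights = [y for _, y in data_points]
always_increasing = all(
    later > earlier for earlier, later in zip(heights, heights[1:])
)
print(always_increasing)
```

For this data the check prints True, which matches the bottom-left to top-right pattern seen in the scatterplot.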

Summary

Bivariate data - Data consisting of ordered pairs of two variables

Univariate data - Data with only one variable

Independent variable - A variable that is not determined by another variable

Dependent variable - A variable that is determined by some other variable

Data point - A value or ordered pair taken from a data set

Scatterplot - A visualisation of bivariate data where ordered pairs are plotted on a number plane

Practice questions

Question 1

Create a scatter plot for the set of data in the table.

$x$ $1$ $3$ $5$ $7$ $9$
$y$ $3$ $7$ $11$ $15$ $19$

Question 2

Scientists were looking for a relationship between the number of hours of sleep we receive and the effect it has on our motor and process skills. Some subjects were asked to sleep for different amounts of time, and were all asked to undergo the same driving challenge in which their reaction time was measured. The table shows the results, which are to be presented as a scatter plot.

Amount of sleep (hours) Reaction time (seconds)
$9$ $3$
$6$ $3.3$
$4$ $3.5$
$10$ $3$
$3$ $3.7$
$7$ $3.2$
$2$ $3.85$
$5$ $3.55$

1. Create a scatter plot for the observations in the table.

2. According to the results, which of the following is true of the relationship between amount of sleep and reaction time?

A. As the amount of sleep decreases, the reaction time decreases.

B. As sleeping time decreases, reaction time improves.

C. Sleeping for longer improves reaction time.

D. The amount of sleep has no effect on the reaction time.


Question 3

The market price of bananas varies throughout the year. Each month, a consumer group compared the average quantity of bananas supplied by each producer to the average market price (per unit).

Supply (kg) Price (dollars)
$550$ $15.25$
$600$ $14.75$
$650$ $14.75$
$700$ $14.75$
$750$ $14.25$
$800$ $14.00$
$850$ $13.75$
$900$ $13.25$
$950$ $13.50$
$1000$ $13.25$

1. Complete the scatter plot by adding the missing observations from the table.

2. Which best describes the relationship between the supply quantity and the market price of bananas?

A. Positive linear

B. No direct relationship

C. Negative linear

3. According to this data, when would a supplier of bananas receive a higher price per banana?

A. When very few bananas are available to be sold.

B. When the supply of bananas increases.


Outcomes

VCMSP352

Use scatter plots to investigate and comment on relationships between two numerical variables