
7.01 Bivariate data

Lesson

Bivariate data is the name for numerical data consisting of pairs of values. We generate these pairs to find out whether there is a simple relation between the numbers in each pair.

For example, we may conduct an experiment on a group of people where each person’s bone density is measured against their age. Their age is the input quantity and this could be any value. Their bone density is the level of response that is recorded against their age.

Then each person’s age and bone density make a pair of values in the bivariate data set.

The paired values in a bivariate data set are called the independent variable and the dependent variable. They may also be called the explanatory variable and the response variable. In the above context, the independent variable is the person’s age and the dependent variable is their bone density. We could then check whether age is a good predictor for bone density. In other words, we could determine whether bone density depends on a person’s age.

 

Visualising bivariate data

A single data point in a bivariate data set is written in the form $\left(x,y\right)$, with the first number $x$ being the independent variable and the second number $y$ being the dependent variable. We display bivariate data graphically by plotting the data points with the value of the independent variable on the horizontal axis and the value of the dependent variable on the vertical axis. This is known as a scatterplot.

 

Worked example

Scientists want to see how quickly a plant grows under controlled conditions. They measure the height of the plant over 10 days and record the data in the table below.

| Day | $1$ | $2$ | $3$ | $4$ | $5$ | $6$ | $7$ | $8$ | $9$ | $10$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Height (cm) | $1.55$ | $2.32$ | $3.32$ | $4.51$ | $5.75$ | $6.91$ | $7.86$ | $8.58$ | $9.09$ | $9.43$ |

 

Think: We are interested in what happens to the height as the number of days of growth increases. In other words, the height depends on the day. Alternatively, we can say that the height responds to the day, or that the day explains the height. So the day is the independent variable and the height is the dependent variable.

We can write these data points as ordered pairs: $\left(1,1.55\right),\left(2,2.32\right),\dots$
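As a small illustration, the table can be turned into ordered pairs programmatically. The sketch below uses Python; the names `days`, `heights_cm` and `data_points` are chosen here for illustration and are not part of the lesson.

```python
# Values copied from the table in the worked example.
days = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
heights_cm = [1.55, 2.32, 3.32, 4.51, 5.75, 6.91, 7.86, 8.58, 9.09, 9.43]

# Pair each day (independent variable) with its height (dependent variable).
data_points = list(zip(days, heights_cm))

for point in data_points:
    print(point)
```

Each printed pair has the day first and the height second, matching the $\left(x,y\right)$ convention described earlier.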

Do: Writing the data points as ordered pairs also gives us their coordinates on a scatterplot. To make a scatterplot we plot each of the data points on a number plane.

For example, to plot the first data point, $\left(1,1.55\right)$, we plot the point where $x=1$ and $y=1.55$.

We do this for every data point and we have our finished scatterplot.

By creating a scatterplot using the ordered pairs, we can more easily see the relationship between the number of days of growth and the height of the plant.
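In practice a plotting library or graphing tool would usually draw the scatterplot. As a rough sketch that needs only the Python standard library, the code below prints a character-grid version of the plant data. The function name `text_scatterplot` and its `width` and `height` parameters are hypothetical, and the sketch assumes the data contain at least two different $x$ values and two different $y$ values.

```python
def text_scatterplot(points, width=40, height=12):
    """Return a rough character-grid scatterplot of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale each value to a column (horizontal axis) and row (vertical axis).
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        # The first grid row is printed at the top, so flip vertically.
        grid[height - 1 - row][col] = "*"

    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


# Ordered pairs (day, height in cm) from the worked example.
data_points = [(1, 1.55), (2, 2.32), (3, 3.32), (4, 4.51), (5, 5.75),
               (6, 6.91), (7, 7.86), (8, 8.58), (9, 9.09), (10, 9.43)]

plot = text_scatterplot(data_points)
print(plot)
```

The grid is coarse, so nearby heights can land on the same row, but the overall shape of the data is still visible.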

Interpreting bivariate data

When we have bivariate data, we want to determine what sort of relationship the two variables have. Broadly speaking, there are three possibilities. 

(1) The dependent variable can increase as the independent variable increases.

(2) The dependent variable can decrease as the independent variable increases.

(3) The two variables may have some complicated relationship or no apparent relationship.

 

If the data points move from the bottom-left to the top-right of the scatterplot, then the dependent variable is increasing as the independent variable increases. If the data points move from the top-left to the bottom-right, then the dependent variable is decreasing as the independent variable increases. 
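One possible way to check the overall direction numerically, rather than by eye, is to look at the sign of the sum of products of deviations from the means (the numerator of the covariance). This is a sketch and not a method given in the lesson: the function name `direction` is hypothetical, and a result of zero or near zero does not distinguish a complicated relationship from no relationship.

```python
def direction(points):
    """Classify the overall trend of (x, y) pairs by the sign of their co-variation."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Each product is positive when x and y are both above or both below their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if co_variation > 0:
        return "increasing"
    if co_variation < 0:
        return "decreasing"
    return "no overall trend"


# Ordered pairs (day, height in cm) from the plant worked example.
plant = [(1, 1.55), (2, 2.32), (3, 3.32), (4, 4.51), (5, 5.75),
         (6, 6.91), (7, 7.86), (8, 8.58), (9, 9.09), (10, 9.43)]

print(direction(plant))  # increasing
```

A scatterplot remains the better first check, because a single number cannot show the shape of the data.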


 

Causal relationships

Even when two variables have a relationship, it may not be a causal relationship. We cannot say for sure that a change in the value of $x$ causes $y$ to change or that the value of $y$ causes a corresponding value of $x$ even when a relation is apparent. It may be that both $x$ and $y$ have a relationship with some other hidden variable, which creates an indirect relationship between $x$ and $y$.

Summary

Bivariate data - Data consisting of ordered pairs of two variables

Independent variable - A variable which is not determined by another variable. Also called an explanatory variable

Dependent variable - A variable which is determined by some other variable. Also called a response variable

Data point - A value or ordered pair taken from a data set

Scatterplot - A visualisation of bivariate data where ordered pairs are plotted on a number plane

 

Worked example

Using the scatterplot of the height of a plant over time, describe the relationship between the height of the plant and the number of days which have passed.

Think: Looking at the scatterplot, the data points move from the bottom-left to the top-right. That is, each day, the plant is taller than it was on the previous day.

Do: The height of the plant increases as the number of days passed increases.
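The claim that the plant is taller on each successive day can also be checked directly against the table. The following is a minimal Python sketch; the variable names are chosen here for illustration.

```python
# Heights in cm for days 1 to 10, copied from the table in the first worked example.
heights_cm = [1.55, 2.32, 3.32, 4.51, 5.75, 6.91, 7.86, 8.58, 9.09, 9.43]

# Compare each day's height with the height on the previous day.
grows_every_day = all(
    later > earlier for earlier, later in zip(heights_cm, heights_cm[1:])
)

print(grows_every_day)  # True
```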

 

 

Practice questions

Question 1

Ten runners in a park record the distance they run and the amount of time they spend running.

Create a scatterplot with the data below.

| Time (min) | $28$ | $11$ | $37$ | $26$ | $43$ | $37$ | $41$ | $47$ | $19$ | $43$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Distance (km) | $9.3$ | $3.6$ | $10.8$ | $6.6$ | $13.8$ | $12$ | $11.7$ | $15$ | $6.3$ | $13.2$ |

Question 2

A cafe records the number of soups sold and the daily maximum temperature.

Describe the relationship between temperature and soups sold in the data.

[Scatterplot of soups sold against daily maximum temperature]

A. As temperature increases, soups sold decreases.

B. As temperature increases, soups sold increases.

C. Temperature has no effect on soups sold.

Question 3

Soccer players measure the distance they can kick a ball at different angles and record the measurements in the table below.

| Angle (°) | $44$ | $24$ | $25$ | $69$ | $76$ | $66$ | $65$ | $26$ | $14$ | $37$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Distance (m) | $59$ | $33$ | $44$ | $34$ | $26$ | $40$ | $42$ | $45$ | $15$ | $45$ |

  1. Which of the variables is independent and which is dependent?

    A. The independent variable is the distance and the dependent variable is the angle.

    B. The independent variable is the angle and the dependent variable is the distance.

  2. Create a scatterplot with the data.

  3. What best describes the relationship in the data?

    A. As angle increases, distance increases.

    B. As angle increases, distance decreases.

    C. Angle has no effect on distance.

 

Outcomes

MS2-12-2

analyses representations of data in order to make inferences, predictions and draw conclusions

MS2-12-7

solves problems requiring statistical processes, including the use of the normal distribution and the correlation of bivariate data
