
7.02 Scatterplots

Lesson

When given bivariate data as a table of values, a scatterplot can be created to graph the data, where the explanatory variable is shown on the horizontal axis and the response variable is shown on the vertical axis. In this way, each data point is displayed as a point in a two-dimensional coordinate system.

Graphs of scatterplots and linear graphs were previously covered in Chapter 9. Select the brand of calculator you use below to work through an example of using a calculator to create a scatterplot.

Casio Classpad

How to use the CASIO Classpad to generate a scatterplot for a set of data.

The average number of pages read to a child each day and the child’s growing vocabulary are measured. Consider the data set given below:

Pages read per day ($x$) | $25$ | $27$ | $29$ | $3$ | $13$ | $31$ | $18$ | $29$ | $29$ | $5$
Total vocabulary ($y$) | $402$ | $440$ | $467$ | $76$ | $220$ | $487$ | $295$ | $457$ | $460$ | $106$
  1. Use your calculator to generate a scatterplot of the data.

TI Nspire

How to use the TI Nspire to generate a scatterplot for a set of data.

The average number of pages read to a child each day and the child’s growing vocabulary are measured. Consider the data set given below:

Pages read per day ($x$) | $25$ | $27$ | $29$ | $3$ | $13$ | $31$ | $18$ | $29$ | $29$ | $5$
Total vocabulary ($y$) | $402$ | $440$ | $467$ | $76$ | $220$ | $487$ | $295$ | $457$ | $460$ | $106$
  1. Use your calculator to generate a scatterplot of the data.
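
For readers without either calculator to hand, the same idea can be sketched in code. The following is a minimal illustration only, not a replacement for the calculator steps: it pairs each $x$ value with its $y$ value and draws a coarse text scatterplot, with the explanatory variable running left to right and the response variable running bottom to top. The function name `text_scatterplot` and the grid size are assumptions made for this sketch, and it assumes the $x$ values (and the $y$ values) are not all identical.

```python
# Data from the lesson: pages read per day (explanatory) and total vocabulary (response).
pages = [25, 27, 29, 3, 13, 31, 18, 29, 29, 5]
vocab = [402, 440, 467, 76, 220, 487, 295, 457, 460, 106]

def text_scatterplot(xs, ys, width=32, height=12):
    """Return a coarse text scatterplot (hypothetical helper for this sketch)."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column or row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 of the grid is the top of the plot
    lines = ["|" + "".join(cells) for cells in grid]
    lines.append("+" + "-" * width)  # horizontal axis
    return "\n".join(lines)

points = list(zip(pages, vocab))  # each pair is one (x, y) data point
plot = text_scatterplot(pages, vocab)
print(plot)
```

The smallest values land in the bottom-left corner and the largest in the top-right, which is the same layout the calculator produces.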

 

Practice question

Question 1

Create a scatter plot for the set of data in the table.

$x$ | $1$ | $3$ | $5$ | $7$ | $9$
$y$ | $3$ | $7$ | $11$ | $15$ | $19$

 

An association between two variables is known as a correlation. A correlation may (or may not) signify a causal relationship between the two variables. To identify any correlation between the two variables, there are three things to focus on when analysing a scatterplot:
  • Direction
  • Form
  • Strength

 

Direction

The direction of the scatterplot refers to the pattern shown by the data points. The direction of the pattern can be described as having positive correlation, negative correlation or no correlation:

  • Positive correlation
    • A positive correlation occurs when the response variable (RV) increases as the explanatory variable (EV) increases.
    • From a graphical perspective this occurs when the $y$-coordinate increases as the $x$-coordinate increases, which is similar to a line with a positive gradient.
  • Negative correlation
    • A negative correlation occurs when the RV decreases as the EV increases.
    • From a graphical perspective this occurs when the $y$-coordinate decreases as the $x$-coordinate increases, which is similar to a line with a negative gradient.
  • No correlation
    • No correlation describes a data set which has no relationship between the variables.
    • This can come in the form of totally unrelated data, or data that indicates no change of RV as the EV changes (like a horizontal straight line, which has zero gradient).
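
The three directions above can be illustrated numerically. The sketch below is one possible check, not a method taken from this lesson: it adds up the products of each point's deviations from the mean $x$ and mean $y$, and reads the direction from the sign of that total. The function name `direction` is hypothetical, and the second and third data sets are made-up values used only for illustration. With real data the total is rarely exactly zero, so a small total should be read as "little or no correlation" rather than tested for equality.

```python
def direction(xs, ys):
    """Classify direction from the sign of the summed products of deviations (illustrative helper)."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    total = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"  # y tends to increase as x increases
    if total < 0:
        return "negative"  # y tends to decrease as x increases
    return "none"          # no overall change in y as x changes

# Reading data from the calculator example in this lesson.
pages = [25, 27, 29, 3, 13, 31, 18, 29, 29, 5]
vocab = [402, 440, 467, 76, 220, 487, 295, 457, 460, 106]
print(direction(pages, vocab))                  # positive

# Made-up illustrative values (not from the lesson).
print(direction([1, 2, 3, 4], [8, 6, 4, 2]))    # negative
print(direction([1, 2, 3, 4], [5, 5, 5, 5]))    # none
```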

 

Form

The form of a scatterplot refers to the type of relationship the two variables may appear to share. For example, if the data points lie on or close to a straight line, the scatterplot has a linear form.

Forms other than a line may be apparent in a scatterplot. If the data points lie on or close to a curve, it may be appropriate to infer a non-linear form between the variables.
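
As a rough illustration of linear form, the sketch below computes the gradient between each pair of neighbouring points for the data in practice Question 1. If every gradient is the same, the points lie exactly on a straight line. This is only a sketch under stated assumptions: the function name `consecutive_gradients` is hypothetical, the $x$ values must all be different, and real data that is merely close to a line will give gradients that vary.

```python
# Data from practice Question 1.
xs = [1, 3, 5, 7, 9]
ys = [3, 7, 11, 15, 19]

def consecutive_gradients(xs, ys):
    """Gradient (rise over run) between neighbouring points, ordered by x (illustrative helper)."""
    points = sorted(zip(xs, ys))
    return [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]

gradients = consecutive_gradients(xs, ys)
print(gradients)  # [2.0, 2.0, 2.0, 2.0]

# Identical gradients mean the points lie exactly on one straight line.
exactly_linear = len(set(gradients)) == 1
print(exactly_linear)  # True
```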

 

Strength

The strength of a linear correlation relates to how closely the points resemble a straight line.

  • If the points lie exactly on a straight line, then there is a perfect correlation.
  • If the points are scattered randomly, then there is no correlation.

Most scatterplots will fall somewhere in between these two extremes, and will display a weak, moderate or strong correlation.

The correlation coefficient (also known as the $r$ value) measures the strength of a linear correlation. This calculation will be discussed in the next lesson in this chapter.
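
As a brief preview, the sketch below computes an $r$ value using the standard Pearson formula, $r = \dfrac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2 \sum (y-\bar{y})^2}}$. It is an assumption here that this is the same $r$ the next lesson introduces, and the function name `pearson_r` is hypothetical. A value at or near $1$ or $-1$ indicates points lying on or close to a straight line.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from deviations about the means (illustrative helper)."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Practice Question 1 data: the points lie exactly on a straight line.
r_line = pearson_r([1, 3, 5, 7, 9], [3, 7, 11, 15, 19])

# Reading data from the calculator example: close to, but not exactly on, a line.
pages = [25, 27, 29, 3, 13, 31, 18, 29, 29, 5]
vocab = [402, 440, 467, 76, 220, 487, 295, 457, 460, 106]
r_reading = pearson_r(pages, vocab)

print(round(r_line, 3), round(r_reading, 3))
```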

 

Worked examples

Example 1

Identify the type of correlation in the following scatter plot.

Think: If we draw a straight line through the points, we will be able to look at the gradient of the line and how closely it fits the points. Here is a line that approximates the trend of the data:

Do: The line that we drew to approximate the data has a gradient of around $+1$, so this is a positive correlation. The line fits quite closely to all of the points, so it is a strong correlation. In summary, we would say that this scatterplot indicates a strong, positive correlation.

Example 2

Describe the correlation between the two variables: eye colour and IQ.

Think: Does a person's eye colour have anything to do with their IQ?

Do: There is no reason to expect a person's eye colour to affect their IQ, so eye colour and IQ are an example of a pair of variables that have no correlation.

 

Practice questions

Question 2

The scatter plot shows the relationship between sea temperatures and the amount of healthy coral.

  1. Describe the correlation between sea temperature and the amount of healthy coral.

    Select all descriptions that apply.

    A. Negative
    B. Strong
    C. Positive
    D. Weak
  2. Which variable is the response variable?

    A. Sea temperature
    B. Level of healthy coral
  3. Which variable is the explanatory variable?

    A. Level of healthy coral
    B. Sea temperature

Question 3

The following table shows the number of traffic accidents associated with a sample of drivers of different age groups.

Age | Accidents
$20$ | $41$
$25$ | $44$
$30$ | $39$
$35$ | $34$
$40$ | $30$
$45$ | $25$
$50$ | $22$
$55$ | $18$
$60$ | $19$
$65$ | $17$
  1. Which of the following scatter plots correctly represents the above data?

    A

    B

    C
  2. Is the correlation between a person's age and the number of accidents they are involved in positive or negative?

    A. Positive
    B. Negative
  3. Is the correlation between a person's age and the number of accidents they are involved in strong or weak?

    A. Strong
    B. Weak
  4. Which age group's data represent an outlier?

    A. 30-year-olds
    B. None of them
    C. 65-year-olds
    D. 20-year-olds

Question 4

Consider the table of values showing four excerpts from a database that compares each country's income per capita with its child mortality rate. If a scatter plot were created from the entire database, what relationship would you expect it to show?

Income per capita | Child mortality rate
$1465$ | $67$
$11428$ | $16$
$2621$ | $35$
$32468$ | $9$
    A. Strongly positive
    B. No relationship
    C. Strongly negative

Outcomes

U2.AoS1.2

scatterplots and their use in identifying and describing the association between two numerical variables

U2.AoS1.5

use a scatterplot to describe an observed association between two numerical variables in terms of strength, direction and form
