9.02 Classifying data

Lesson

Concept summary

Data can be classified in a variety of ways.

Data that can be measured, such as the heights of a group of people, or counted, such as the number of siblings, is numerical data. Many calculations can be performed on numerical data, such as finding an average value, or a minimum or maximum value.

Data which is divided into groups (or categories) rather than measured or counted, such as the type of pet(s) owned by a group of people, is categorical data.

Numerical data

Data which can be measured or counted using numbers.

Categorical data

Data which is divided into groups.
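
The difference between the two types can be sketched in a few lines of Python. This is only an illustration: the sample values and the helper names `summarize_numerical` and `summarize_categorical` are made up for this sketch and are not part of the lesson. Numerical data supports calculations such as an average, minimum, or maximum, while categorical data can only be sorted into groups and tallied.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample data, invented for illustration only.
heights_cm = [150, 160, 170, 180]       # numerical: each value is measured
pets = ["dog", "cat", "dog", "fish"]    # categorical: each value names a group

def summarize_numerical(values):
    """Return the average, minimum, and maximum of numerical data."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}

def summarize_categorical(values):
    """Return how many times each category appears."""
    return dict(Counter(values))

numerical_summary = summarize_numerical(heights_cm)
categorical_summary = summarize_categorical(pets)

print(numerical_summary)
print(categorical_summary)
```

An average of the pet types would have no meaning, which is why the categorical summary only counts how often each group occurs.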

Data can also be classified by how many characteristics are measured or counted.

If one characteristic is being measured, such as people's heights, the data is univariate. If two characteristics are being compared, such as people's heights vs. their weights, the data is bivariate.

Univariate data

Data that measures only one characteristic of a population.

Bivariate data

Data that measures two characteristics of a population.
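
One way to picture the distinction is to look at how many values are recorded for each member of the group. The sketch below assumes a simple layout that is not from the lesson: univariate data is stored as a list of single values, and bivariate data as a list of pairs. The sample values and the function name `classify_by_variables` are hypothetical.

```python
# Hypothetical sample data, invented for illustration only.
heights_cm = [150, 160, 170, 180]                              # one value per person
height_weight = [(150, 50), (160, 58), (170, 66), (180, 75)]   # two values per person

def count_characteristics(records):
    """Return how many characteristics are recorded for each member."""
    first = records[0]
    return len(first) if isinstance(first, tuple) else 1

def classify_by_variables(records):
    """Label the data by the number of characteristics it records."""
    n = count_characteristics(records)
    if n == 1:
        return "univariate"
    if n == 2:
        return "bivariate"
    return "more than two characteristics"

print(classify_by_variables(heights_cm))
print(classify_by_variables(height_weight))
```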

A data distribution is a graphical or organized display of all data in the data set.

Worked examples

Example 1

Determine if the data distribution is univariate or bivariate.

a

Data describing the number of years an actor has been working for and their annual earnings.

Approach

To determine whether a data distribution is univariate or bivariate, consider whether it describes one or two characteristics of the group it is taken from.

Solution

This data measures two characteristics: the number of years the actor has worked and their annual earnings.

Therefore, this is an example of bivariate data.

b

The shoe sizes of an entire football team.

Approach

To determine whether a data distribution is univariate or bivariate, consider whether it describes one or two characteristics of the group it is taken from.

Solution

This data measures only one characteristic: shoe size.

Therefore, this is an example of univariate data.

Example 2

Classify the data set as categorical or numerical.

a

The time, in minutes, that 100 randomly selected employees spend driving to work.

Solution

Time spent in minutes can take on any numerical value greater than zero. In particular, we can measure the time taken and perform calculations on the data.

Therefore, this is an example of numerical data.

b

The eye color classification of 100 students.

Solution

Eye color classification has qualitative values (blue, brown, green, hazel, black) that we can't measure or perform calculations on.

Therefore, this is an example of categorical data.

c

Number of questions in a test.

Solution

The number of test questions takes on whole number values, such as 5 or 20, so we can meaningfully count the questions and perform calculations on the data.

Therefore, this is an example of numerical data.
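
The reasoning in this example can be roughly mirrored in Python by checking whether every value is a number. This is a simplified sketch under stated assumptions, not a definitive rule: the sample values and the function name `classify_by_type` are hypothetical, and the check would mislabel numbers that are used only as labels, so the context of the data still has to be considered.

```python
# Hypothetical sample data, invented for illustration only.
drive_minutes = [12.5, 30, 45.2]             # measured times
eye_colors = ["blue", "brown", "green"]      # named groups
question_counts = [5, 20]                    # counted whole numbers

def classify_by_type(values):
    """Return 'numerical' if every value is a number, else 'categorical'.

    Rough heuristic only: it looks at how the values are stored,
    not at what they mean.
    """
    def is_number(value):
        # bool is excluded because True/False are labels, not quantities.
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    return "numerical" if all(is_number(v) for v in values) else "categorical"

print(classify_by_type(drive_minutes))
print(classify_by_type(eye_colors))
print(classify_by_type(question_counts))
```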

Outcomes

MA.912.DP.1.2

Interpret data distributions represented in various ways. State whether the data is numerical or categorical, whether it is univariate or bivariate and interpret the different components and quantities in the display.
