1.01 Types of data

Lesson

Univariate versus bivariate statistics

In Unit 2 we studied statistics with univariate data. 'Uni' means one (think unicycle), so if we want to observe and analyse changes in a single variable, this is univariate statistics. For example, comparing two students' test results by comparing the mean and standard deviation of each set is working with a single variable (test results), so it is univariate.

In Unit 3 our focus will be statistics with bivariate data. 'Bi' means two (think bicycle), so if we are interested in comparing or finding an association between sets of data for two different variables, this is bivariate statistics. For example, looking at the association between litres of soft drink consumed per week and BMI for a set of people is working with bivariate data, as there are two variables (litres and BMI).
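The difference between the two can also be seen in how the data is stored. The short Python sketch below is only an illustration: every number in it is made up, and the variable names are assumptions chosen for this example. The univariate part holds one variable (test results) for each of two students, while the bivariate part keeps two variables together as pairs, one pair per person.

```python
from statistics import mean, stdev

# All values below are hypothetical, invented purely for illustration.

# Univariate: a single variable (test results), recorded for two students.
student_a = [72, 85, 90, 68, 77]
student_b = [60, 95, 88, 70, 82]

print("Student A mean and standard deviation:", mean(student_a), stdev(student_a))
print("Student B mean and standard deviation:", mean(student_b), stdev(student_b))

# Bivariate: two variables recorded for the same person, kept as pairs.
# Each pair is (litres of soft drink per week, BMI).
people = [(0.5, 21.0), (2.0, 24.5), (3.5, 27.0), (1.0, 22.5)]

# Splitting the pairs gives one list per variable, in matching order.
litres = [person[0] for person in people]
bmi = [person[1] for person in people]

print("Litres per week:", litres)
print("BMI:", bmi)
```

Notice that the bivariate data only makes sense while the pairs stay matched: the first litres value and the first BMI value belong to the same person.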

Examples

Example 1

A scientist collects data on iron levels in soil and growth of a type of weed in order to investigate the relationship between them.

Is this an example of univariate data or bivariate data?

A
Univariate data
B
Bivariate data
Worked Solution
Create a strategy

Count the number of variables that are being studied.

Apply the idea

The scientist collects two sets of data in this example: (1) the iron levels in soil and (2) the growth of a type of weed, and wants to investigate the relationship between them. Therefore, this is an example of bivariate data, option B.

Idea summary

Bivariate statistics is when we are interested in comparing or finding an association between sets of data for two different variables.

Types of data

In statistics, a 'variable' refers to a characteristic of data that is measurable or observable. A variable could be something like temperature, mass, height, make of car, type of animal or goals scored.

Data variables can be defined as either numerical or categorical.

  • Numerical data is where each data point is represented by a number. Examples include: number of items sold each month, daily temperatures, heights of people, and ages of a population. The data can be further defined as either discrete (associated with counting) or continuous (associated with measuring). Numerical data is also known as quantitative data.

  • Categorical data is where each data point is represented by a word or label. Examples include: brand names, types of animals, favourite colours, and names of countries. The data can be further defined as either ordinal (it can be ordered) or nominal (unordered). Categorical data is also known as qualitative data.

[Figure: a chart showing the categories of data, such as numerical and categorical.]

Discrete numerical data involves data points that are distinct and separate from each other. There is a definite 'gap' separating one data point from the next. Discrete data usually, but not always, consists of whole numbers, and is often collected by some form of counting.

Examples of discrete data: number of goals scored per match (1, 3, 0, 5, etc.), number of children per family (0, 1, 2, 3, etc.), shoe size (6, 6½, 7, 7½, etc.).

Continuous numerical data involves data points that can occur anywhere along a continuum. Any value is possible within a range of values. Continuous data often involves the use of decimal numbers, and is often collected using some form of measurement.

Examples of continuous data: height of trees in metres (12.357, 14.022, 13.454, etc.), times taken to run ten km in minutes (55.34, 58.45, 61.29, etc.), daily temperature in degrees Celsius (31.2, 29.4, 30.4, etc.).

The word 'ordinal' means 'ordered'. Ordinal categorical data involves data points, consisting of words or labels, that can be ordered or ranked in some way.

Examples of ordinal data: product rating on a survey (satisfactory, good, excellent), level of achievement (high distinction, distinction, credit, pass, fail).

The word 'nominal' means 'name'. Nominal categorical data consists of words or labels that name individual data points and have no clear rank order.

Examples of nominal data: nationalities in a team (German, Austrian, Italian, Spanish, etc.), eye colour (grey, blue, brown, green, etc.).
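The four types above can be summarised as a short series of yes/no questions. The Python sketch below is one possible way to write those questions down. The function name `classify` and its parameters are assumptions made up for this illustration, and the counted-or-measured question is only a rule of thumb (shoe sizes, for example, are discrete without being counted).

```python
def classify(is_number: bool, counted: bool = False, ordered: bool = False) -> str:
    """Return the data type for a variable described by yes/no answers.

    is_number: is each data point a number (True) or a word or label (False)?
    counted:   for numbers, is the value counted (True) or measured (False)?
    ordered:   for labels, do the categories have a natural rank order?
    """
    if is_number:
        return "numerical discrete" if counted else "numerical continuous"
    return "categorical ordinal" if ordered else "categorical nominal"


# Variables taken from the examples in this lesson.
examples = {
    "goals scored per match": classify(is_number=True, counted=True),
    "height of trees in metres": classify(is_number=True, counted=False),
    "level of achievement": classify(is_number=False, ordered=True),
    "eye colour": classify(is_number=False, ordered=False),
}

for variable, data_type in examples.items():
    print(f"{variable}: {data_type}")
```

The first question always separates numerical from categorical data. Only then does the second question (counted or measured, ordered or unordered) apply.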

Examples

Example 2

Classify this data into its correct category: Weights of kittens

A
Quantitative Discrete
B
Qualitative Nominal
C
Quantitative Continuous
D
Qualitative Ordinal
Worked Solution
Create a strategy

Determine if the data is numerical or in categories.

Apply the idea

The weight of a kitten can be measured, so it is numerical (quantitative) data.

Weight is a measurement that can have any number of decimal places, so it is continuous. The correct answer is Option C.

Example 3

Which of the following variables can be classified as ordinal categorical?

A
Length of pencil in mm
B
Time taken to get to school in minutes
C
Weights of dogs in kg
D
Driving licence status (learner, red P, etc.)
E
Hair colour (black, red, blonde, etc.)
F
Hourly rate of pay
Worked Solution
Create a strategy

Choose the option whose data falls into categories that can be ordered (or ranked).

Apply the idea

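Only options D and E are categorical data, as their values are words or labels. The other options are all numerical.

Of these two, the answer is option D: the statuses in option D follow a set progression (learner, then red P, and so on), so they can be ranked, whereas hair colours have no natural order.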

Idea summary
  • Numerical data is where each data point is represented by a number. The data can be further defined as either discrete (associated with counting) or continuous (associated with measuring). Numerical data is also known as quantitative data.

  • Categorical data is where each data point is represented by a word or label. The data can be further defined as either ordinal (it can be ordered) or nominal (unordered). Categorical data is also known as qualitative data.

Outcomes

ACMGM048

review the statistical investigation process; for example, identifying a problem and posing a statistical question, collecting or obtaining data, analysing the data, interpreting and communicating the results
