In Unit 2 we studied statistics with univariate data. 'Uni' means one (think unicycle), so when we observe and analyse a single variable we are working with univariate statistics. For example, comparing two students' test results by comparing the mean and standard deviation of each set involves a single variable (test results), so it is univariate.
In Unit 3 our focus will be statistics with bivariate data. 'Bi' means two (think bicycle), so when we are interested in comparing, or finding an association between, two different variables we are working with bivariate statistics. For example, looking at the association between litres of soft drink consumed per week and BMI for a group of people involves bivariate data, as there are two variables (litres and BMI).
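The difference between the two kinds of data can be sketched in a few lines of Python. This is only an illustration: every number below is made up, and the variable names are hypothetical. Univariate data is a list of single values, while bivariate data records a pair of values for each person.

```python
from statistics import mean, stdev

# Hypothetical test results for two students (made-up numbers, illustration only)
student_a = [72, 85, 90, 66, 78]
student_b = [80, 82, 79, 81, 83]

# Univariate: a single variable (test result), summarised for each student
summary_a = (mean(student_a), stdev(student_a))  # the mean of student_a is 78.2
summary_b = (mean(student_b), stdev(student_b))  # the mean of student_b is 81

# Bivariate: each person contributes a pair (litres of soft drink per week, BMI)
# These pairs are also made up for illustration.
soft_drink_and_bmi = [(0.5, 21.3), (2.0, 24.8), (3.5, 27.1), (1.0, 22.6)]

# Splitting the pairs gives the two variables whose association we would study
litres = [pair[0] for pair in soft_drink_and_bmi]
bmi = [pair[1] for pair in soft_drink_and_bmi]
```

Note that the two students' results are still univariate: there are two sets of data, but both measure the same variable.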
In statistics, a 'variable' refers to a characteristic of data that is measurable or observable. A variable could be something like temperature, mass, height, make of car, type of animal or goals scored.
Data variables can be defined as either numerical or categorical.
Discrete numerical data
Discrete numerical data involves data points that are distinct and separate from each other. There is a definite 'gap' separating one data point from the next. Discrete data usually, but not always, consists of whole numbers, and is often collected by some form of counting.
Examples of discrete data: number of goals scored per match ($1$, $3$, $0$, $5$, etc.), number of children per family ($0$, $1$, $2$, $3$, etc.), shoe size ($6$, $6\frac{1}{2}$, $7$, $7\frac{1}{2}$, etc.)
Continuous numerical data
Continuous numerical data involves data points that can occur anywhere along a continuum. Any value is possible within a range of values. Continuous data often involves the use of decimal numbers, and is often collected using some form of measurement.
Examples of continuous data: height of trees in metres ($12.357$, $14.022$, $13.454$, etc.), times taken to run ten km in minutes ($55.34$, $58.45$, $61.29$, etc.), daily temperature in degrees C ($31.2$, $29.4$, $30.4$, etc.)
Ordinal categorical data
The word 'ordinal' means 'ordered'. Ordinal categorical data involves data points, consisting of words or labels, that can be ordered or ranked in some way.
Examples of ordinal data: product rating on a survey (satisfactory, good, excellent), level of achievement (high distinction, distinction, credit, pass, fail)
Nominal categorical data
The word 'nominal' means 'name'. Nominal categorical data consists of words or labels that name individual data points and have no clear rank order.
Examples of nominal data: nationalities in a team (German, Austrian, Italian, Spanish, etc.), eye colour (grey, blue, brown, green, etc.)
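One way to see the four types side by side is a short Python sketch. The numerical values and the achievement labels come from the examples above. The lists `results` and `eye_colours` are hypothetical samples invented for illustration.

```python
from collections import Counter

# Discrete numerical: collected by counting, with gaps between possible values
goals_per_match = [1, 3, 0, 5]

# Continuous numerical: collected by measuring, any value in a range is possible
tree_heights_m = [12.357, 14.022, 13.454]

# Ordinal categorical: labels with a meaningful rank order, listed lowest to highest
achievement_order = ["fail", "pass", "credit", "distinction", "high distinction"]
results = ["credit", "fail", "high distinction", "pass"]  # hypothetical sample
ranked = sorted(results, key=achievement_order.index)
# ranked is ["fail", "pass", "credit", "high distinction"]

# Nominal categorical: labels with no rank order, so we can count them
# but there is no natural way to rank them
eye_colours = ["blue", "brown", "brown", "green", "blue", "brown"]  # hypothetical sample
colour_counts = Counter(eye_colours)
```

The ordinal labels can be sorted only because we supplied their rank order in `achievement_order`. No such list makes sense for eye colour, which is what makes it nominal.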
A scientist collects data on iron levels in soil and growth of a type of weed in order to investigate the relationship between them.
Is this an example of univariate data or bivariate data?
Univariate data
Bivariate data
Classify this data into its correct category:
Weights of dogs
Categorical Nominal
Categorical Ordinal
Numerical Discrete
Numerical Continuous
Which of the following variables can be classified as ordinal categorical?
Length of a pencil in mm
Time taken to get to school in minutes
Weight of dogs in kg
Driving licence status (learner, red P, etc.)
Hair colour (black, red, blonde, etc)
Hourly rate of pay