Middle Years

13.01 Types of data

Lesson

Statistical data can be divided into two types: categorical and numerical.

Categorical data

Data that is collected as a set of words is called categorical data.

Imagine asking someone for their favourite colour, country of birth, or gender. Their answer would always be a word. We can also think of categorical data as values which can be sorted into groups or categories.

Numerical data

When the data is a set of numbers, it is called numerical data.

Imagine asking someone for their height, their age, or how long they spend on social media each day. Their answer would always be a number.

Numerical data is divided into two types, continuous and discrete. 

Discrete numerical data is counted, so its values are separated. If you asked someone to tell you their shoe size they might say "$10$", they might even say "$10$ and a half", but they would not say "$10$ and seven sixteenths".

Continuous data is measured, so it can take any value within a range - there are an infinite number of possible values. If we measure an animal's height, we might find any reasonable value, limited only by the precision of our ruler.

Data types

Categorical data is made up of words.

Numerical data is made up of numbers.

  • Discrete numerical data is counted.
  • Continuous numerical data is measured.
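
As a rough illustration, the rules above can be sketched in Python. The function name `classify` and the sample answers are hypothetical. The sketch assumes that whole-number answers were counted and decimal answers were measured, which is a simplification (a shoe size of 10.5 is discrete, for example).

```python
def classify(value):
    """Guess the data type of a single survey answer (simplified sketch)."""
    # Words are treated as categorical data.
    if isinstance(value, str):
        return "categorical"
    # Assumption: whole numbers are counts, so they are discrete.
    if isinstance(value, int):
        return "discrete numerical"
    # Assumption: decimal numbers are measurements, so they are continuous.
    if isinstance(value, float):
        return "continuous numerical"
    raise TypeError("unsupported answer type")


# Hypothetical answers: favourite colour, number of pets, height in cm.
answers = ["blue", 3, 152.4]
for answer in answers:
    print(answer, "->", classify(answer))
```

In practice the type depends on how the data was collected (counted or measured), not only on how the value is written down.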

Question biases

When we conduct surveys, the responses together form a set of data that we can analyse. It is essential that survey questions are clear, direct, and use neutral language. If our survey is biased, the respondents may feel pressured to give certain answers, or end up confused, and our analysis could be very misleading.

There are three broad biases to watch out for:

Emotive or leading language is language which is not neutral and evokes an emotional reaction from the respondent.

Imagine playing a song for someone and asking them: "Don't you think this song is totally amazing?". They may feel pressured to say "Yes".

If you asked them "Do you actually like this terrible song?", they may feel pressured to say "No".

Instead, asking them the simple, neutral question "Do you like this song?" is the best way to find out their true opinion.

 

Questions may make false assumptions about the respondent, which can make them difficult or impossible to answer.

Consider asking someone to write down a response to this question: "Does your dog like going for walks?". How would someone answer this question if they don't have a dog?

Neither "Yes" nor "No" would make sense, so they may leave the answer blank. If you look at their response later, how would you know why they left it blank?

To solve this problem, we can break the question into two parts. The first question could be, "Do you own a dog?". The second question could be, "If you own a dog, does your dog like going for walks?". This removes the confusion.
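
The two-part question can be sketched in Python to show how every response ends up with a clear meaning. The function name `dog_walk_answer` and the sample responses are hypothetical, and this is only one possible way to record the answers.

```python
def dog_walk_answer(owns_dog, likes_walks=None):
    """Record the two-part dog question as one clearly labelled result."""
    # Question 1: "Do you own a dog?"
    if not owns_dog:
        # Question 2 is skipped, and we know exactly why.
        return "not applicable (no dog)"
    # Question 2: "If you own a dog, does your dog like going for walks?"
    if likes_walks is None:
        return "no answer"
    return "yes" if likes_walks else "no"


# Hypothetical responses as (owns a dog, dog likes walks) pairs.
responses = [(True, True), (False, None), (True, False)]
results = [dog_walk_answer(owns, likes) for owns, likes in responses]
print(results)
```

A respondent without a dog is now recorded as "not applicable" instead of leaving an unexplained blank.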

 

On the other hand, sometimes there might be more than one question combined into a single question, leaving the respondent confused or unsure of how to answer.

Consider this question: "Do you like cats and dogs?".

This is confusing. What is the question really asking? Is it asking if we like both cats and dogs, or if we like cats and separately if we also like dogs?

We could improve the question by having four options: "no", "yes both", "yes but only cats" and "yes but only dogs".

But an even better way would be to split the question into two questions, "Do you like cats?" and "Do you like dogs?", both with yes or no answers. This is the clearest and most direct way of obtaining the same data.

Generally speaking, each question in a survey should ask only one thing.

 

By avoiding these biases we can be more confident in the data we collect and analyse later.

Practice questions

Question 1

A class was surveyed about where they went on their most recent holiday.

  1. What kind of data are the survey results?

    A. Categorical
    B. Discrete numerical
    C. Continuous numerical

Question 2

All the students in your school take a survey with the four questions below.

  1. Which two will have discrete numerical data as their results?

    A. How many pets do you own?
    B. How many times have you broken your arm?
    C. How long does it take for you to get to school every day?
    D. What kinds of pet do you own?

Question 3

A survey asks the question below.

"The Prime Minister believes that taxes are too high, do you think taxes are too high?"

  1. What makes this a poor survey question?

    A. The question asks more than one question
    B. The question makes a false assumption
    C. The question uses emotive or leading language
