The data cycle is the process where we formulate questions, then collect, display, and explain mathematical data.

To help us formulate or write our question, we can think about whether we will get categorical data or numerical data.

A clear question helps us know what kind of data to gather and who to collect it from. The type of question we ask can lead us to collect different data.
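To make the difference between the two data types concrete, here is a minimal Python sketch. The function name `data_type` and the survey answers are made up for illustration; it simply checks whether every answer is a number.

```python
def data_type(answers):
    """Return 'numerical' if every answer is a number, otherwise 'categorical'."""
    # bool is excluded because Python treats True/False as numbers
    if all(isinstance(a, (int, float)) and not isinstance(a, bool) for a in answers):
        return "numerical"
    return "categorical"

# Made-up answers to "How many pets do you have?"
pet_counts = [0, 2, 1, 1, 3]
# Made-up answers to "What kind of pet do you have?"
pet_kinds = ["dog", "cat", "fish", "dog", "cat"]

print(data_type(pet_counts))  # numerical
print(data_type(pet_kinds))   # categorical
```

This check is only a rough guide: numbers used as labels, such as jersey numbers, are still categorical data.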

The group we are hoping to answer the question about is called the **population**.

When we write a question, it should be about the population we want to learn about and have more than one possible answer.

Non-examples of questions | Examples of well formulated questions |
---|---|
How old is my neighbor? | What ages are the people in my neighborhood? |
What brand is my computer? | What is the most popular brand of computer among my classmates? |
What is your favorite color? | What colors are preferred by students in my neighborhood? |

It needs to be clear which **attributes** we are exploring with our question. An attribute is a specific characteristic or feature of a given subject.

For example, if we want to learn more about pets in our class, we need to be clear which attribute we are interested in. These could include:

Number of pets

Type of pets

Size of pets

Age of pets

All the students in your school take a survey with these questions.

a

Which question(s) will have discrete numerical data as their results?

A

How many pets do 6th graders have?

B

In your class, how many bones has each student broken?

C

How long does it take students at my school to get to school every day?

D

What kinds of pets are owned by my classmates?


b

Which question(s) will have categorical data as their results?

A

How many cousins do my classmates have?

B

What country was your most recent vacation in?

C

How many people are in the average residence in Virginia?

D

How tall are 6th graders?

E

What middle school grade level has the most students?


Is each question well formulated for the data cycle? Explain why or why not.

a

Who was the first president?


b

How do the shoe sizes of 5th and 6th graders at my school compare?


c

How much money do professional athletes in the US make?


Idea summary

We use the data cycle to formulate questions, then collect, show, and explain information. Depending on the question being asked, the data may be **categorical data** or **numerical data**.

A well formulated question should have more than one possible answer and relate to a population.

When we have questions, we use different ways to collect data to find answers:

**Observation**: Watching and noting things as they happen. For example, watching birds at a feeder to see which type comes most often.

**Measurement**: Using tools to find out how much, how long, or how heavy something is. For example, using a ruler to measure the growth of a plant over several weeks.

**Survey**: Asking people questions to get information. For example, asking classmates about their favorite school subject and recording the answers.

**Experiment**: Doing tests in a controlled way to get data. For example, planting two identical plants, giving one sunlight and the other only artificial light, and observing the differences.

Acquire existing **secondary data**: Use data which was collected by a reliable source like census data, Common Online Data Analysis Platform (CODAP), or peer reviewed studies.

To help us choose a method, we need to be sure that it is realistic for our sample and the time we have. For example, it might be too time-consuming to do an experiment in an hour, so we can acquire existing data instead.

Our chosen method must also be **ethical**. This means that no one gets hurt, asked inappropriate questions, or experimented on without consent.

When doing a survey or using secondary sources, it is important that the data is collected from a **sample** that is **representative** of the population, so that our analysis of the data is valid.

Representative means that the characteristics of the sample should be similar to those of the population.

For example, if we are collecting data to answer the question "What is the most popular restaurant in my city?" and only surveyed people at our favorite Mexican restaurant, then this sample would not include people who prefer other types of food.

In general, the larger our sample is, the more likely it is to be a good representation of the population.
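The effect of sample size can be tried out with a small simulation. The sketch below is only an illustration: the population, the food choices, and the helper name `sample_share` are all made up, and it assumes every person is equally likely to be picked.

```python
import random

def sample_share(population, sample_size, answer):
    """Share of a random sample that gave `answer`."""
    sample = random.sample(population, sample_size)
    return sample.count(answer) / sample_size

# Made-up population: 1000 people, 600 prefer pizza and 400 prefer tacos.
population = ["pizza"] * 600 + ["tacos"] * 400

random.seed(1)  # fixed seed so repeated runs pick the same samples
for size in (10, 100, 1000):
    share = sample_share(population, size, "pizza")
    print(f"sample of {size}: {share:.2f} prefer pizza (population: 0.60)")
```

Small samples can land far from 0.60 just by chance, while a sample of all 1000 people matches the population exactly.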

In previous grades, we have used pictographs, bar graphs, line graphs, line plots, and stem-and-leaf plots to represent data.

For a short exploration of the data cycle, let the population be your class.

Formulate a question that you could easily collect data on.

What type of data would be collected: categorical or discrete numerical?

Describe a realistic process for collecting the data.

Collect the data.

Represent the data visually.

What does this data tell you about your original question?
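One possible way to carry out the "collect" and "represent" steps for categorical data is sketched below in Python. The question and the answers are made up for illustration; the code tallies each answer and prints a simple text bar graph.

```python
from collections import Counter

# Made-up answers from a class to "What is your favorite school subject?"
answers = ["math", "art", "math", "science", "art", "math", "music"]

# Count how many times each answer appears
tally = Counter(answers)

# most_common() lists the categories from most to least frequent
for subject, count in tally.most_common():
    print(f"{subject:<8} {'#' * count} ({count})")
```

The longest bar shows the most common answer, which helps connect the display back to the original question.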

Aditya wants to investigate the social media habits of the students in her grade.

a

Formulate a question to help her complete this investigation.


b

What attributes would you need to measure to answer the question?


c

Should she use observation, measurement, survey, or experiment to collect the data? Explain.


Collect data that can be used to answer the question "How many first cousins do students in my school have?"


Georgia wants to know about the current employment of Americans. She randomly selects 10 adults who came to pick up or drop off students from her 6th grade class to survey.

Determine the factors that could mean that the data collected is not representative of the population.

A

The sample won't include a variety of ages of all those who are in the workforce.

B

The adults in the sample were not randomly selected.

C

The survey question is open ended.

D

The sample size is too small for such a big population.


A sample of 25 people is drawn from a population. In this sample, the youngest is 18 years old, and the oldest is 64.

Which two of these might be populations that this sample was drawn from?

A

Residents of a retirement home

B

Employees at a bank

C

Students from an elementary school

D

Drivers stuck in traffic during rush hour


Idea summary

After formulating a clear question, we use the data cycle to collect, show, and explain information. To get data, we can use methods like:

Watching **(Observation)**

**Measuring**

Asking questions **(Survey)**

Doing **experiments**

Acquiring existing **secondary data**

It's important to choose the right method based on the question we have.

Whichever method we use, we need to ensure that we collect data from a **sample** that is representative of the **population**.