CanadaON
Grade 9

Investigation: Planning a Statistical Investigation

Lesson

The statistical investigation process transforms raw data into useful information that tells us more about a subject, supports recommendations, and may allow us to predict future outcomes. It consists of six stages:

  1. Asking questions
  2. Collecting data
  3. Organizing data
  4. Summarizing and displaying data
  5. Analyzing data and drawing conclusions
  6. Writing a report

Asking questions

The first stage is to pinpoint the information that will be needed to draw a conclusion. This involves coming up with questions whose answers would provide meaningful information and allow us to draw a conclusion and make recommendations.

For example, suppose we were in charge of the school’s funds that have been set aside for the development of a new sports field, but we are not sure which type of field (e.g. baseball diamond, basketball court, track) would be of greatest benefit to students. To investigate this issue, we would need to ask questions such as:

  • “What is the most popular sport among students?”
  • “Are there enough funds to construct the students’ preferred type of field?” 
  • “Can the field be multipurpose?”
  • “How long will it take to construct?”

 

Activity 1

Consider the following scenarios:

  • The school council would like to find out the ideal types of meals to offer in the cafeteria in both winter and summer. Write some questions that would need to be answered in order to draw a conclusion and make recommendations.
  • A restaurant owner would like to know how many chefs, hosts, waiters, and managers to hire for her new restaurant, and how many staff to schedule at each time of day. Write some questions that would need to be answered by the investigation.

 

Collecting data

Once we have written our questions, we need to collect data to answer them. We have to decide how we will collect the data, the type of data we will collect, and the sources from which we will collect it. The sources can be either primary or secondary. Collecting from a primary source means collecting the data directly ourselves by interviewing or observing others, or by conducting experiments. Whichever method we use, it is important to ensure that the data can be organized easily and is not biased.

For example, when creating a questionnaire, it is better to use questions that are not open-ended and instead offer a limited number of options from which participants choose their answers. This way, the answers can be easily tallied and organized. For instance, instead of asking someone “What is your favourite colour?”, it would be better to ask “Which of the following colours do you prefer?” and list a few common colours to choose from, including an “Other” option in case their answer is not one of those listed.
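To see why a limited set of options makes answers easy to tally, here is a minimal Python sketch. The colour options and the list of responses are hypothetical values invented for illustration, and `tally_responses` is an assumed helper name, not part of any particular library.

```python
from collections import Counter

# Hypothetical answer options and responses, invented purely for illustration.
OPTIONS = ["Red", "Blue", "Green", "Other"]
responses = ["Blue", "Red", "Blue", "Other", "Green", "Blue", "Red", "Other"]


def tally_responses(responses, options):
    """Count how many participants chose each listed option."""
    counts = Counter(responses)
    # Report every option in questionnaire order, including any nobody chose.
    return {option: counts.get(option, 0) for option in options}


tally = tally_responses(responses, OPTIONS)
for option, count in tally.items():
    print(f"{option:<6} {count}")
```

Because every answer is one of a few known options, the tally is a simple count per option. Open-ended answers would first have to be read and sorted into categories by hand.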

Using a secondary source involves gathering data that has already been collected or generated by others, for example from books or the internet. It is important that the data comes from a reliable source and not from an obscure website or outdated book; otherwise, it may not be accurate. Notable reliable sources include government organizations such as Statistics Canada and Environment Canada, which have strict data collection methodologies in place to ensure the accuracy and reliability of their data.

 
Activity 2

For each of the following scenarios:

  1. Determine whether the data needed for the investigation would come from a primary or a secondary source.
  2. If a primary source, state the method of data collection, for example, questionnaire, interview, observation, or experiment.
  3. If a secondary source, state the type of source that would be used, for example, books, newspapers, research journals, or StatCan.
  • The most popular subject among students at your school
  • The average daily temperature in Ottawa over the last month
  • The number of traffic accidents in the country each year
  • The number of visitors to the local library in an afternoon
  • The average daily temperature in your school over the next week
  • The number of gold medals won by Team Canada at the last Winter Olympics
  • Students’ main area of difficulty in a recent science topic

 

Activity 3

Compose a closed-ended question, along with its response options, that could be asked in order to investigate:

  • The most popular subject among students at a school
  • The average income of teachers at a school
  • The average number of pets that students in your class own

 

Organizing data

In the third stage, we arrange the data we have collected into a form that gives it structure and order. A common way of accomplishing this is to use a table, such as a frequency table. How the data is organized depends on the nature of the statistical investigation. For example, if the data collected were the incomes of a group of workers, it would make more sense to organize the data into income ranges, i.e. to tally the number of workers within ranges such as $50,000 to $60,000, rather than to tally the number of workers with each particular income, e.g. the number of workers with an income of $54,682.
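The income example can be sketched in Python as follows. The incomes, the class width of $10,000, and the helper name `group_into_ranges` are all assumptions made for illustration. Each range here includes its lower bound and excludes its upper bound, which is one common convention for keeping the ranges from overlapping.

```python
from collections import Counter

# Hypothetical annual incomes in dollars, invented for illustration only.
incomes = [54682, 51200, 67350, 58900, 72100, 49500, 63000, 55750]

CLASS_WIDTH = 10_000  # assumed width of each income range


def group_into_ranges(values, width):
    """Return {(lower, upper): frequency}; lower is included, upper is excluded."""
    # Integer division finds the lower bound of the range each value falls in.
    counts = Counter((value // width) * width for value in values)
    return {(lower, lower + width): counts[lower] for lower in sorted(counts)}


frequency_table = group_into_ranges(incomes, CLASS_WIDTH)
for (lower, upper), frequency in frequency_table.items():
    print(f"${lower:,} to under ${upper:,}: {frequency}")
```

Tallying each exact income would produce eight rows with a frequency of 1 each, which shows no pattern. Grouping the same values into ranges gives four rows and makes the most common income range visible.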

 

Activity 4

The following are the exam results of a class of 30 Grade 12 physics students.

81 90 93 79 71 88 64 75 59 80
84 72 77 80 73 67 85 76 71 91
78 82 70 75 89 83 74 72 81 80

Draw up a frequency table of the results with suitable groupings. 

 

Summarizing and displaying data

Once we have organized the data, we need to present the data in a form that will be easy to read, understand and analyze. Often this will be accomplished by using a graph such as a bar graph, circle graph, histogram, line plot, or line graph. The particular type of graph to be used will depend on the purpose of the investigation. Besides displaying the data in a graph, it may also be beneficial to summarize the data using statistical quantities such as the mean, median, mode, and range.
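As an illustration of these summary quantities, the sketch below calculates the mean, median, mode, and range of the 30 exam results listed in Activity 4, using Python's standard `statistics` module. The variable names are chosen for this example only.

```python
import statistics

# The 30 exam results from Activity 4.
results = [
    81, 90, 93, 79, 71, 88, 64, 75, 59, 80,
    84, 72, 77, 80, 73, 67, 85, 76, 71, 91,
    78, 82, 70, 75, 89, 83, 74, 72, 81, 80,
]

mean_mark = statistics.mean(results)      # sum of the results divided by how many there are
median_mark = statistics.median(results)  # middle value once the results are sorted
modes = statistics.multimode(results)     # most frequent value(s); a list, since ties are possible
mark_range = max(results) - min(results)  # highest result minus lowest result

print(f"Mean:   {mean_mark}")
print(f"Median: {median_mark}")
print(f"Mode:   {modes}")
print(f"Range:  {mark_range}")
```

For this data set the mean is 78, the median is 78.5 (the average of the 15th and 16th sorted results, 78 and 79), the mode is 80, and the range is 93 − 59 = 34.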

Analyzing data

After we have finished summarizing and displaying the data, it is time to examine and interpret the data, to decide on what it means and to ultimately draw conclusions from it. This may involve identifying trends and patterns from the graph, and identifying how those trends and patterns change over time or across categories (such as across different populations). From these trends, we can then draw conclusions and possibly make predictions about future outcomes.

Writing a report

Once we have finished analyzing the data, it is time to put everything together in a written report. A report should consist of: 

  • Introduction: The report should address the background and aim of the statistical inquiry and the questions it sought to answer, and detail the data collection method (including the sources and type of data).
  • Numerical and graphical analysis: The data should be analyzed using various statistical measures, and the report should include the tables and graphs that represent the data.
  • Interpretation of results: Consider the questions that were originally asked and interpret the results of the analysis in relation to them. Consider any trends and patterns in the data and relate the statistics to the original problem. This includes a thorough discussion of the findings, listing and explaining the reasoning behind the conclusions, and, if appropriate, recommendations for the future.
  • Conclusion: The report should include a summary of the findings.

 

Activity 5

Come up with a problem that you would like to investigate and that would require a large amount of data. Work through the following steps of the statistical investigation process in order to draw some conclusions:

  1. Write some questions that would need to be answered in order to draw a conclusion and to make recommendations.
  2. Will you be using a primary or secondary source to collect the data?
  3. What method (e.g. questionnaire, interview, observation, experiment) will you be using to collect the data?
  4. What type of data will you need?
  5. Collect the data and record them in a frequency table.
  6. Display the data on a suitable graph.
  7. Summarize the data by calculating the mean, median, mode and range.
  8. Analyze the data and identify any trends. 
  9. What are some implications and consequences of your findings?
  10. Write up a report of your inquiry. Be sure to include your tables and graphs from the previous steps.

 

To help you, here are some examples of problems that require large amounts of data:

  • What is the average height of 15-year-olds in Canada?
  • How many pets do families in Canada most commonly have?
  • What are the most popular television shows in Canada?
  • How much time does a teenager spend on Facebook per day?

Outcomes

9.D1.1

Identify a current context involving a large amount of data, and describe potential implications and consequences of its collection, storage, representation, and use.

9.D1.2

Represent and statistically analyse data from a real-life situation involving a single variable in various ways, including the use of quartile values and box plots.

9.D2.2

Identify a question of interest requiring the collection and analysis of data, and identify the information needed to answer the question.

9.D2.3

Create a plan to collect the necessary data on the question of interest from an appropriate source, identify assumptions, identify what may vary and what may remain the same in the situation, and then carry out the plan.

9.D2.4

Determine ways to display and analyse the data in order to create a mathematical model to answer the original question of interest, taking into account the nature of the data, the context, and the assumptions made.

9.D2.5

Report how the model can be used to answer the question of interest, how well the model fits the context, potential limitations of the model, and what predictions can be made based on the model.
