Planning a Statistical Investigation I (Investigation)

A statistical inquiry is a process of transforming raw data into useful information that tells us more about a subject and allows us to make recommendations and, possibly, predictions about future outcomes. It consists of six stages:

  1. posing questions
  2. collecting data
  3. organising data
  4. summarising and displaying data
  5. analysing data and drawing conclusions
  6. writing a report

Posing questions

The first stage is to pinpoint the information that will be needed to draw a conclusion. This involves coming up with questions that, if answered, would yield meaningful information on which to base a conclusion and recommendations. For example, suppose you were in charge of school funds set aside for the development of a new sports field, but weren’t sure which type of field (e.g. cricket pitch, basketball court) would be of greatest benefit to students. To investigate this issue, you would need to ask questions such as “What is the most popular sport among students?” (you want to construct a type of field that would satisfy the majority of students), “Are there enough funds to construct the students’ preferred type of field?” (you can’t construct a type of field that you can’t afford) and “How long will it take to construct?” (there’s no point constructing the students’ preferred type of field if it takes 10 years to build, by which point none of those students will be around to enjoy it). With answers to these questions, you would then be able to decide which type of field would benefit students the most.


  • The military’s research and development team would like to find out both the ideal type of uniform for its soldiers to wear in combat and the ideal type for them to wear in parades. Pose some questions that the team would need to answer in order to draw a conclusion and to make recommendations.
  • A restaurant owner would like to know how many chefs, waiters, cashiers and managers to hire for his new restaurant and would like to know how many staff to roster on during each time of the day. Pose some questions that would need to be answered by the investigation.

Collecting data

Once we have posed questions, we need to collect data to answer them. Before we do the actual collecting, we have to decide how we will collect the data, the type of data we will collect and the sources from which we will collect them. The sources can be either primary or secondary. Collecting from a primary source involves collecting the data directly yourself by interviewing or observing others, or even by conducting experiments. When collecting data using any of these methods, it is important to ensure that the data can be organised easily. For example, when creating a questionnaire, it is better to include questions that are not open-ended, but rather have a limited number of options from which participants can choose their answers. This way, the answers collected can be easily tallied and organised. For instance, instead of asking someone “What is your favourite colour?”, it would be better to ask “Which of the following colours is your favourite?” and to list a few common colours that they can choose from, including an option of “Other” in case they would like to answer with a colour that is not one of those listed.

Using a secondary source involves gathering data that have already been collected or generated by others. This could involve gathering data from books or the internet. It is important that the data come from a reliable source and not from some obscure website or outdated book, otherwise the data may not be accurate. Some reliable sources of note are government organisations such as the Australian Bureau of Statistics and the Bureau of Meteorology, which have strict data collection methodologies in place to ensure the accuracy and reliability of their data.


  1. Determine whether the data needed to investigate each of the following would come from a primary or a secondary source. If the source is primary, state the method (e.g. questionnaire, interview, observation, experiment) you would use to gather the data; if it is secondary, state the source (e.g. books, newspapers, internet).
       1. the most popular subject among students at school
       2. the average daily temperature in Sydney over the last month
       3. the number of traffic accidents in the country each year
       4. the number of visitors to the local library in an afternoon
       5. the average daily temperature in your home over the past week
       6. the number of goals scored by the Socceroos since the last World Cup
       7. students’ main qualm with the school principal
  2. Compose a (non open-ended) question, along with its response options, that could be asked in order to investigate:
       1. the most popular subject among students at a school
       2. the average income of teachers at a school
       3. the average mark of students in your class in last term’s maths test

Organising data

In the third stage, we arrange the data we have collected into a form that gives them structure and order. A common way of accomplishing this is to use a table, e.g. a frequency table. How the data are organised will depend on the nature of the statistical investigation. For example, if the data collected were the incomes of a group of workers, it would make more sense to organise the data into income ranges, i.e. to tally up the number of workers within ranges such as $50,000-$60,000, rather than to tally up the number of workers with an income of a particular value, e.g. the number of workers with an income of $54,682.
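The grouping described above could be sketched in Python as follows. This is only an illustration: the list of incomes is hypothetical (invented for the example), and the range width of $10,000 is an assumption. Each range is taken to run from, say, $50,000 to $59,999, so that an income sitting exactly on a boundary is counted only once.

```python
from collections import Counter

# Hypothetical incomes in dollars, invented purely for illustration.
incomes = [54682, 58200, 61500, 47900, 52300,
           66750, 59990, 73100, 50000, 68400]

BIN_WIDTH = 10_000  # assumed width of each income range


def income_range(income):
    """Return the (lower, upper) bounds of the range containing income."""
    lower = (income // BIN_WIDTH) * BIN_WIDTH
    return lower, lower + BIN_WIDTH - 1


# Tally how many incomes fall into each range.
frequency = Counter(income_range(income) for income in incomes)

# Print the frequency table in order of increasing income.
for (lower, upper), count in sorted(frequency.items()):
    print(f"${lower:,}-${upper:,}: {count}")
```

Tallying by range keeps the table short. Tallying by exact value would give almost every worker a row of their own.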


The following are the HSC results of a class of 30 Year 12 physics students.

81 90 93 79 71 88 64 75 59 80
84 72 77 80 73 67 85 76 71 91
78 82 70 75 89 83 74 72 81 80

Draw up a frequency table of the results with suitable groupings. (HINT: HSC results are usually grouped into bands.)

Summarising and displaying data

Once we have organised the data, we need to present them in a form that is easy to read, understand and analyse. Most often this is accomplished by using a graph such as a column graph, bar graph, pie chart, dot plot or line chart. The type of graph to use will depend on the purpose of the investigation. For example, to present the proportion of students who favour each sport, a pie chart may be more appropriate than a dot plot. Besides displaying the data in a graph, it may also be beneficial to summarise them using statistical quantities such as the mean, median, mode and range.
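As a rough illustration, the summary quantities mentioned above can be computed for the 30 physics results listed in the previous section. The sketch below uses Python's standard statistics module; the variable names are chosen for this example only.

```python
import statistics

# HSC physics results of the class of 30 students (from the previous section).
results = [
    81, 90, 93, 79, 71, 88, 64, 75, 59, 80,
    84, 72, 77, 80, 73, 67, 85, 76, 71, 91,
    78, 82, 70, 75, 89, 83, 74, 72, 81, 80,
]

mean = statistics.mean(results)      # sum of the results divided by their count
median = statistics.median(results)  # middle value of the sorted results
mode = statistics.mode(results)      # most frequently occurring result
data_range = max(results) - min(results)  # highest result minus lowest result

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", data_range)
```

For these results the mean is 78, the median is 78.5, the mode is 80 and the range is 34.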

Analysing data

After we have finished summarising and displaying the data, it is time to examine and interpret them, to decide what they mean and ultimately to draw conclusions from them. This may involve identifying trends and patterns in the graph, and identifying how those trends and patterns change over time or across categories (such as across different populations). From these trends, we can then draw conclusions and possibly make predictions about future outcomes.

Writing a report

Once we have finished analysing the data, it is time to put everything together in a written report. Any report should address the background and aim of the statistical inquiry and the questions it sought to answer, detail the data collection method (including sources and type of data), involve a thorough discussion of the findings, list and explain the reasoning behind the conclusions, and, if appropriate, include recommendations for the future. It should also include the tables and graphs from stages 3 and 4 of the inquiry (even if only in an appendix).


Suppose the Roads and Traffic Authority (RTA) has tasked you with investigating the number of vehicles travelling past the front of your school on an average day in order to determine whether there is any need to implement new measures to manage traffic flow.

  1. Pose some questions that would need to be answered in order to draw a conclusion and to make recommendations.
  2. Will you be using a primary or secondary source to collect the data?
  3. What method (e.g. questionnaire, interview, observation, experiment) will you be using to collect the data?
  4. What type of data (e.g. vehicle type, vehicle weight, vehicle height) will you be collecting?
  5. Collect the data and record them in a frequency table.
  6. Display the data on a suitable graph. Summarise the data by calculating the mean, median, mode and range.
  7. Analyse the data and identify any trends. When is traffic flow highest? When is it lowest?
  8. Write up a report of your inquiry. Be sure to include your tables and graphs from the previous steps.




