In an experiment where something is to be measured, or in a survey in which beliefs about some issue are being sought, we typically repeat the procedure many times in order to be confident that the correct result has been found. Similarly, in studies designed to reveal a relationship between two variables, we conduct repeated trials in order to decide whether a correlation exists.
Each trial tends to give a different result, so a single experiment or survey produces a quantity of data that then needs to be analysed using statistical tools. We may be looking for the mean observed value of a physical quantity, the majority opinion about some issue, or the strength of a correlation. For this reason, we call such a study a statistical investigation.
To begin a statistical investigation, we must first formulate a question:
What is the mass of one litre of air? What is the average number of potatoes in the 5 kg packs sold in a particular supermarket today? Is there a connection between students' travelling distances from home to school and their results in a high-stakes test?
Your own question could take a form like one of these sample questions.
Next, design an experiment or survey to gather the necessary data that will enable you to answer the question.
To measure the mass of a litre of air, one might remove all the air from a one-litre container and see how much less its mass is than when it was full. Other methods are possible and there are refinements to the question relating to changes in pressure and temperature. But, whatever the details of the experimental method, it should be repeated several times to generate the data.
To count the potatoes, it may be best to take a sample from the stock of 5 kg packs and to count the potatoes in each pack in the sample. It would be impractical to count the whole population of 5 kg packs. Your experimental design should specify the steps to be taken to ensure that the sample drawn is representative of the whole stock.
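One common way to aim for a representative sample is simple random sampling, in which every pack has the same chance of being chosen. The following is a minimal sketch, assuming the packs on the shelf have been numbered; the stock size of 200, the sample size of 20, and the function name `draw_sample` are hypothetical choices made only for illustration.

```python
import random

def draw_sample(pack_ids, sample_size, seed=None):
    """Return a simple random sample of pack identifiers, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(pack_ids, sample_size)

# Hypothetical stock: 200 packs numbered 1 to 200.
pack_ids = list(range(1, 201))

# Choose 20 packs to open and count.
chosen = draw_sample(pack_ids, 20, seed=1)
print(sorted(chosen))
```

Recording the seed in the experimental design lets someone else reproduce exactly which packs were selected.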
Data about students' travelling distances and test results might be obtained from school records. This would be better than conducting a survey within the school, because the records would already hold the data for all the students, whereas a survey would probably be less accurate and less complete. However, if the conclusions were to be extrapolated to other schools, then one would have to consider whether the school from which the data came was typical of all the other schools. The particular school from which the data came would in effect be a sample drawn from the whole collection of schools.
Depending on your overarching question, you may need to design a questionnaire that would be administered to survey subjects. The design of a questionnaire is a critical task. The questions need to be simple to understand, they should relate unambiguously to the overall question, and they should avoid hinting at a response the investigator may prefer. There are many ways in which bias can be introduced to a survey through the improper design of a questionnaire.
Having collected the data, the next task will be to summarise it in a way that makes it possible to perceive patterns or features that it may contain. Graphical displays of various kinds can be used: bar charts, histograms, pie charts, stem-and-leaf plots, scatterplots, and so on.
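As an illustration of one such display, a stem-and-leaf plot can be produced in a few lines. This is only a sketch: it assumes the data are non-negative whole numbers, it uses the tens digit as the stem, and the potato counts are made-up values rather than real measurements.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Build the lines of a stem-and-leaf plot for non-negative integers."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)  # tens part is the stem, units digit is the leaf
    lines = []
    # Include empty stems so that gaps in the data stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

# Hypothetical potato counts from ten sampled packs.
counts = [23, 31, 27, 19, 35, 28, 30, 24, 26, 33]
for line in stem_and_leaf(counts):
    print(line)
# 1 | 9
# 2 | 34678
# 3 | 0135
```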
The average of the values obtained for the mass of one litre of air would be calculated together with information about how variable the experimental results were. The average number of potatoes in each pack in the sample would be worked out together with details about how far above and below the average the individual packs went. A line of best fit could be found for the students' travelling distances question together with a correlation coefficient indicating the strength of the linear relationship with the test results.
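These summary calculations can be sketched as follows. The code assumes the sample standard deviation as the measure of variability, a least-squares line as the line of best fit, and Pearson's coefficient as the correlation coefficient; other choices are possible. All of the numbers are hypothetical values invented for illustration.

```python
import math

def mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

def sample_sd(xs):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def fit_line(xs, ys):
    """Least-squares slope and intercept, plus Pearson's correlation r."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical repeated measurements of the mass of one litre of air, in grams.
masses = [1.21, 1.19, 1.23, 1.20, 1.22]
print("mean mass:", round(mean(masses), 3), "g")
print("spread (sd):", round(sample_sd(masses), 3), "g")

# Hypothetical travelling distances (km) and test scores for six students.
distances = [1.0, 2.5, 4.0, 6.0, 8.5, 12.0]
scores = [72, 65, 70, 58, 61, 55]
slope, intercept, r = fit_line(distances, scores)
print("line of best fit: score =", round(intercept, 2), "+", round(slope, 2), "* distance")
print("correlation coefficient r =", round(r, 2))
```

A value of r close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none. A strong correlation does not by itself show that distance causes the difference in results.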
Finally, a statistical investigation should present its findings in written form in a manner that communicates effectively with those who are likely to be interested in it. Thus, scientific investigators need to cultivate habits of clear explanation and accurate, concise writing.
Plan and conduct surveys and experiments using the statistical enquiry cycle:
– determining appropriate variables and measures;
– considering sources of variation;
– gathering and cleaning data;
– using multiple displays, and re-categorising data to find patterns, variations, relationships, and trends in multivariate data sets;
– comparing sample distributions visually, using measures of centre, spread, and proportion;
– presenting a report of findings.
Evaluate statistical investigations or probability activities undertaken by others, including data collection methods, choice of measures, and validity of findings