INVESTIGATION: Fun runs and t-shirts

 

A major charity organisation is organising a very large fundraising event in your city, the "City to Beach" fun run. They are expecting to have 65000 entrants in the inaugural event. To raise the profile of the event, every competitor who enters the fun run will receive a promotional t-shirt. The t-shirt design is complete, and now we have been put in charge of ordering an appropriate number of t-shirts in a suitable range of sizes. This is a considerable problem: if we order too few, entrants will be unhappy when they miss out on a t-shirt in the correct size, but if we order too many, the cost could significantly reduce the funds raised, and we could end up trying to deal with hundreds of unwanted XXXL t-shirts!

A size chart from the t-shirt manufacturer is given below:

We will need to do our own research to obtain any other information and data that we need, and we don't have much time.

 

Statistical investigation process

At the start of this chapter, we learned that the statistical investigation process is a cyclic process that involves several stages:

 

 

Posing the statistical question

Statistical questions have these characteristics:

  • they have more than one possible answer
  • they specify a population (not necessarily a population of people)
  • they depend on statistical methods for their answer

Question

1. Write a suitable statistical question that captures the requirements of this investigation.

 

Collecting data

We will need to decide how to collect data that can be used for our investigation.

The most obvious option is to ask each competitor to select their preferred shirt size when they enter the fun run. This would be equivalent to a census. Unfortunately, this is not possible because it would not leave enough time for the t-shirts to be manufactured, printed and delivered.

Therefore we must consider obtaining data from primary or secondary sources. To save time and cost, we would prefer to use secondary data from a reliable source.

Question

2. What data do we require to answer the question formed?

3. Consider the data required. Is it easily obtainable?

4. Do we need to make some assumptions to simplify the problem?

5. If we only had data on height or weight, but not paired data, which would be best to use to assess the number of t-shirts of different sizes required? Why?

6. What sources may we consider to be reliable?

 

One possible source of information is the Australian Bureau of Statistics, which has published the following data: How Australians Measure Up.

We could use the breakdown of participants' measured weights (found on page 6), shown below.

Measured weight (kg)   Males (%)   Females (%)
Less than 50                 0.2           6.8
50 to < 60                   3.8          27.2
60 to < 70                  15.6          32.9
70 to < 80                  28.8          18.6
80 to < 90                  27.0           8.4
90 to < 100                 14.7           3.5
100 to < 110                 6.7           1.5
110 or more                  3.1           1.0
Total                      100.0         100.0

Alternatively, if you are familiar with the normal distribution, you could use the mean and standard deviation of body weight data from the table on page 19, together with the assumption that the weight of competitors follows a normal distribution.
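If the normal distribution approach is chosen, the proportion of competitors in each weight band could be found along the following lines. This is only a sketch under stated assumptions: the values of MEAN_KG and SD_KG are hypothetical placeholders (not figures from the ABS report) and must be replaced with the mean and standard deviation from the source table, and band_proportions is a helper name invented for this illustration.

```python
from statistics import NormalDist

# HYPOTHETICAL placeholder values, not taken from the ABS report.
# Replace them with the mean and standard deviation from the source table.
MEAN_KG = 80.0
SD_KG = 15.0

# Boundaries between weight bands (kg), matching the bands in the table above.
EDGES = [50, 60, 70, 80, 90, 100, 110]


def band_proportions(mean, sd, edges):
    """Return the proportion of a normal population falling in each band.

    The first band is everything below edges[0]; the last band is
    everything at or above edges[-1].
    """
    dist = NormalDist(mu=mean, sigma=sd)
    cumulative = [0.0] + [dist.cdf(edge) for edge in edges] + [1.0]
    return [upper - lower for lower, upper in zip(cumulative, cumulative[1:])]


proportions = band_proportions(MEAN_KG, SD_KG, EDGES)
labels = ["< 50"] + [f"{a} to < {b}" for a, b in zip(EDGES, EDGES[1:])] + ["110 or more"]
for label, p in zip(labels, proportions):
    print(f"{label:<14}{100 * p:6.1f} %")
```

The calculation would be run once for males and once for females, each with its own mean and standard deviation, and the resulting proportions multiplied by the assumed number of competitors of that gender.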

 

Analysing and interpreting

This is the stage of our investigation where we "do the maths". It is important that we work carefully and systematically to ensure that our results and conclusions are accurate.

Questions

7. List the assumptions that must be made in order to use the data, such as:

  • Assuming there are equal numbers of male and female runners, that is, a population of 32500 male and 32500 female competitors.
  • Assuming competitors' weights are distributed similarly to those of the overall population.
  • Assuming t-shirt sizes can be selected by weight alone, using some judgement together with the manufacturer's chart; for example, runners who weigh between ___ kg and ___ kg should prefer a size 'L' shirt.
  • Does the list of weight categories for the t-shirts match the weight intervals given in the ABS document? Do you need to make an assumption to use them? (Not required if using a normal distribution.)

8. Use the data together with your assumptions to estimate the number of shirts required for each size, and record the results in a table similar to the one shown below. (Show the weight interval for each size in the first row.)

Total t-shirt quantities

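Weight (kg)   ___   ___   ___   ___   ___   ___   ___
Size          XS    S     M     L     XL    XXL   XXXL
Male          ___   ___   ___   ___   ___   ___   ___
Female        ___   ___   ___   ___   ___   ___   ___
Total         ___   ___   ___   ___   ___   ___   ___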

 

We have now determined the quantities of each size of t-shirt that we need to order.
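The calculation in question 8 could be set out along the following lines. This is a minimal sketch rather than the required method. The percentages are copied from the measured-weight table, and the split of 32500 males and 32500 females is the assumption listed in question 7. The mapping from weight band to shirt size (BAND_TO_SIZE) is purely hypothetical and should be replaced with your own reading of the manufacturer's size chart. Note that the published percentages in each column add to 99.9 because of rounding, so the estimated totals fall slightly short of 32500 per gender.

```python
# Percentages of (males, females) in each measured-weight band,
# copied from the weight table in this investigation.
WEIGHT_BANDS = {
    "Less than 50": (0.2, 6.8),
    "50 to < 60": (3.8, 27.2),
    "60 to < 70": (15.6, 32.9),
    "70 to < 80": (28.8, 18.6),
    "80 to < 90": (27.0, 8.4),
    "90 to < 100": (14.7, 3.5),
    "100 to < 110": (6.7, 1.5),
    "110 or more": (3.1, 1.0),
}

# HYPOTHETICAL mapping from weight band to shirt size, for illustration only.
# Replace it with a mapping based on the manufacturer's size chart.
BAND_TO_SIZE = {
    "Less than 50": "XS",
    "50 to < 60": "S",
    "60 to < 70": "M",
    "70 to < 80": "L",
    "80 to < 90": "XL",
    "90 to < 100": "XXL",
    "100 to < 110": "XXXL",
    "110 or more": "XXXL",
}

SIZES = ["XS", "S", "M", "L", "XL", "XXL", "XXXL"]
ENTRANTS_PER_GENDER = 32500  # assumption: equal numbers of males and females


def estimate_quantities(bands, band_to_size, entrants_per_gender):
    """Return {size: {"Male": n, "Female": n, "Total": n}} in whole shirts."""
    expected = {size: [0.0, 0.0] for size in SIZES}
    for band, (male_pct, female_pct) in bands.items():
        size = band_to_size[band]
        expected[size][0] += entrants_per_gender * male_pct / 100
        expected[size][1] += entrants_per_gender * female_pct / 100
    quantities = {}
    for size, (male, female) in expected.items():
        male_shirts, female_shirts = round(male), round(female)
        quantities[size] = {
            "Male": male_shirts,
            "Female": female_shirts,
            "Total": male_shirts + female_shirts,
        }
    return quantities


quantities = estimate_quantities(WEIGHT_BANDS, BAND_TO_SIZE, ENTRANTS_PER_GENDER)
print(f"{'Size':<6}{'Male':>8}{'Female':>8}{'Total':>8}")
for size in SIZES:
    row = quantities[size]
    print(f"{size:<6}{row['Male']:>8}{row['Female']:>8}{row['Total']:>8}")
```

Changing BAND_TO_SIZE, or the assumed gender split, and re-running the calculation is a quick way to see how sensitive the order quantities are to each assumption.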

Often it is a good idea to construct a graph to represent our results. This is an excellent way to see for ourselves if the results appear to be reasonable and can also be used when we want to communicate our results and conclusions.

Rather than just presenting a graph without any explanation, we should describe the characteristics of the distribution that the graph displays. If relevant, we should refer to skew or symmetry, clusters and gaps, outliers or any other important information.

Questions

9. What type of graph would be best suited to displaying the results of our analysis? Use technology to construct graphs showing the t-shirt quantities for each gender and the total.

10. Using mathematical terminology, how could we describe the distribution of t-shirt sizes resulting from your calculations?

 

You will recall that the statistical investigation process is represented as a cyclical process. Now that we have produced our results, we need to consider whether they are sufficient for our needs, whether our assumptions are valid, and whether we need to refine our methods further to get more accurate results.

Competitors will be disappointed if they cannot get the size of t-shirt that they need. Perhaps we should order some extra t-shirts in each size. However, we don't want to be too wasteful, and the cost of the extra t-shirts will reduce the amount of money that is raised for charity.
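The trade-off between ordering spare shirts and the cost to the charity can be explored with a short calculation. The sketch below is illustrative only: the quantities, the 5% buffer and the unit cost of 8 (in whatever currency applies) are all hypothetical values, and order_with_buffer and extra_cost are helper names invented for this example.

```python
def order_with_buffer(quantities, buffer_percent):
    """Add a whole-number percentage buffer to each size.

    Extra shirts are rounded up, so every size gets at least the
    stated buffer. buffer_percent must be a non-negative integer.
    """
    order = {}
    for size, needed in quantities.items():
        extra = -(-needed * buffer_percent // 100)  # integer ceiling division
        order[size] = needed + extra
    return order


def extra_cost(quantities, order, unit_cost):
    """Return the cost of the shirts ordered beyond the estimate."""
    extra_shirts = sum(order[size] - quantities[size] for size in quantities)
    return extra_shirts * unit_cost


# HYPOTHETICAL estimates, buffer and unit cost, for illustration only.
estimated = {"S": 1000, "M": 2001, "L": 1500}
order = order_with_buffer(estimated, 5)
cost = extra_cost(estimated, order, unit_cost=8)
print(order)
print("Cost of spare shirts:", cost)
```

Repeating the calculation with different buffer percentages shows how quickly the cost of spare shirts grows, which helps when justifying the size of any buffer in the report.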

We should certainly review our decisions after the fun run competitors have requested t-shirt sizes, so that we can see whether our estimates were accurate and be better prepared for the event next year. The sizes requested at this event would provide a relevant sample on which to base the order for the next event.

Questions

11. Do the results of our calculations enable us to answer the statistical question that we posed?

12. Were the assumptions made reasonable?

13. Would another measurement be better to use than weight to select the best t-shirt size (e.g. height, Body Mass Index)? Justify your proposal.

14. How would you get more accurate information about the typical age and gender of fun-run participants? Explain your ideas.

 

Communicating

Most often, a mathematical investigation is communicated through a written report, but sometimes it might be more appropriate to make a poster, a slide presentation, a video or even a verbal report.

In any case, the goal of our communications is to convey the important information to others in a systematic, clear and concise way that is best suited to the given task.

When we are creating a written report, there are some guidelines for organising the report so that readers can easily find the information that they need. The structure of the report should use headings to delineate sections, and we can use images and tables to convey information effectively.

Our report is meant to be a formal document, so typing is preferred over handwriting. If possible, equations and graphs should be properly typeset; this is not difficult with modern word processing and spreadsheet software.

A typical structure for a statistical investigation report would have these sections:

  • Introduction
  • Data
  • Analysis
  • Conclusions

The format described above will not be suitable for all investigations, so you may choose to add sections or break up these sections.

 

Introduction

The introduction presents an outline of the investigation. It needs to:

  • clarify the task, clearly stating the problem that you are addressing as a statistical question
  • describe the applicable circumstances
  • identify the mathematical and statistical content
  • state relevant and important assumptions

 

Data

In this section, we should describe, explain and justify the methods that we used to obtain data. Data can be presented in tables, graphs or lists, preferably using familiar mathematical formats.

  • if you are using primary data, you should explain
    • the choice of census or sample;
    • how you selected your sample;
    • the methods you used to collect the data (e.g. questionnaire, measurement, counting)
    • any problems that you encountered in collecting the data
  • if you have used secondary data, you should state the source of the data and provide relevant information on how it was collected
  • organise and display data, using lists, tables or graphs. When you have a large amount of data, it would be best to summarise the data in this section and refer to an appendix for the full data.

 

Analysis

The analysis section contains the mathematical calculations along with an explanation and justification of the interpretations leading to the conclusions that we draw.

  • describe, explain and justify the mathematical process used;
  • perform the mathematical analysis
    • define the required variables and constant parameters
    • calculations should be presented systematically and explained using mathematical language
    • clearly state the final results of our analysis
  • discuss strengths and weaknesses;
  • if appropriate, propose refinements to the investigation that would lead to stronger, or more useful, conclusions.

If the analysis is extensive, this section could just contain a summary of the mathematical analysis with references to further details in an appendix.

We could also choose to break the Analysis section into separate Results and Discussion sections.

 

Conclusions

The conclusion should be an interpretation of the mathematical and statistical results in the context of the investigation. It should be a concise statement of the most important information and must not introduce any new information.

A good conclusion should concisely:

  • restate the question;
  • state how data was obtained;
  • summarise the mathematical processes used to analyse the data and the results of analysis;
  • state your conclusions, in the context of the original question; and
  • describe important limitations.

Question

15. Write the complete statistical investigation report for this investigation, following the guidelines provided.
