9. Hypothesis Testing

9.01 Hypotheses

The legal system

The Australian legal system, like many others around the world, is a good example of the concept of hypothesis testing. The legal system is based on the idea that someone is innocent until proven guilty. In other words, the default position is that there is no guilt on the part of the defendant. Innocence does not have to be proven; it is assumed, and it is the job of the prosecutor to prove guilt beyond a reasonable doubt.

We can call the presumption of innocence the null hypothesis, labelled as $H_0$.

The alternative hypothesis, or opposite hypothesis, is that the defendant is guilty, labelled as $H_1$.

The jury has to make a decision, based on the evidence, as to whether the alternative hypothesis is likely or not.

In mathematics the hypotheses and evidence are usually defined by statistics such as mean and standard deviation. The process of determining whether an alternative hypothesis is valid is called Decision Theory.

Decision theory

Before we discuss statistical hypothesis testing, we need to understand the way in which statisticians develop information about populations.

When we talk about populations, we don't just mean groups of people as in the population of the country. In statistics the population consists of all of the measurements with which we are concerned.

For example, it might be the population consisting of heights of all Australian males or the population consisting of lengths of all of a certain type of fish in a particular lake.

A key role of a statistician is to infer (make general statements about) certain characteristics of a population by sampling some of it.

The information in a sample drawn from a population is often summarised using a sample statistic, such as the sample mean $\overline{x}$, and these sample statistics become estimates of the corresponding population parameters, such as the population mean $\mu$.

In general, the larger a random sample, the more closely the sample statistic tends to reflect the corresponding population value.

The method by which a statistician infers certain characteristics of a population in this way is generally known as Decision Theory or the Theory of Statistical Inference.

The diagram shows the concept. A sample (in this diagram of size $6$) is drawn from a population of unknown size and population mean $\mu$. A sample statistic (in this case $\overline{x}$) is calculated.

Using the sample mean, the statistician either tries to infer something about $\mu$ or else bring its value into question.

The confidence the statistician has in $\overline{x}$ as a dependable estimate of $\mu$ depends on a number of factors, such as the size of the sample taken and on how the sample was drawn.
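
The effect of sample size on the reliability of $\overline{x}$ can be illustrated with a short simulation. This is a minimal sketch rather than part of the lesson itself: the population (normally distributed heights with mean 170 cm and standard deviation 10 cm) and the function name `sample_mean_error` are assumptions chosen purely for illustration.

```python
import random
import statistics

# Hypothetical population parameters (assumed for illustration only).
POP_MEAN = 170.0  # population mean, mu
POP_SD = 10.0     # population standard deviation


def sample_mean_error(n, rng):
    """Draw a random sample of size n and return |x_bar - mu|."""
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    return abs(x_bar - POP_MEAN)


rng = random.Random(1)  # fixed seed so the run is repeatable
for n in (6, 60, 600, 6000):
    print(n, round(sample_mean_error(n, rng), 3))
```

The gap between $\overline{x}$ and $\mu$ tends to shrink as the sample size grows, although any single small sample can still land close to $\mu$ by chance.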

Statistical hypothesis testing

In the technique of statistical hypothesis testing, the statistician tests to check the validity of a claimed or hypothesised population mean $\mu$. Sometimes when random samples are taken, there can be a significant difference between $\overline{x}$ and $\mu$.

Hypothesis testing provides a method that allows the investigator to accept or reject that hypothesised $\mu$ based on the evidence provided by the sample statistics.

There are a number of different hypothesis tests that can be used. The following types will be explored in this chapter:

A one sample $t$-test is used to compare a claimed population mean with a sample mean. We are interested in whether the sample data suggests that the claimed population mean is correct or not.
A two sample $t$-test is used to compare two population means, given statistics from two samples. We are interested in whether the sample data suggests that the two population means are equal or not.
A $\chi^2$ goodness of fit test is used to test hypotheses about population proportions and whether observed values fit expected values or not.
A $\chi^2$ test of independence is used to test whether two variables are independent of each other or not. For example, are height and IQ independent?

The null hypothesis

Any hypothesis test starts by assuming that a claimed or hypothesised statement about the population, called the null hypothesis, is true. The null hypothesis is always the default hypothesis. It is the starting point for any hypothesis test.

From a historical perspective, scientists would apply certain treatments to things (people, animals and other objects) to see if those treatments had measurable effects. The default position would always be that a treatment had no effect unless sufficient evidence to the contrary was observed. This explains the origin of the term null hypothesis - it is the 'no effect' hypothesis.

Statisticians denote the null hypothesis $H_0$, where the subscript $0$ implies 'no effect'.

Any null hypothesis, such as $\mu=\mu_0$, is the statement that is under examination. Note that it is always denoted with an 'equals' sign.

It is the statement that is being challenged by the sample evidence.

If evidence can be produced that sufficiently brings into question the validity of the null hypothesis, then we can decide to reject it for some other alternative hypothesis.

This alternative hypothesis is usually denoted as $H_1$. In a one sample $t$-test where we are testing the null hypothesis that a population mean ($\mu$) is correct, the alternative hypothesis takes the form of either:

$\mu\ne\mu_0$, (two tailed hypothesis - population mean is not equal to a given number, therefore $\mu<\mu_0$ or $\mu>\mu_0$)

or $\mu<\mu_0$, (one tailed hypothesis - population mean is less than a given number)

or $\mu>\mu_0$, (one tailed hypothesis - population mean is more than a given number)
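
The three forms of the alternative hypothesis can be written out mechanically once the claimed mean and the direction of interest are known. The sketch below is illustrative only: the function `state_hypotheses` and its `tail` labels ("two", "lower", "upper") are hypothetical names, not standard notation.

```python
def state_hypotheses(mu_0, tail):
    """Return (H0, H1) as strings for a one sample test of a mean.

    tail is a hypothetical label: "two", "lower" or "upper".
    """
    symbols = {"two": "!=", "lower": "<", "upper": ">"}
    if tail not in symbols:
        raise ValueError("tail must be 'two', 'lower' or 'upper'")
    h0 = f"H0: mu = {mu_0}"                # the null always uses 'equals'
    h1 = f"H1: mu {symbols[tail]} {mu_0}"  # the alternative sets the tail
    return h0, h1


# Mean suspected to be different to a claimed 2.5 (two tailed).
print(state_hypotheses(2.5, "two"))
# Mean suspected to be greater than a claimed 2000 (one tailed).
print(state_hypotheses(2000, "upper"))
```

Only the alternative hypothesis changes with the wording of the question; the null hypothesis keeps the 'equals' sign in every case.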

Worked examples

Example 1

A researcher is interested in estimating the average number of children per family in a certain large city. The researcher randomly samples $200$ family units and determines a sample mean $\overline{x}=2.37$. She is concerned that $\overline{x}$ is different to the government gazetted mean of $2.5$ children per family for the entire city and decides to conduct a one sample $t$-test to check the hypothesis.

Write the null hypothesis and alternative hypothesis for this situation.

Think: The null hypothesis is always that the claimed mean is correct for the population. The alternative will be that the population mean is either less than, greater than, or not equal to the claimed mean. The key words in the question are 'is different to' which is the same as saying 'not equal to'.

Do: $H_0$: $\mu=2.5$

and $H_1$: $\mu\ne2.5$

Example 2

A manufacturer has a machine that produces light bulbs continuously on a production line. He tests a batch of $12$ randomly chosen bulbs to see how long they last when lit. He finds that the average life is $2035$ hours. The manufacturer is thinking about increasing the claimed average of $2000$ hours referred to on each of the cardboard light bulb boxes, but needs a sound statistical argument to support the change. He decides to conduct a one sample $t$-test to check the hypothesis.

Write the null hypothesis and alternative hypothesis for this situation.

Think: In this case the null hypothesis is that the claimed mean is correct for the population. The manufacturer suspects the actual mean is higher, therefore the alternative hypothesis uses 'is greater than'.

Do: $H_0$: $\mu=2000$

and $H_1$: $\mu>2000$

Example 3

A state football coach is interested in finding out whether the mean Yo-Yo test score of country athletes ($\mu_1$) is higher than that of city athletes ($\mu_2$). He decides to conduct a two sample $t$-test.

Write the null hypothesis and alternative hypothesis for this test.

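Think: The null hypothesis is always that there is no difference; in other words, the population means are the same. The coach wants to know whether country athletes score higher, so the alternative hypothesis is one tailed and uses 'is greater than'.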

Do: $H_0$: $\mu_1=\mu_2$

and $H_1$: $\mu_1>\mu_2$

Example 4

The colours in a packet of smarties are said to be evenly distributed by the manufacturer. Emma opens a packet of $60$ smarties and finds the distribution as follows:

Pink	Blue	Red	Yellow	Orange	Brown	Purple	Green
$6$	$8$	$10$	$7$	$6$	$6$	$9$	$8$

Emma decides to test if this sample is consistent with the manufacturer's claim using a $\chi^2$ goodness of fit test.

Write down the null hypothesis and alternative hypothesis for this test using words.

Think: The null hypothesis is that there is no difference between the observed distribution of colours and the distribution claimed by the manufacturer.

Do: The null hypothesis is that the manufacturer's claim is correct and the colours of smarties are evenly distributed.

The alternative hypothesis is that the manufacturer's claim is not correct and the colours of smarties are not evenly distributed.
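
To see what 'evenly distributed' means in numbers, the expected count for each colour under the null hypothesis can be compared with Emma's observed counts. The following is a minimal sketch using only the counts in the table above; it computes the usual $\chi^2$ statistic (the sum of (observed - expected)² / expected) and makes no decision about the hypotheses.

```python
# Observed counts from Emma's packet of 60 smarties.
observed = {
    "Pink": 6, "Blue": 8, "Red": 10, "Yellow": 7,
    "Orange": 6, "Brown": 6, "Purple": 9, "Green": 8,
}

total = sum(observed.values())    # 60 smarties in the packet
expected = total / len(observed)  # 7.5 per colour if H0 is true

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_squared = sum((o - expected) ** 2 / expected for o in observed.values())

print(total, expected, round(chi_squared, 3))  # 60 7.5 2.133
```

Whether a statistic of about $2.133$ is large enough to reject the null hypothesis depends on the significance level and the degrees of freedom, which this sketch does not cover.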

Example 5

A student wishes to investigate how often teachers exercise and decides to conduct a $\chi^2$ test of independence to see whether exercise is independent of subject area. She records each teacher's subject area and number of minutes of exercise per week in the table below:

	$0-60$	$60-180$	$180-360$	$360+$
Mathematics	$1$	$3$	$3$	$1$
Science	$4$	$2$	$2$	$0$
English	$2$	$3$	$1$	$0$
Physical education	$0$	$1$	$3$	$4$

State the null hypothesis for this study in words.

Think: The null hypothesis means the 'no effect' hypothesis. Therefore we start with the assumption that subject area has no effect on amount of exercise.

Do: The null hypothesis is that the number of minutes of exercise is independent of a teacher's subject area.

Reflect: You can also use the words 'is not associated with' instead of independent.
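
Under this null hypothesis, the expected count in each cell is the row total multiplied by the column total, divided by the grand total. The sketch below applies that rule to the table above; the variable names are illustrative, and it stops at the $\chi^2$ statistic without drawing a conclusion.

```python
# Observed counts from the table: rows are subject areas, columns are
# minutes of exercise per week (0-60, 60-180, 180-360, 360+).
observed = {
    "Mathematics":        [1, 3, 3, 1],
    "Science":            [4, 2, 2, 0],
    "English":            [2, 3, 1, 0],
    "Physical education": [0, 1, 3, 4],
}

row_totals = {subject: sum(counts) for subject, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Under H0 (independence): expected = row total * column total / grand total.
expected = {
    subject: [row_totals[subject] * c / grand_total for c in col_totals]
    for subject in observed
}

# Chi-squared statistic summed over every cell of the table.
chi_squared = sum(
    (o - e) ** 2 / e
    for subject in observed
    for o, e in zip(observed[subject], expected[subject])
)

for subject, row in expected.items():
    print(subject, [round(e, 2) for e in row])
print("chi-squared:", round(chi_squared, 2))
```

With only 30 teachers, every expected count here is well below 5, so a common rule of thumb would treat the result of a $\chi^2$ test on this table with caution.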
