In hypothesis testing we are interested in whether there is sufficient evidence to reject the null hypothesis and accept the alternative hypothesis. We make that decision based on the probability, assuming the null hypothesis is true, of obtaining a sample statistic at least as extreme as the one observed. This probability is called the $p$-value. If the $p$-value is high, the sample statistic is consistent with the null hypothesis, because the sample statistic and the claimed population value are close to each other. If the $p$-value is very small, then a sample statistic like this would be unlikely if the null hypothesis were true, yet it did occur, so the sample does not match the claimed population value. This suggests that the null hypothesis may not be correct and should be rejected.
For example, consider a dispensing machine that is meant to deliver $200$ ml of water into each cup, so $H_0$ is $\mu=200$ and $H_1$ is $\mu\ne200$. Would a sample mean of $193$ ml be sufficiently different from $\mu=200$ to reject $H_0$? While a sample mean like this is entirely possible, it might be too improbable to put down to chance. Maybe there is something wrong with the machine that is causing this low amount?
Imagine that we repeat the experiment over and over again, calculating a sample mean from each batch of $30$ cups. From all of these sample means we could construct a frequency histogram.
What would the histogram look like?
We would expect most of the sample means to cluster close to the claimed average of $200$ ml, some higher and some lower. Some, however, would be further away (where the dispenser has come up short or delivered too much), but the frequency of these would reduce the further the sample means are from $200$ ml.
Such a histogram is referred to as a sampling distribution.
A sampling distribution is nothing more than a distribution of the sample means.
One of these sample means ($\overline{x}=193$) has been highlighted as a small red box.
Note how the sampling distribution is approximately normal with the mean at $200$ ml.
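The repeated experiment can be imitated with a short simulation. This is only a minimal sketch in Python, not part of the lesson: it assumes individual cups are normally distributed around $200$ ml with a standard deviation of $15$ ml (a made-up value, since no spread is given for the dispenser), and the seed, the number of batches and the bin width are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

CLAIMED_MEAN = 200.0   # ml, the population mean under H0
ASSUMED_SD = 15.0      # ml, hypothetical spread of individual cups
CUPS_PER_BATCH = 30
BATCHES = 5000         # arbitrary number of repeated experiments

# One sample mean per batch of 30 simulated cups
sample_means = []
for _ in range(BATCHES):
    batch = [random.gauss(CLAIMED_MEAN, ASSUMED_SD) for _ in range(CUPS_PER_BATCH)]
    sample_means.append(statistics.mean(batch))

# Crude text histogram: count the sample means falling in 2 ml wide bins
counts = {}
for m in sample_means:
    left = int(m // 2) * 2
    counts[left] = counts.get(left, 0) + 1
for left in sorted(counts):
    print(f"{left:>4}-{left + 2:<4} {'#' * (counts[left] // 50)}")

centre = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)
share_at_or_below_193 = sum(m <= 193 for m in sample_means) / BATCHES
print(f"mean of sample means: {centre:.2f} ml")
print(f"standard deviation of sample means: {spread:.2f} ml")
print(f"share of sample means at or below 193 ml: {share_at_or_below_193:.4f}")
```

Under these assumptions the bars pile up around $200$ ml and thin out on both sides, and only a small share of the simulated sample means fall as low as $193$ ml.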
If our null hypothesis is correct, then the difference between the population mean $\mu=200$ ml and the sample mean $\overline{x}=193$ ml should be fairly small. However, it's not as easy as just subtracting the two values. We must also take into consideration the spread, or standard deviation, of the sample data.
The $t$-statistic is very similar to a $z$-score with the standard normal distribution. It uses the population mean claimed by the null hypothesis ($\mu_0$), the sample mean ($\overline{x}$), the sample standard deviation ($s$) and the sample size ($n$) to calculate a value that indicates how far the sample mean is from the population mean.
If the population is approximately normally distributed, then the distribution of the $t$-statistics for samples of size $n$ is called a $t$-distribution with $n-1$ degrees of freedom ($df$).
The $t$-statistic is then used to calculate the probability of a result at least this extreme occurring (called the $p$-value), and hence to help decide whether the null hypothesis should be rejected.
$t=\frac{\overline{x}-\mu_0}{\frac{s}{\sqrt{n}}}$
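To see how the pieces of the formula fit together, here is a minimal sketch in Python. The function name `t_statistic` is made up for illustration, and the sample standard deviation of $15$ ml is an assumed value, because the dispenser example does not give one.

```python
import math

def t_statistic(sample_mean, mu_0, sample_sd, n):
    """t = (x_bar - mu_0) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu_0) / standard_error

# Dispenser example: x_bar = 193 ml, mu_0 = 200 ml, n = 30 cups.
# sample_sd = 15 ml is a hypothetical value chosen for illustration only.
t = t_statistic(sample_mean=193, mu_0=200, sample_sd=15, n=30)
df = 30 - 1  # degrees of freedom
print(f"t = {t:.3f} with {df} degrees of freedom")
```

A larger sample standard deviation or a smaller sample size makes the denominator bigger, which pulls $t$ towards zero even though the gap between $193$ and $200$ is unchanged.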
You will not be asked to calculate $t$-statistic values manually. You will use your calculator to find these values.
TI-nspire calculator instructions | Casio fx-CG 50 calculator instructions | TI-84 Plus CE calculator instructions |
---|---|---|
Choose A Calculate or Add Calculator | Press menu then select Statistics | Press stat |
Press menu then select 6 Statistics | Press F3 for TEST then F2 for t | Select 2: T-Test from the TESTS menu |
Press 7 Stat Tests then 2 t Test | Press F1 for 1-Sample | Use the Stats tab and type in the data |
Check the input method is Stats and type in the data | Type in the data | Highlight Calculate |
Press OK then OK to view the results | Scroll down to Execute and press EXE | Press enter to display the test statistic. Note that the p-value is also displayed. |
The company Speedy manufactures remote controlled cars, some of which come off the assembly line defective. Speedy believe that the proportion of defective cars coming off their assembly line is about $6\%$, but a recent random sample of $100$ cars contained $8$ defective ones, with a standard deviation of $2.37$. Speedy wonder if the proportion of defective cars is actually higher than $6\%$.
(a) Define the set of hypotheses for this situation.
Think: The null hypothesis is that the population mean is correct and the alternative is that it is larger.
Do: $H_0:\mu=0.06$, $H_1:\mu>0.06$
(b) Use your calculator to find the value of the $t$-statistic, correct to three significant figures.
Think: Use the appropriate statistics application on your calculator.
Do: Input the data $\mu_0=0.06$, $\overline{x}=0.08$, $S_x=2.37$, $n=100$
The value of the $t$-statistic is $0.0844$.
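The calculator value can be checked by substituting the inputs into the $t$-statistic formula. This working is added as a check and is not part of the original solution:

$t=\frac{\overline{x}-\mu_0}{\frac{s}{\sqrt{n}}}=\frac{0.08-0.06}{\frac{2.37}{\sqrt{100}}}=\frac{0.02}{0.237}\approx0.0844$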
Reflect: Consider the image below of the $t$-statistic plotted on the $t$-distribution curve. To find the probability of a score larger than this value we find the area under the curve to the right of the $t$-value.
The probability of observing a $t$-statistic at least as extreme as the one calculated, given that $H_0$ is true, is called its $p$-value. A low $p$-value indicates that a sample mean like this would be unlikely if $H_0$ were true, and the fact that it did occur is evidence against the null hypothesis (the claimed population mean in this case).
Similar to calculating probabilities with normal distributions, we can use our graphing calculator to determine the area under the curve of a $t$-distribution to the right or left of a given $t$-statistic to find the $p$-value.
Calculate the $p$-value for Example 1 above.
Think: The alternative hypothesis for this situation is that the population mean is greater than $0.06$. We know that the $t$-statistic for $\mu=0.06$ is $0.0844$.
We call this a 1-tailed test where $p=P(T>0.0844)$ for $T\sim t_{100-1}$.
Do: Use the $t$-test option on your calculator to find $p=P(T>0.0844)$. Note that many calculators display the $p$-value at the same time as the $t$-value, as described above in Example 1.
The probability of obtaining a $t$-statistic at least this large, given the null hypothesis, is $0.4665$. We can also say the $p$-value is $0.4665$.
This $p$-value means there is a $46.65\%$ chance that a sample with at least $8\%$ of cars defective will occur, given the population mean is $6\%$ of cars defective.
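For readers who want to see where a calculator's answer comes from, the area to the right of $t=0.0844$ can be approximated numerically. The sketch below is an illustration only and is not how any particular calculator works: it uses the standard density of the $t$-distribution and Simpson's rule from Python's standard library, and the function names `t_pdf` and `upper_tail_area` are made up for this example.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail_area(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|]."""
    width = abs(t)
    h = width / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(width, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    tail = 0.5 - area_zero_to_t  # the curve is symmetric about 0
    return tail if t >= 0 else 1 - tail

# 1-tailed test from the Speedy example: t = 0.0844 with 100 - 1 = 99 df
p_value = upper_tail_area(0.0844, df=99)
print(f"p = {p_value:.4f}")
```

The printed value should agree with the calculator result of about $0.4665$ to roughly three decimal places. Because the $t$-statistic is so close to zero, almost half of the area under the curve lies to its right.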
But how do we know if this is high enough to accept the null hypothesis?
The $p$-value is always compared to a pre-determined value called the significance level, $\alpha$. If the $p$-value falls below $\alpha$ then we may choose to reject $H_0$. Common significance levels are $0.01$, $0.02$, $0.05$ and $0.10$. Significance levels can be given as a decimal or as a percentage.
A significance level of $10\%$ means that, if $H_0$ is true, there is a $10\%$ chance of incorrectly rejecting it. In other words, there is a $10\%$ chance of making a Type 1 error.
A Type 1 error is when the null hypothesis is true, but we reject it.
A Type 2 error is when the null hypothesis is false, but we do not reject it.
If our hypothesis test is performed with a $10\%$ significance level, we compare our $p$-value with $10\%$. For Speedy's defective cars the $p$-value of $46.65\%$ is higher than $10\%$, so we say there is insufficient evidence to reject the null hypothesis.
The $p$-value is the probability of a test statistic at least as extreme as the one observed occurring if $H_0$ is true.
The higher the $p$-value, the harder it becomes to reject the null hypothesis.
The $p$-value is always compared to a pre-determined significance level, $\alpha$.
If $p<\alpha$ then we say we reject $H_0$.
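The comparison of $p$ with $\alpha$ is simple enough to write as a one-line rule. The sketch below is illustrative only: the function name `decide` is made up, and the third call uses hypothetical numbers to show a case where $H_0$ is rejected.

```python
def decide(p_value, alpha):
    """Apply the rule: reject H0 only when p < alpha."""
    if p_value < alpha:
        return "reject H0"
    return "do not reject H0"

# Speedy's defective cars at a 10% significance level
print(decide(0.4665, 0.10))
# A p-value of 0.15 against a 1% significance level
print(decide(0.15, 0.01))
# Hypothetical values, included only to show a rejection
print(decide(0.03, 0.05))
```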
The contents of $25$ cans of soft drink labelled as $300$ ml were measured and the results were as follows:
$301$, $298$, $289$, $302$, $299$, $295$, $300$, $301$, $297$, $298$, $300$, $301$, $302$, $299$, $295$, $303$, $297$, $301$, $300$, $302$, $302$, $299$, $298$, $296$, $302$
Quality control wants to check that can contents are accurate, using a $1\%$ significance level. Assume that can contents data is approximately normally distributed.
(a) Define the hypothesis set for this problem.
Think: $H_0$ is that the population mean is correct, in this case $300$ ml.
Do: Write $H_0:\mu=300$, $H_1:\mu\ne300$
(b) Find the $p$-value for the sample, correct to two decimal places.
Think: This time we haven't been given the sample statistics, but we can calculate them. First enter the data into List $1$ of your graphing calculator. Either use the statistics facility of your calculator to find the sample mean and standard deviation, or choose the DATA option when using your calculator to find the $t$-value/$p$-value.
Do: Write - the $p$-value is $0.15$.
Reflect: The calculator will also show that the $t$-statistic is $-1.48$.
We call this a 2-tailed test where $p=P(T>1.48)+P(T<-1.48)$ for $T\sim t_{25-1}$.
In other words, the shaded area of this diagram is $0.15$:
(c) State the sample mean correct to two decimal places.
Think: Your graphing calculator will find this from the data entered in List $1$.
Do: Write - $\overline{x}=299.08$
(d) Considering the significance level, does there appear to be a can contents problem for this brand? Explain your answer.
Think: Compare the $p$-value of $15\%$ with the significance level of $1\%$.
Do: Write - No, there does not appear to be a problem, and the mean amount of $300$ ml will not be rejected because the $p$-value is higher than the significance level of $1\%$. The sample mean is plausible for $\mu=300$ at this significance level and we do not reject the null hypothesis.
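The whole soft drink example can be reproduced from the raw data as a check on the calculator results. This is a minimal sketch using only Python's standard library: the sample statistics come from the `statistics` module, and the 2-tailed area is approximated with Simpson's rule applied to the $t$-distribution density. The helper names `t_pdf` and `two_tailed_p` are made up for illustration, and the numerical method is a stand-in for whatever the calculator does internally.

```python
import math
import statistics

contents = [301, 298, 289, 302, 299, 295, 300, 301, 297, 298, 300, 301, 302,
            299, 295, 303, 297, 301, 300, 302, 302, 299, 298, 296, 302]
mu_0 = 300    # ml, the labelled amount under H0
alpha = 0.01  # 1% significance level

n = len(contents)
x_bar = statistics.mean(contents)
s = statistics.stdev(contents)  # sample standard deviation (divides by n - 1)
t = (x_bar - mu_0) / (s / math.sqrt(n))
df = n - 1

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Approximate P(T > |t|) + P(T < -|t|) with Simpson's rule."""
    width = abs(t)
    h = width / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(width, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 2 * (0.5 - area_zero_to_t)  # both tails, by symmetry

p = two_tailed_p(t, df)
print(f"n = {n}, sample mean = {x_bar:.2f}, s = {s:.3f}")
print(f"t = {t:.2f}, df = {df}, p = {p:.2f}")
print("reject H0" if p < alpha else "do not reject H0")
```

The sample mean, $t$-statistic and $p$-value printed here should match the values found in parts (b) and (c) to the stated number of decimal places, and the final line applies the same comparison as part (d).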