This slide gives you the two different test statistics that are used for hypothesis testing for a mean of a continuous outcome when you have one sample.
The null hypothesis here is that the mean is equal to some known value, mu sub zero. The alternative hypothesis can take one of three forms: the mean is greater than mu sub zero, the mean is less than mu sub zero, or the mean is simply different from mu sub zero, without specifying a direction. You use the last form when you don't know in advance which direction the change from mu sub zero has taken.
The last alternative hypothesis, that the mean is either greater or less than mu sub zero, is called a "two-sided" test of hypothesis. The first two, that the mean is greater than mu sub zero or that the mean is less than mu sub zero, are called "upper-tailed" and "lower-tailed" tests, respectively, and together they are called "one-tailed" or "one-sided" tests. There are two test statistics shown here, one for large samples and the other for small samples. Depending on whether the sample is large or small, the critical value will come from either Table 1C (Z-statistics) or Table 2 (t-statistics).
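For reference, the two statistics are conventionally written as below. The slide itself is not reproduced in this transcript, so the notation is an assumption: x-bar for the sample mean, s for the sample standard deviation, n for the sample size. The cutoff between "large" and "small" is also an assumption here; it is often taken to be around n = 30.

```latex
Z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \quad \text{(large sample)}
\qquad
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \quad df = n - 1 \quad \text{(small sample)}
```

The two formulas are computed the same way; what changes is the reference distribution used to find the critical value.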
An example: The National Center for Health Statistics reports the mean total cholesterol for adults is 203. Is the mean total cholesterol in Framingham Heart Study participants significantly different? Suppose we have 3310 participants from Framingham who have a mean total cholesterol of 200.3 with a standard deviation of 36.8.
Is the difference we observe in Framingham statistically significant, or is the difference just due to chance?
We will run through the steps of the hypothesis test.
The first step is to set up the null and alternative hypotheses. The null is that there is no difference. We are asking if the mean cholesterol in Framingham is different, so I am going to use the two-sided alternative that the mean is not 203. And we will do the test at an alpha level of 5%. If the level isn't stated, just use 5%. The test statistic will be the Z-statistic, because we have a very large sample.
If you go to the table of Z-statistics for a two-sided test with alpha=0.05, the decision rule is to reject H0 if the Z-statistic is greater than or equal to 1.96 or less than or equal to negative 1.96. This is what we call a two-sided decision rule.
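If you don't have the table handy, the same critical values can be computed. This is a minimal sketch using Python's standard library; the function name `z_critical` and the labels "two-sided", "upper" and "lower" are made up for illustration and are not from the slides.

```python
from statistics import NormalDist


def z_critical(alpha, alternative="two-sided"):
    """Return the critical value(s) of a Z test at significance level alpha."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if alternative == "two-sided":
        # Split alpha evenly between the two tails.
        upper = std_normal.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    if alternative == "upper":
        # All of alpha sits in the right tail.
        return std_normal.inv_cdf(1 - alpha)
    if alternative == "lower":
        # All of alpha sits in the left tail.
        return std_normal.inv_cdf(alpha)
    raise ValueError(f"unknown alternative: {alternative!r}")


lower, upper = z_critical(0.05)
print(f"Reject H0 if Z <= {lower:.2f} or Z >= {upper:.2f}")
# prints: Reject H0 if Z <= -1.96 or Z >= 1.96
```

Notice that a one-sided test at the same alpha puts the whole 5% in one tail, so its critical value is smaller in magnitude than 1.96.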
Step 4 is to compute the test statistic by plugging in our observed sample mean (200.3), the mean for the null hypothesis (203), and the standard deviation divided by square root of "n". We get Z=-4.22, leading us to reject the null hypothesis, because -4.22 is less than the lower critical value of -1.96. So we have statistically significant evidence that the mean total cholesterol in Framingham is different from the national mean.
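The arithmetic in this step can be checked with a few lines of Python. This is only a sketch: the function name `one_sample_statistic` is hypothetical, and the numbers are the Framingham values quoted above.

```python
import math


def one_sample_statistic(sample_mean, null_mean, sample_sd, n):
    """(sample mean - null mean) divided by the standard error of the mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error


z = one_sample_statistic(sample_mean=200.3, null_mean=203, sample_sd=36.8, n=3310)
print(f"Z = {z:.2f}")  # prints: Z = -4.22

# Two-sided decision rule at alpha = 0.05 (critical values -1.96 and 1.96).
reject = z <= -1.96 or z >= 1.96
print("Reject H0" if reject else "Do not reject H0")  # prints: Reject H0
```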
The difference was relatively small, but it was statistically significant, because the Framingham sample is so large. So, we reject H0, but we want to indicate just how significant our data are. Here I copied alpha levels and Z statistics from the table. We used alpha=0.05 as the level of significance, and you see the corresponding critical value of 1.96 in the middle of the table.
We could have come to the same conclusion and rejected H0 even if we had selected a lower alpha level. An alternative definition for a p-value is that it is the smallest alpha level at which you still reject H0. So, could we have selected an alpha level less than 0.05 and still come to the same conclusion?
Well, we could have selected the smallest alpha on this table (0.0001) and still have rejected H0, because our decision rule would be to reject H0 if Z was greater than or equal to 3.891 or less than or equal to -3.891. We got negative 4.22, which is less than negative 3.891, so our p-value is less than 0.0001. That means that the data support the alternative hypothesis. We are rejecting H0 in favor of the alternative hypothesis, and if the null hypothesis were true, the probability of getting a test statistic at least this extreme would be less than 0.0001. This gives an idea of how sure we are about rejecting H0. The smaller the p-value, the more significant the data are. Most statistical computing packages will give you the observed level of significance, the p-value.
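To see the "smallest alpha at which you still reject" idea in action, here is a rough sketch that walks down a list of alpha levels and also computes the p-value directly. The alpha levels in the loop are illustrative and are not copied from the slide's table; the function name `two_sided_p_value` is likewise made up.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution


def two_sided_p_value(z):
    """Probability, under H0, of a Z-statistic at least as far from 0 as z."""
    return 2 * std_normal.cdf(-abs(z))


z = -4.22  # the Framingham test statistic from the example

# Try stricter and stricter alpha levels and see whether we still reject H0.
for alpha in (0.05, 0.01, 0.001, 0.0001):
    critical = std_normal.inv_cdf(1 - alpha / 2)
    decision = "reject H0" if abs(z) >= critical else "do not reject H0"
    print(f"alpha = {alpha}: critical value = {critical:.3f}, {decision}")

p = two_sided_p_value(z)
print(f"two-sided p-value = {p:.6f}")
```

Because H0 is still rejected at the strictest level tried, the p-value has to be smaller than that level, which is what the direct calculation confirms.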
Here is the rule used to interpret p-values, whether they come from statistical packages or are reported in papers: if p is less than or equal to alpha, then reject H0, where alpha is the level of significance that you select. An investigator can then decide whether or not to reject the null hypothesis based on the p-value. So if alpha is 0.05, you would reject H0 for any reported p-value less than or equal to 0.05. If you decide to be stricter and use 0.01, you would only reject H0 if p is less than or equal to 0.01.
Suppose you had a two-sided test of whether the mean of a population is different from 30. We have a sample of 25 observations with a sample mean of 35 and a standard deviation of 10. It is a small sample, so we need a t-statistic. Using a statistical package (Excel, SAS, R, etc.) we get a t-statistic=2.50 and a two-sided p-value of 0.0194. Because we selected a level of significance of 0.05, we reject the null hypothesis, because the p-value is less than 0.05.
If we had selected a level of significance of 0.01, we would not reject the null hypothesis, because the p-value of 0.0194 does not fall below 0.01.
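The small-sample example can be reproduced from the summary statistics alone. The sketch below avoids third-party packages by integrating the t density numerically with Simpson's rule; that choice, and the helper names `t_pdf` and `t_two_sided_p`, are assumptions made for illustration. In practice you would more likely call a library routine such as `scipy.stats.t.sf`. Expect a p-value close to 0.02; the last digits can differ slightly from what a given package reports.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t|.

    The area from 0 to |t| is found with Simpson's rule (steps must be even)
    and doubled, because the t distribution is symmetric about 0.
    """
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # Simpson weights: 4 for odd i, 2 for even i
        total += weight * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total
    return 1 - central_area


n, sample_mean, sample_sd, null_mean = 25, 35, 10, 30
t_stat = (sample_mean - null_mean) / (sample_sd / math.sqrt(n))
p = t_two_sided_p(t_stat, df=n - 1)

print(f"t = {t_stat:.2f}, df = {n - 1}")  # prints: t = 2.50, df = 24
print(f"two-sided p-value = {p:.4f}")
for alpha in (0.05, 0.01):
    decision = "reject H0" if p <= alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
```

The same p-value leads to opposite decisions at the two alpha levels, which is why the level of significance should be chosen before looking at the result.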
Excel does not have a direct procedure for doing a one-sided test of hypothesis. So, you need to program Excel to compute a t-score or a Z-score, depending on the sample size, and then use Excel functions to compute the p-value.
This shows Excel functions that can be used to compute one-sided or two-sided p-values.
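The Excel formulas themselves are on the slide and are not reproduced in this transcript. As a rough Python analogue for a Z-score, the sketch below shows how the one-sided and two-sided p-values differ only in which tail area is used. The function name `z_p_value`, its `alternative` labels, and the example Z-score of 1.5 are all hypothetical.

```python
from statistics import NormalDist


def z_p_value(z, alternative="two-sided"):
    """p-value for a Z-score under the chosen alternative hypothesis."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution function
    if alternative == "upper":      # H1: mean is greater than mu sub zero
        return 1 - cdf(z)           # area to the right of z
    if alternative == "lower":      # H1: mean is less than mu sub zero
        return cdf(z)               # area to the left of z
    if alternative == "two-sided":  # H1: mean is different from mu sub zero
        return 2 * cdf(-abs(z))     # both tails beyond |z|
    raise ValueError(f"unknown alternative: {alternative!r}")


z = 1.5  # made-up Z-score, for illustration only
for alt in ("upper", "lower", "two-sided"):
    print(f"{alt}: p = {z_p_value(z, alt):.4f}")
```

For a t-score from a small sample, the same three tail areas apply, but they are taken from the t distribution with n - 1 degrees of freedom instead.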