In the research literature, results of statistical tests are usually reported using the p-value. The p-value provides an objective measure of the strength of the evidence that the data supply against the null hypothesis. It is the probability of getting a result as extreme as, or more extreme than, the one observed if the proposed null hypothesis is correct.
A small p-value provides evidence against the null hypothesis, because data have been observed that would be unlikely if the null hypothesis were correct. Thus we reject the null hypothesis when the p-value is sufficiently small.
The 0.05 level is simply a convenient cutoff value adopted by convention. Values close to 0.05 provide moderate evidence against the null hypothesis, while values less than 0.01 provide considerable evidence against the null hypothesis.
One approach is to decide, before you do the study, the threshold below which a p-value will lead you to reject the null hypothesis. This is called the significance level of the test (e.g. significance level = 0.05), denoted by α, the Greek letter alpha.
Then the null hypothesis is rejected if p-value < significance level α, and the data are said to be "statistically significant" at level α.
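To make the decision rule concrete, here is a minimal sketch of an exact p-value calculation. The setting is an assumption chosen for illustration, not taken from the text: a coin is tossed 20 times, H0 says the coin is fair, and 16 heads are observed. The function name `binomial_two_sided_p` is hypothetical. "As extreme or more extreme" is taken to mean at least as far from the expected 10 heads, in either direction.

```python
from math import comb

def binomial_two_sided_p(heads: int, n: int) -> float:
    """Exact p-value for H0: P(heads) = 0.5, counting both directions."""
    expected = n / 2
    observed_gap = abs(heads - expected)
    # Count every outcome at least as far from n/2 as the observed one.
    extreme = sum(comb(n, k) for k in range(n + 1)
                  if abs(k - expected) >= observed_gap)
    # Under H0 each of the 2**n toss sequences is equally likely.
    return extreme / 2 ** n

alpha = 0.05  # significance level, fixed before looking at the data
p_value = binomial_two_sided_p(heads=16, n=20)
reject_h0 = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0 at alpha = {alpha}: {reject_h0}")
```

With these illustrative numbers the p-value is about 0.012, so H0 would be rejected at the 0.05 level but not at the 0.01 level.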
The null hypothesis is contrasted with the alternative hypothesis, denoted by H1 or HA, which usually refers to other possible values for the population parameter. It is a statement we hope or suspect is true instead of H0.
H1 may not specify a unique value for the population parameter. It can be a range of values and can be one-sided or two-sided.
If H1 is two-sided (e.g. compare H0 : µ = 100 vs H1 : µ ≠ 100) then the p-value is the probability of observing a statistic as extreme as or more extreme than that actually observed, in either direction. The test is called two-tailed (or two-sided).
If H1 is one-sided (e.g. compare H0 : µ = 100 vs H1 : µ > 100) then p-value = P(statistic as extreme as or more extreme than the observed value, only in the direction consistent with H1). The test is called one-tailed (or one-sided).
If you do not have a specific direction firmly in mind in advance, use a two-sided alternative hypothesis. It is rarely correct to use a one-sided test in practice.
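The difference between the one-sided and two-sided p-values can be sketched for a Z statistic as follows. The numbers (sample mean 104.5, known standard deviation 15, n = 36) are assumptions made up for illustration, and `z_test_p_values` is a hypothetical helper name. The normal distribution function comes from Python's standard `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_values(sample_mean, mu0, sigma, n):
    """Return (z, one-sided p for H1: mu > mu0, two-sided p for H1: mu != mu0)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # N(0, 1)
    # One-sided: only values above the observed z count as "more extreme".
    p_upper = 1 - std_normal.cdf(z)
    # Two-sided: values beyond |z| in either tail count.
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_upper, p_two_sided

z, p_upper, p_two_sided = z_test_p_values(sample_mean=104.5, mu0=100, sigma=15, n=36)
print(f"z = {z:.2f}, one-sided p = {p_upper:.4f}, two-sided p = {p_two_sided:.4f}")
```

In this made-up case the one-sided p-value falls below 0.05 while the two-sided p-value does not, which is why the direction of H1 has to be fixed before seeing the data.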
Two types of error can be made when testing a hypothesis:
i) Type I error: reject H0 (because the p-value is small) when H0 is true
ii) Type II error: do not reject H0 (because the p-value is not small) when H0 is false
| DECISION | TRUTH (unknown): H0 true | TRUTH (unknown): H0 false |
|---|---|---|
| Do not reject H0 | Correct decision | Type II error |
| Reject H0 | Type I error | Correct decision |
significance level = α = P(Type I error given H0 is true)
power = 1 − β, where β = P(Type II error given H0 is false)
Ideally studies should be designed so that the power, 1 − β, is at least 0.8. This requires using an efficient design and a sufficiently large sample.
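As a rough sketch of how power depends on sample size, the code below computes the power of a two-sided Z test with known standard deviation and searches for the smallest n that reaches a target power. The scenario (µ = 100 under H0, a true mean of 105, standard deviation 15) and both function names are assumptions for illustration only; real studies would normally use dedicated sample-size software.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a two-sided Z test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # When the true mean is mu1, Z is normal with this mean and variance 1.
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # H0 is rejected when Z > z_crit or Z < -z_crit.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

def smallest_n_for_power(mu0, mu1, sigma, target=0.8, alpha=0.05):
    """Smallest sample size whose power reaches the target (simple linear search)."""
    n = 2
    while two_sided_z_power(mu0, mu1, sigma, n, alpha) < target:
        n += 1
    return n

n_needed = smallest_n_for_power(mu0=100, mu1=105, sigma=15)
print(f"n = {n_needed}, power = {two_sided_z_power(100, 105, 15, n_needed):.3f}")
```

When the true mean equals the null value the same function returns α, which matches the definition of the significance level as the Type I error probability.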
When σ is not known

The sample mean $\bar{X} \sim N(\mu, \sigma^2/n)$. When σ is not known we estimate the standard deviation using the sample data.
So instead of calculating a test statistic $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ we use $t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$.
The test statistic t, which is calculated using s instead of σ, is not normally distributed. As both $\bar{X}$ and s are calculated from the data and hence are both random variables, t is the ratio of two random variables and is more variable than Z.
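The t statistic, (sample mean − hypothesised mean) / (s / √n), can be computed directly from a sample. The sketch below uses only Python's standard library; the ten data values and the null value of 100 are invented for illustration, and `t_statistic` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """One-sample t statistic for H0: mu = mu0."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (divisor n - 1)
    return (x_bar - mu0) / (s / sqrt(n))

# Illustrative data, not from the text.
sample = [102, 98, 110, 105, 99, 104, 108, 101, 97, 106]
t = t_statistic(sample, mu0=100)
df = len(sample) - 1
print(f"t = {t:.3f} on {df} degrees of freedom")
```

For these values the sample mean is 103 and t is roughly 2.18 on 9 degrees of freedom.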
The sampling distribution of t was derived by W. S. Gosset who wrote using the pseudonym "Student" - hence it is sometimes called Student's t-distribution.
The t-distribution has a similar shape to the normal distribution, but is somewhat flatter and has more area in the tails than the normal distribution. The shape depends on the "degrees of freedom" (n − 1), where n is the sample size, so it is often written as $t_{n-1}$.
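Comparison of the $t_1$ and $t_5$ distributions with the standard normal distribution, N(0,1):

| | N(0,1) | $t_5$ | $t_1$ |
|---|---|---|---|
| Mean | 0 | 0 | 0 |
| Variance | 1 | 5/3 | ∞ |
| Skewness | 0 | 0 | 0 |
| Kurtosis | 3 | 9 | ∞ |

Strictly speaking, the mean and skewness of $t_1$ (the Cauchy distribution) do not exist; the zeros in its column reflect its symmetry about 0.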
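The variance and kurtosis entries follow from the standard formulas for the t distribution with k degrees of freedom: variance k/(k − 2) when k > 2, and kurtosis 3 + 6/(k − 4) when k > 4. A small sketch, with hypothetical function names, that returns infinity whenever the moment is not finite (either infinite or undefined):

```python
import math

def t_variance(df: float) -> float:
    """Variance of the t distribution; math.inf when it is not finite."""
    return df / (df - 2) if df > 2 else math.inf

def t_kurtosis(df: float) -> float:
    """Kurtosis (not excess kurtosis) of the t distribution; math.inf when not finite."""
    return 3 + 6 / (df - 4) if df > 4 else math.inf

for df in (1, 5, 30, 1000):
    print(f"df = {df}: variance = {t_variance(df):.4f}, kurtosis = {t_kurtosis(df):.4f}")
```

As the degrees of freedom grow, both values approach the normal distribution's variance of 1 and kurtosis of 3.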
You can look up the t distribution either as a conventional table or, better, as a programmed function.
$P(t_5 > 2.015) = 0.05$
$P(t_{20} > 1.725) = 0.05$
$P(t_{50} > 1.676) = 0.05$
$P(t_\infty > 1.645) = 0.05$
Compare this with
$P(Z > 1.645) = 0.05$ where $Z \sim N(0, 1)$
Thus the $t_n$ distribution → N(0, 1) as n → ∞.
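These tail probabilities can be checked with a programmed function. Statistical packages provide one ready-made; the sketch below instead integrates the t density numerically with Simpson's rule, using only Python's standard library, so that nothing beyond the density formula is assumed. The function names `t_pdf` and `t_upper_tail` are hypothetical, and the step count is an arbitrary choice.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x: float, df: float) -> float:
    """Density of the t distribution with df degrees of freedom."""
    # Work on the log scale so that large df does not overflow the gamma function.
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def t_upper_tail(x: float, df: float, steps: int = 2000) -> float:
    """P(T > x) for x >= 0, via Simpson's rule on [0, x] (steps must be even)."""
    if x < 0 or steps % 2:
        raise ValueError("requires x >= 0 and an even number of steps")
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric about 0, so half the area lies above 0.
    return 0.5 - total * h / 3

for x, df in [(2.015, 5), (1.725, 20), (1.676, 50)]:
    print(f"P(t_{df} > {x}) = {t_upper_tail(x, df):.4f}")
print(f"P(Z > 1.645) = {1 - NormalDist().cdf(1.645):.4f}")
```

Each printed probability should come out close to 0.05, and calling `t_upper_tail(1.645, df)` with a very large `df` gives nearly the same value as the normal calculation.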
For a random sample of size n drawn from a population where X ~ N(µ, σ²), the t statistic $t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$ has the t distribution with n − 1 df, where $\bar{X}$ is the sample mean and s is the sample standard deviation.
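This result can be checked informally by simulation. The sketch below repeatedly draws samples of size 6 from a normal population, computes the t statistic each time, and records how often it exceeds 2.015, the upper 5% point of the t distribution with 5 df. The population values (mean 100, standard deviation 15), the number of trials and the seed are arbitrary assumptions, and `simulate_t_exceedance` is a hypothetical name.

```python
import random
from math import sqrt

def simulate_t_exceedance(n=6, threshold=2.015, trials=50_000,
                          mu=100.0, sigma=15.0, seed=1):
    """Fraction of simulated t statistics (n - 1 df) that exceed the threshold."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(xs) / n
        # Sample standard deviation with divisor n - 1.
        s = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
        t = (x_bar - mu) / (s / sqrt(n))
        if t > threshold:
            exceed += 1
    return exceed / trials

fraction = simulate_t_exceedance()
print(f"Fraction of t statistics above 2.015: {fraction:.3f}")
```

The fraction should be close to 0.05. Using the normal cutoff 1.645 as the threshold instead would give a noticeably larger fraction, reflecting the heavier tails of t.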