A method of statistically testing a hypothesis by comparing observed data against the values predicted by the hypothesis. A significance test considers two hypotheses, the null and the alternative, and assesses whether the data gives significant reason to support the alternative hypothesis or whether the results could plausibly be due to chance alone.
When the data does not provide significant evidence for the alternative hypothesis, we don’t “accept” the null hypothesis; we “fail to reject” it. This means the data is not sufficiently persuasive for us to choose the alternative hypothesis over the null hypothesis, but it doesn’t prove the null hypothesis either.
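The wording of the two possible outcomes can be sketched in a few lines of Python. This is only an illustration: the function name decide and the conventional significance level of 0.05 are assumptions and do not come from the text above.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Word the outcome of a significance test.

    alpha is the significance level; 0.05 is an assumed, conventional default.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    # A large p-value is not evidence that the null hypothesis is true,
    # so the outcome is never worded as "accept the null hypothesis".
    return "fail to reject the null hypothesis"


print(decide(0.03))
print(decide(0.40))
```

Neither outcome is phrased as a proof: the function only reports whether the evidence was strong enough to reject the null hypothesis.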
It is important to choose the applicable test for each scenario. Some examples include:
One-sample vs two-sample tests
There are several types of t-test and z-test, the most basic being a one-sample, one-tailed test. A one-sample test compares the mean of a single sample against a hypothesised value of the population mean. A two-sample test instead compares the locations of two populations using a sample from each, usually with the null hypothesis that the two populations are equal.
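As a rough illustration of the difference, the sketch below computes a z statistic for each case using only the Python standard library. It assumes the population standard deviations are known (the z-test setting) and that the two samples are independent. The function names and all of the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def one_sample_z(sample, hypothesised_mean, sigma):
    """z statistic for one sample against a hypothesised mean (sigma known)."""
    standard_error = sigma / sqrt(len(sample))
    return (mean(sample) - hypothesised_mean) / standard_error


def two_sample_z(sample_a, sample_b, sigma_a, sigma_b):
    """z statistic for the difference in means of two independent samples.

    The null hypothesis is that the two population means are equal.
    """
    standard_error = sqrt(
        sigma_a**2 / len(sample_a) + sigma_b**2 / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical measurements, invented purely for illustration.
group_a = [52, 48, 55, 51, 49, 53, 50, 54]
group_b = [47, 50, 46, 49, 48, 51, 45, 48]

# One sample compared with a hypothesised mean of 50.
z_one = one_sample_z(group_a, hypothesised_mean=50, sigma=3)

# Two samples compared with each other.
z_two = two_sample_z(group_a, group_b, sigma_a=3, sigma_b=3)

print(f"one-sample z = {z_one:.3f}")
print(f"two-sample z = {z_two:.3f}")
```

When the population standard deviation is unknown and has to be estimated from the sample, a t-test would normally be used instead; the one-sample versus two-sample distinction is the same.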
One-tailed vs two-tailed tests
A one-tailed test provides a p-value for a single direction, testing either whether the sample mean is significantly higher or whether it is significantly lower than the hypothesised mean, whereas a two-tailed test checks both directions. If you are only looking for one specific alternative hypothesis, a one-tailed significance test could be sufficient and may be deemed preferable because, at the same significance level, it places the whole rejection region in one tail and so needs a less extreme result in that direction to reject the null hypothesis than the two-tailed test. However, many statisticians argue you should always assess both possible alternative hypotheses because you can’t be certain about the direction of deviation from the null hypothesis prior to the experiment.
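The effect of the number of tails on the p-value and on the rejection threshold can be sketched with the standard normal distribution. This is a minimal example for a z statistic, not a general implementation. The function names, the value z = 1.8 and the significance level of 0.05 are assumptions chosen for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def p_value(z, alternative="two-sided"):
    """p-value of a z statistic for the chosen alternative hypothesis."""
    if alternative == "greater":      # one-tailed: mean is higher
        return 1 - STANDARD_NORMAL.cdf(z)
    if alternative == "less":         # one-tailed: mean is lower
        return STANDARD_NORMAL.cdf(z)
    if alternative == "two-sided":    # two-tailed: mean differs either way
        return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


def critical_value(alpha, two_tailed):
    """Smallest |z| that rejects the null hypothesis at level alpha."""
    # A two-tailed test splits alpha between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return STANDARD_NORMAL.inv_cdf(1 - tail_area)


z = 1.8        # hypothetical test statistic
alpha = 0.05   # assumed significance level

for alternative in ("greater", "less", "two-sided"):
    print(f"{alternative:>9}: p = {p_value(z, alternative):.4f}")

print(f"one-tailed critical z = {critical_value(alpha, two_tailed=False):.3f}")
print(f"two-tailed critical z = {critical_value(alpha, two_tailed=True):.3f}")
```

With these assumed numbers the one-tailed "greater" test rejects the null hypothesis at the 0.05 level while the two-tailed test does not, because the two-tailed p-value is twice the one-tailed p-value in the direction of the observed effect. The "less" test shows the cost of choosing the wrong direction in advance: it cannot detect this effect at all.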
It is important to decide correctly whether your significance test needs to be one-tailed or two-tailed. We usually want a two-tailed test, which looks for significance in either direction, whether or not it matches our intuition. For example, when measuring whether a new initiative has increased productivity, we shouldn’t rule out that it may have had the opposite effect. Similarly, you might be testing how paying for extra golf sessions affects a golfer’s performance. The null hypothesis is that the sessions make no difference; one alternative hypothesis is that they improve performance, but there is also another, less likely, alternative hypothesis that they worsen performance.