T-tests can be used with any sample sizes and the mean or standard deviation of the population do not need to be known. Although the t-test relies on the assumption of a normal distribution, its probability values are based on the t-distribution. The test is appropriate when either the population is normal or the sample is large enough that the Central Limit Theorem applies. A one-sample t-test compares the mean of one group against an hypothesised mean value, whereas a two-sample test compares the means of two groups.
Two-sample tests differ depending on the variances, before choosing which type of t-test to use the variances should be calculated to define them equal or unequal. If variances are unknown, you need to either assume them equal or unequal before deciding which procedure to follow. You might assume unequal variances if the two sample sizes greatly differ because variances tend to become smaller as sample sizes increase. There are also paired two-sample t-tests, which compare the means of two samples that can be directly related to each other (e.g. a weight loss experiment with the same participants’ before and after samples). Excel has tools for each of these:
To conclude whether a t-test supports or rejects the null hypothesis, a t-value test statistic is calculated and compared against the critical values at the relevant significant level. Unlike a z-test, t-tests have separate critical values for each sample size.
Common population variance
For a two-sample t-test, you can calculate the common population variance to determine whether or not you should assume equal variances. You firstly square the standard deviation of each of the samples to get their sample variances. Next, divide the larger sample variance by the smaller. If the result is less than 3 we can assume a ‘common population variance’, otherwise we should use the two-sample t-test assuming unequal variances.
An alternative is to use the f-distribution to compare variances, using the f-statistic of the two samples. The F.TEST function does this in Excel, if the result is greater than alpha we can assume a common variance between them.
The pooled estimate for common variance can also be calculated. It gives us a common variance for both the samples, the result will lie between the two variances but may be weighted closer to the sample with the most observations.
One-sample t-test worked example
We will compare the heights of our sample of 18 male athletes against the average height of males in the UK, which is 175.3cm.
Our sample mean is 178.2cm:
In most cases it is best to use a two-tailed test. In this example you may have the preconceived idea that heights will be above average but you should not rule out the possibility they could be below average.
The degrees of freedom for a one-sample test is simply n-1, 17 in this case. Next, we calculate the critical value using the Excel function T.INV(probability, deg_freedom), in this case the formula used is: =T.INV(0.99,17). The critical value is 2.567, so the critical range is between -2.567 and 2.567.
We can reject the null hypothesis at 1% significance if the difference between the sample and hypothesised means (178.2 – 175.3 = 2.93) is greater than or equal to the critical value multiplied by the sample standard deviation, divided by the square root of n:
The critical value of 2.567 multiplied by 0.85 = 2.18. As 2.93 >= 2.18 we can reject the null hypothesis at 1% significance, providing strong evidence against the null hypothesis.
We can also use these calculations for the t-value (or t-statistic), which is simply the 2.92 (sample mean – hypothesised mean) divided by the 0.85 (s/sqrt(n)) which gives us a t-statistic of 3.45.
The t-statistic is then used to quantify whether we support or reject the null hypothesis. Using the T.DIST Excel function, the degrees of freedom (17) and the t-stat (3.45) we can calculate the probability that the T random variable is >= t-stat: =1-T.DIST(3.45,17,TRUE) which equals 0.001529. As this is a two-tailed test we need to double that to 0.0030581 which is still way short of our 0.01 p-value and further confirmation we reject the null hypothesis.
Confidence interval for a one-sample t-test
To work out the confidence interval for a population mean, the lower limit is point estimate – (critical value) x ESE and the upper limit is point estimate + (critical value) X ESE where ESE is the estimated standard error of the point estimate.
For a t-test, the CONFIDENCE.T Excel function will quickly perform this calculation. The function has three required arguments: Alpha, the standard deviation and the sample size and you can deduct it from the sample mean for the lower limit or add it to the sample mean for the upper. Referring to our one-sample t-test worked example:
Lower limit: =178.2 – CONFIDENCE.T(0.01, 3.6, 18)
Upper limit: =178.2 + CONFIDENCE.T(0.01, 3.6, 18)
99% confidence interval: (175.74, 180.66)
Two-sample t-test worked example
This time we split the athletes into two groups based on age, with teenage athletes compared against those aged 20+.
Comparing the two samples, they both contain 9 observations but the Under 20s have taller average heights. As the skewness in each sample is between 1 and -1 we can be confident of an approximately normal distribution, the closer the kurtosis is to zero the more normal the distribution is likely to be too.
Next we need to check the variance to see whether we should assume equal or unequal variances. Using the F.TEST() function we can test the probability that the variances are not significantly different: =F.TEST(A2:A10,B2:B10). This returns 0.61, as this is significantly above Alpha for this experiment (0.05) we can assume equal variances. Had the value been below Alpha, we might choose to use the VAR() function on each array to confirm that we’d need to use an unequal variances test instead.
Now it’s time to use the Analysis ToolPak to choose the relevant test, in this case t-Test: Two-Sample Assuming Equal Variances.
Enter the cell ranges for Variables 1 and 2, in this case we have ticked that row labels are included in those ranges. Our hypothesised mean is zero, matching the null hypothesis that there is no difference between the heights of the two groups and we are using an Alpha of 0.05 so we’re working to a 5% significance level.
The output includes all the key statistics from the test, including the variances, degrees of freedom and the t-statistic but the most important in assessing your hypothesis is the p-value for the relevant tail. In this case the p-value for a two-tailed test is 0.187897.
This p-value essentially means there is around a 19% chance of seeing a discrepancy like this between the two age groups by random chance. As this exceeds or Alpha of 0.05 (5%) we do not reject the null hypothesis in this instance.
If you don’t require the additional detail from the Analysis ToolPak option, you can fast-track much of this process by going straight to the T.TEST() function. You simply input the two arrays, state whether the test is one or two tailed and state which of the three t-tests you’re choosing:
The T.TEST function would take you straight to the 0.187897 p-value for the two-tail test in this example.
Matched pairs t-tests
The matched pairs t-test compare means of two samples which are normally distributed and the observations can be paired naturally. The differences between the pairs are calculated for each observation and then those differences have a one-sample t-test applied to them to test for any significant difference between the means.