Getting Started in Stata - Student t-test
Today we are going to show you how to perform a Student's t-test in Stata using both drop-down menus and Stata commands. If you are a new Stata user, we recommend that you start by using the Stata menus to perform your analysis. Each analysis you run from a menu, such as a t-test, will show up in your Review pane (on the left side of the Stata screen) as the equivalent Stata command. This allows you to begin learning the general structure of commands and how to use them. Once you become more familiar with commands, you will find it is faster and easier to perform your analyses by typing them directly.
I am going to use the Stata example dataset auto.dta, which ships with Stata. I load this dataset using the following command:
sysuse auto, clear
Student t-test via Menus:
Statistics > Summaries, tables, and tests > Classical tests of hypotheses > t test (mean-comparison test)
Select “Two-sample using groups”, then select “mpg” under Variable name, and select “foreign” under Group variable name. Leave the confidence level as 95, which corresponds to a significance level of 0.05 as the cutoff for the p-value.
Student t-test via Stata Commands:
ttest mpg, by(foreign)
Stata prints the following to the Results pane (in the centre of the Stata window):
For this example I have performed a two-sample t-test on the “mpg” variable. To split my data into two groups (samples) I have selected the “foreign” variable. This variable records whether a car was made domestically or overseas. I am using the t-test to determine whether there is a significant difference between the mean mpg of foreign cars and the mean mpg of domestic cars. For this t-test I am assuming equal variances between groups; if your groups have unequal variances, you can specify this to get a more accurate t-test.
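If you do want to relax the equal-variance assumption, the ttest command has options for this. The lines below are a minimal sketch, assuming the same auto.dta example; which option suits your data is a judgment call and not something this example settles.

```stata
* Reload the example dataset so this block can be run on its own
sysuse auto, clear

* Two-sample t-test without assuming equal variances
* (uses Satterthwaite's approximation for the degrees of freedom)
ttest mpg, by(foreign) unequal

* Same test, using Welch's approximation for the degrees of freedom
ttest mpg, by(foreign) welch
```

With either option the degrees of freedom are no longer simply the total number of observations minus two, so expect the t statistic and p-value to differ somewhat from the equal-variance output.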
The first column, “Group”, allows us to identify individual statistics for each group, as well as combined statistics for both groups and the difference between the two groups. For this t-test our two groups are foreign-made and domestic-made. The second column, “Obs”, lists the number of observations we have. In this case we have 52 domestic cars and 22 foreign cars for a combined total of 74 cars. As there is quite a difference in sample sizes between the two groups this would normally prompt me to check for unequal variance before running this test. If you are concerned about your variances you can perform a variance-ratio test using the command sdtest. To learn more check out this Tech Tip: Testing Variance between Two Groups.
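As a rough sketch, the variance-ratio test mentioned above can be run on the same two groups like this (again assuming the auto.dta example):

```stata
* Reload the example dataset so this block can be run on its own
sysuse auto, clear

* Variance-ratio test: are the standard deviations of mpg
* equal for domestic and foreign cars?
sdtest mpg, by(foreign)
```

If this test suggests the variances differ, that is a reason to consider the unequal-variance form of the t-test.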
The third column, “Mean”, shows the average mpg for each sample, a combined average, and the difference between the two sample averages. The fourth column, “Std. Err.”, is the standard error of the mean. The fifth column, “Std. Dev.”, is the standard deviation of mpg within each group. Finally, the last two columns show the lower and upper bounds of the 95% confidence interval for each mean. Intervals constructed this way would contain the true mean in about 95% of repeated samples.
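To see where the “Std. Err.” and confidence interval columns come from, you can rebuild them by hand for one group. This is only an illustrative sketch for the domestic cars; the scalar names (grp_n, grp_mean, grp_se, t_crit) are my own and not anything Stata defines.

```stata
* Reload the example dataset so this block can be run on its own
sysuse auto, clear

* Summary statistics for domestic cars only (foreign == 0)
quietly summarize mpg if foreign == 0

* Standard error of the mean = standard deviation / sqrt(n)
scalar grp_n    = r(N)
scalar grp_mean = r(mean)
scalar grp_se   = r(sd) / sqrt(r(N))

* Critical t value for a 95% interval with n - 1 degrees of freedom
scalar t_crit = invttail(grp_n - 1, 0.025)

display "Std. Err.    = " grp_se
display "95% CI lower = " (grp_mean - t_crit * grp_se)
display "95% CI upper = " (grp_mean + t_crit * grp_se)
```

The values displayed should line up with the Domestic row of the t-test table.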
Below the table are some statistics relating to the t-test. The “diff” line shows that the difference was calculated as the foreign mean subtracted from the domestic mean. The “t” shows the t statistic for this t-test, which is used to calculate the p-value. The “Ho” shows the null hypothesis, in this case that the difference between the domestic and foreign means is zero. Across from the null hypothesis is the degrees of freedom (used with the t statistic to calculate the p-value), which for this test is 72, the total number of observations minus two. The last line shows three different alternative hypotheses. The middle alternative hypothesis is “diff != 0”, which states that the difference between the means does not equal zero. This hypothesis has a p-value of 0.0005, which is below the 0.05 significance level, so there is a statistically significant difference between the means of the two groups (foreign and domestic).
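The t statistic, degrees of freedom and p-value are also saved by Stata after the test, so you can check how they fit together. The sketch below assumes the equal-variance test on auto.dta and recomputes the two-sided p-value from the t statistic and degrees of freedom.

```stata
* Reload the example dataset so this block can be run on its own
sysuse auto, clear

* Run the t-test without printing the table
quietly ttest mpg, by(foreign)

* Results stored by ttest
display "t statistic        = " r(t)
display "degrees of freedom = " r(df_t)
display "two-sided p-value  = " r(p)

* Two-sided p-value recomputed from t and the degrees of freedom:
* ttail() gives the upper-tail probability of the t distribution
display "recomputed p-value = " 2*ttail(r(df_t), abs(r(t)))
```

Typing return list straight after a t-test shows everything the command has stored.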
The two alternative hypotheses on the left and right test whether the difference is less than or greater than zero, and allow you to draw some more specific conclusions about your data. Here the hypothesis that the difference is less than (<) zero (on the left) is statistically significant, while the hypothesis that the difference is greater than (>) zero (on the right) is not. Because “diff” is the domestic mean minus the foreign mean, a negative difference tells us that the mean mpg of foreign cars is statistically significantly higher than the mean mpg of domestic cars. Based on this t-test, foreign cars get better mileage on average.
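If you want to work with the one-sided results directly, they are stored alongside the two-sided p-value. A brief sketch, assuming the same equal-variance test on auto.dta:

```stata
* Reload the example dataset so this block can be run on its own
sysuse auto, clear

* Run the t-test without printing the table
quietly ttest mpg, by(foreign)

* Group means: group 1 is Domestic (foreign == 0), group 2 is Foreign
display "mean mpg, domestic = " r(mu_1)
display "mean mpg, foreign  = " r(mu_2)

* One-sided p-values for the left and right alternative hypotheses
display "Pr(T < t), Ha: diff < 0 = " r(p_l)
display "Pr(T > t), Ha: diff > 0 = " r(p_u)
```

The two one-sided p-values always sum to one, so only one of the directional hypotheses can be significant for a given test.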