• Laura Whiting

The estat ovtest Command - Linear Regression Post-estimation

This command performs the Ramsey (1969) RESET test (REgression Specification-Error Test), which applies specifically to linear regression models. There are three common sources of specification error in a linear regression model. The first is that the model contains unnecessary extra variables; the second is that the model is missing an important variable; and the third is functional form misspecification (the type of regression you are using cannot adequately describe the relationship between the variables). For a linear regression, functional form misspecification means you are applying a linear model to one or more non-linear relationships.

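The RESET test can be used to look for two of these types of misspecification: omitted variables or functional form misspecification, depending on the options you choose when you run the test. By default the test is based on powers of the fitted values; with the rhs option it is based on powers of the explanatory (right-hand-side) variables instead.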
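To make the mechanics more concrete, the default version of the test can be approximated by hand. The sketch below is an illustration under assumptions, not the exact internal routine: the model (price on mpg and weight in the auto dataset) and the variable names yhat, yhat2, yhat3 and yhat4 are hypothetical choices, and the use of the second to fourth powers follows my understanding of Stata's documentation.

```stata
* Manual sketch of the default RESET test (powers of the fitted values).
* The model below is an illustrative assumption.
sysuse auto, clear
regress price mpg weight

* Fitted values, rescaled to the 0-1 range so high powers stay numerically stable
predict double yhat, xb
summarize yhat
replace yhat = (yhat - r(min)) / (r(max) - r(min))

* Add the 2nd, 3rd and 4th powers of the fitted values to the original model
generate double yhat2 = yhat^2
generate double yhat3 = yhat^3
generate double yhat4 = yhat^4
regress price mpg weight yhat2 yhat3 yhat4

* Joint F test that the coefficients on the added powers are all zero
test yhat2 yhat3 yhat4
```

If the added powers are jointly significant, the original linear specification is leaving out systematic structure. The rhs version follows the same idea, but the added terms are powers of the individual explanatory variables rather than of the fitted values.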

How to Use:

*Perform linear regression, then

estat ovtest

OR to check for nonlinearity (functional form misspecification)

estat ovtest, rhs

Worked Example:

In this example I use the auto dataset. I am going to generate a linear regression, and then use the estat ovtest command to look for specification errors.
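A minimal sketch of that workflow is shown here. The choice of model (price regressed on mpg and weight) is an illustrative assumption and may differ from the model behind the results discussed in this post, so the output you get may not match the interpretation given later.

```stata
* Load the auto dataset that ships with Stata
sysuse auto, clear

* Fit a linear regression (illustrative choice of variables)
regress price mpg weight

* Run the RESET test using powers of the fitted values (the default)
estat ovtest
```

The test must be run directly after regress, because estat ovtest works on the most recently fitted model.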

In the command pane I type the following:

This gives the following output in Stata:

Here you see the output from the regression analysis, followed by the RESET omitted variables test. Using a significance level of 0.05, the RESET test is not significant, so there is no evidence of omitted variables in the model.

If you ran this test and it indicated you were missing an important variable, the easiest way to find out what you might be missing is to look for linear relationships between the dependent variable and other variables that are not in your model. If you don’t have many other variables, you can do this visually using the graph matrix command. You can also look for correlations using the correlate command.
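As a rough sketch of this check, suppose the model regresses price on mpg and weight, and that length and displacement are candidate variables not yet in the model. Both the model and the candidate variables are assumptions made for illustration.

```stata
* Load the auto dataset that ships with Stata
sysuse auto, clear

* Scatterplot matrix of the dependent variable against current and candidate predictors
* (half shows only the lower triangle of the matrix)
graph matrix price mpg weight length displacement, half

* Pairwise correlations for the same set of variables
correlate price mpg weight length displacement
```

A candidate variable that shows a clear relationship with price in the first column of the matrix, or a sizeable correlation with it, is worth trying in the model before re-running the test.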

If you have included all relevant variables and you are still getting a significant p-value in this test, it means the omitted variable is likely one you do not have data for. This is usually due to an inability to collect data for this variable. In this case it indicates you are missing a significant influence on your dependent variable and therefore the model is of limited use.

I am going to perform the RESET test again, this time using the rhs option. This will let me know if there is any detectable non-linearity in my regression model.

In the command pane I type the following:

estat ovtest, rhs

This gives the following output in Stata:

Here you see the output from the second RESET test. Again using a significance level of 0.05, this time the result is statistically significant. This indicates there is non-linearity present in the model, and suggests that a linear regression model is not sufficient to explain the relationship between the dependent and independent variables.

If you get a result like this, you will need to do one of two things. You can transform variables (for example, taking the natural logarithm of one or more variables) so that the relationships between the transformed variables are closer to linear. Or you can abandon linear regression altogether and look for a more suitable model for your data. How you choose to deal with functional form misspecification in your linear regression will depend on what you know about your data and how confident you are in transforming variables in meaningful ways.
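The transformation route might look something like the sketch below. The model and the decision to log-transform price and weight are assumptions chosen for illustration, and lnprice and lnweight are hypothetical variable names. Whether a transformation is sensible depends on your own data.

```stata
* Load the auto dataset that ships with Stata
sysuse auto, clear

* Create natural-log versions of the dependent variable and one predictor
generate lnprice = ln(price)
generate lnweight = ln(weight)

* Refit the model using the transformed variables
regress lnprice mpg lnweight

* Re-run the RESET test on powers of the explanatory variables
estat ovtest, rhs
```

If the test is no longer significant after the transformation, the transformed model is a better candidate. Keep in mind that the coefficients now describe relationships on the log scale.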

© 2020 by Survey Design and Analysis Services. 