• Laura Whiting

Abbreviating Commands, Options and Variable Names in Stata

Stata allows abbreviations of commands, variable names, and options. This is often handy when typing commands interactively during exploratory analysis. However, we recommend always using the full (unabbreviated) names in the final versions of your do-files.


As a general rule, command, option, and variable names may be abbreviated to the shortest string of characters that uniquely identifies them. The exception is any command or option that does something that cannot easily be undone, such as clear, drop, or replace; these must be spelled out in their entirety.


Command Abbreviation

The shortest allowed abbreviation for a command or option can be determined by looking at the command’s syntax diagram, where the minimal abbreviation is shown by underlining.

If there is no underlining, no abbreviation is allowed. For example, replace may not be abbreviated, the underlying reason being that replace changes the data with no easy way of changing it back.


On the other hand, regress can be abbreviated to reg, regr, regre, or regres, or it can be spelled out in its entirety.
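
As a rough illustration of these rules, the sketch below runs the same regression with the full command name and with its shortest abbreviation, then tries an abbreviated replace, which Stata should reject. It assumes the auto example dataset that ships with Stata (loaded with sysuse); the variables price, mpg, and weight come from that dataset and are used here only as stand-ins.

```stata
* Load Stata's bundled example dataset (an assumption for illustration)
sysuse auto, clear

* Full command name
regress price mpg weight

* Shortest allowed abbreviation; should produce the same output
reg price mpg weight

* replace cannot be abbreviated, so this line should fail;
* capture noisily displays the error without stopping a do-file
capture noisily repl price = price / 1000
```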


Option Abbreviation

Option abbreviation follows the same logic as command abbreviation: you determine the minimum acceptable abbreviation by examining the command’s syntax diagram. In the syntax diagram for summarize, for example, only the first letter of each of the options detail and format is underlined.

The option detail may be abbreviated d, de, det, …, detail. Similarly, the option format may be abbreviated f, fo, …, format.


If you were to use summarize with the options detail and format in their most-abbreviated form, you would type d and f after the comma, as in summarize varlist, d f.
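
A minimal sketch of that command, assuming Stata's bundled auto dataset and its price variable as stand-ins for your own data:

```stata
* Load Stata's bundled example dataset (an assumption for illustration)
sysuse auto, clear

* Most-abbreviated options: d for detail, f for format
summarize price, d f

* Equivalent unabbreviated form, as recommended for final do-files
summarize price, detail format
```
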

Variable-Name Abbreviation

Variable names may be abbreviated to the shortest string of characters that uniquely identifies them given the data currently loaded in memory.


Suppose your dataset includes two variables, dvrate and drate. You could refer to the variable dvrate as dvrat, dvra, dvr, or dv. You could not refer to it as d, because both variable names begin with d and so the abbreviation does not uniquely identify either one. You could, however, refer to all variables that start with the letter “d” by using d*.
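
The example above can be tried out with a small throwaway dataset. This is only a sketch: the five observations and their values are arbitrary placeholders, and only the variable names dvrate and drate come from the text.

```stata
* Build a small dataset with the two variables from the example
* (the values are arbitrary placeholders)
clear
set obs 5
generate dvrate = _n
generate drate = 2 * _n

* Make sure variable abbreviation is on (Stata's default setting)
set varabbrev on

* dv uniquely identifies dvrate
summarize dv

* d matches both variables, so Stata should report an ambiguous
* abbreviation; capture noisily shows the error and carries on
capture noisily summarize d

* d* is a wildcard and selects every variable starting with d
summarize d*
```

For a final do-file, set varabbrev off makes Stata reject abbreviated variable names altogether, which is one way to enforce the recommendation to spell names out.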


© 2020 by Survey Design and Analysis Services. 
