The Sort and Order Commands

The sort and order commands organise your data so that it is easier to analyse and patterns are easier to see.

The sort command arranges the observations in ascending order of the specified variable(s). If you specify more than one variable, the data are sorted on the first variable, and then, within each value of the first variable, on the second variable. Each observation (row) is moved as a whole, so the values within a row always stay together.

The order command relocates a variable to another place in the dataset. You can put a variable (column) at the beginning or end of the dataset, as well as before or after another variable in the dataset. The default for this command is to move the variable to the beginning of the dataset.

How to Use:
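
As a rough sketch of typical usage, the lines below show both commands applied to the auto dataset that ships with Stata. The choice of rep78, make and price is only for illustration; any variables in your own dataset can be used in the same way.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* sort: ascending order on one or more variables
sort rep78
sort rep78 make

* order: move a variable to a new position
* (with no option, the variable goes to the beginning)
order rep78
order rep78, last
order rep78, before(price)
order rep78, after(make)
```

Note that sort only sorts in ascending order. If you need descending order, the related gsort command accepts a minus sign in front of a variable name, for example gsort -rep78.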

Worked Example:

I will use the auto dataset to demonstrate the sort and order commands. I will first use the list command to have a look at how my two variables are set up prior to sorting and ordering. In the command pane I type the following:
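
A minimal sketch of commands that would produce this kind of output, assuming the auto dataset is loaded with sysuse and that the two variables of interest are make and rep78:

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Show the variables in their current order
describe

* Show the two variables of interest for the first 10 observations
list make rep78 in 1/10
```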

This gives me the following output:

The describe command allows me to see the current order of my variables, and the list command with an [in range] qualifier shows me the current order of the first 10 observations in my dataset. I am now going to use the sort and order commands to alter the order of my variables and observations. In the command pane I type the following:
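
A sketch of what this step could look like, assuming rep78 is moved to sit directly after make and the data are then sorted on rep78 and make (this is consistent with the results described below):

```stata
* Move rep78 so that it sits immediately after make
order rep78, after(make)

* Sort by rep78, then alphabetically by make within each rep78 value
sort rep78 make

* Re-check the variable order and the first 10 observations
describe
list make rep78 in 1/10
```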

By comparing the two sets of describe and list output you can see the changes that have occurred. In the first describe the variables begin in the order "make price mpg rep78", and in the second this has changed to "make rep78 price mpg". This change is due to the order command. In the first list the observations appear in alphabetical order of the "make" variable. In the second list the observations are in order of their "rep78" values (from 1 to 5), and within each category of "rep78" they are ordered alphabetically by "make". So the cars with a repair record of 1 now all appear at the top, in alphabetical order by make.

Note: Some observations in this dataset are missing "rep78". Stata treats a numeric missing value as larger than any nonmissing number. For this reason, if you examine the auto dataset after it has been sorted as above, you will find that the cars missing "rep78" are all at the bottom of the dataset.
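
One way to check this for yourself, assuming the data have already been sorted on rep78 as in the worked example, is to count and list the observations with a missing rep78. The observation numbers shown by list should be the highest ones in the dataset.

```stata
* Count the observations with a missing rep78
count if missing(rep78)

* List them; after sorting on rep78 they sit at the bottom of the dataset
list make rep78 if missing(rep78)
```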
