The Cut Function - Convert a Continuous Variable to an Ordinal Variable
There are times when you need to convert a continuous variable into an ordinal variable, for example to control the bins for a histogram. In Stata you can specify the range of each segment using the egen command with the cut function. For example, if you had a temperature variable you could group it into 10-degree segments (10-19, 20-29, etc.). Each number you give is the lower bound of a segment, and the last number is the upper limit of the final segment. You must always make sure that this last number is higher than the maximum value of your variable, and that the first number is no higher than its minimum. Otherwise the observations that fall outside that range are set to missing in the new variable.
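To see what happens when the last number is too low, here is a small sketch. It assumes the auto dataset that ships with Stata, whose mpg variable has a maximum of 41; the variable names mpgshort and mpgfull are made up for illustration.

```stata
sysuse auto, clear

* The last breakpoint (40) is below the maximum mpg of 41
egen mpgshort = cut(mpg), at(10,20,30,40)

* Count the observations that were left without a bin
count if missing(mpgshort)

* Raising the last breakpoint above the maximum covers every car
egen mpgfull = cut(mpg), at(10,20,30,40,50)
count if missing(mpgfull)
```

The first count should be greater than zero and the second should be zero, because any value at or above the last breakpoint is set to missing.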
Here we give several examples of how you can use the egen command with the cut function to generate ordinal variables from continuous ones.
How to Use:
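As a rough sketch of the general pattern, the new variable name goes on the left, the continuous variable goes inside cut(), and the breakpoints go in at() in ascending order. The temperature values below are invented purely for illustration, and tempbin is a hypothetical variable name.

```stata
* General form: egen newvar = cut(varname), at(#,#,...,#)

clear
* Toy data: five made-up temperature readings
input temp
12
19
20
27
35
end

* Group the readings into 10-degree segments: 10-19, 20-29, 30-39
egen tempbin = cut(temp), at(10,20,30,40)
list temp tempbin
```

Each observation in tempbin takes the lower bound of the segment it falls into, so 12 and 19 both become 10.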
Worked Example 1:
For this example I am going to use the Stata auto example dataset. I want to convert the mpg variable from continuous to ordinal, with each section holding a 10-unit range. The minimum mpg in this dataset is 12, and the maximum is 41. In the command pane I type the following:
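A minimal version of these commands might look like the following. It assumes the auto dataset is loaded with sysuse and that the breakpoints run from 10 to 50 in steps of 10, which matches the four ranges described in the next example.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Bin mpg into 10-unit ranges; 50 is safely above the maximum of 41
egen mpgbin = cut(mpg), at(10,20,30,40,50)

* Show how many cars fall into each bin
tabulate mpgbin
```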
The tabulate command produces the following table in the results pane, which demonstrates what the cut function has done:
Worked Example 2:
In the previous example I split mpg into four ranges (10-19, 20-29, 30-39, 40-49), which are saved in the new variable mpgbin. However, looking at the table generated by tabulate, I discovered that there are many more cars with an mpg in the 10-29 range than in the 30-49 range. Now I want to split the lower mpg values further into groups of 5, while keeping the higher mpg values in groups of 10. To do this I have to be a bit more specific about the ranges I use in my egen command. In the command pane I type the following:
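One way to write this, assuming the auto dataset is still the one in use, is to list every breakpoint explicitly. The variable name mpgbin2 is an assumption chosen here so that it does not clash with mpgbin from the first example.

```stata
sysuse auto, clear

* Breakpoints every 5 units from 10 to 30, then every 10 units up to 50
egen mpgbin2 = cut(mpg), at(10,15,20,25,30,40,50)

* Show how many cars fall into each of the uneven bins
tabulate mpgbin2
```

The breakpoints do not have to be evenly spaced; they only have to be in ascending order.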
The tabulate command produces the following table in the results pane:
As you can see, the groups now have a more even spread. In my command here, I specified intervals of 5 from 10 to 30, and then after 30 I put a bin up to 40 and another up to 50. This meant that I got the same 30-39 and 40-49 bins as I did in the first example, while breaking the values under 30 down into smaller bins.
The cut function is a quick and easy way to convert your continuous variable to ordinal while also allowing you to control how the ordinal variable is set up.