Graph Overlay - Contour Graph with Square Levels and Scatter Graph in Stata
A basic plot such as a scatter or line plot can be overlaid on a contour plot in Stata to add depth to the data. The contour plot can be used to mark out square regions, each representing a different level. An example of this is shown below:
To generate the coloured contour background for the scatter plot, you have to create a cross of the x-axis and y-axis values, in which every number on the x-axis is paired with every number on the y-axis. Once this list of pairs is created, you can label each pair and use the labels to generate the square contour background. Some simple math will help us determine which numbers to use.
Please note: this works best with small number ranges. For example, if you have a variable that ranges from 1000 to 4000, the number of observations that Stata has to iterate over to create the contour graph will be very large and will take a long time to generate. A different approach may be needed for these types of variables.
How to Use:
Step 1: Load the data you will generate the scatter with.
Step 2: Generate a simple scatter to get a quick understanding of the data and use it to perform a few simple calculations.
Step 3: Calculate three numbers using the following equations:
Equation 1: subtract the bottom number of the y-axis from the top number of the y-axis and add 1
Equation 2: subtract the left-most number of the x-axis from the right-most number of the x-axis and add 1
Equation 3: multiply the results of equations 1 and 2
Step 4: If the total number of observations in your dataset is less than the result from equation 3 (step 3), increase your observations to equal equation 3 with Stata's set obs command.
Step 5: Generate two sequences of numbers in a way that all the numbers intersect with all the other numbers in each sequence.
Step 6: Create a new variable that will label each “group”. In the graph above I have four groups, so my new label variable started out by labelling everything as the number four.
*Note: For this type of contour to be meaningful, each coloured square should represent a level. In the above graph I have specified these as “very small” in the first and lightest shade of red, all the way up to “large” in the darkest shade of red. While these are labelled, the raw data is numeric, so large=4, medium=3, etc.*
Step 7: Replace the variable “label” with the next group number down from the top, if both the x and y variables fit within that group.
For example, in the graph above the next group down is “medium” which is the number 3. The top cutoff for this group is y=220 and x=20, so we replace label with 3 if the y-variable is less than or equal to 220 as well as the x-variable being less than or equal to 20.
Step 8: Perform Step 7 until all groups have been specified.
Step 9: Generate the contour graph with the scatter graph.
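Steps 3 to 5 can be sketched as a short do-file. This is a minimal sketch, not the original commands: the axis limits are hypothetical placeholders, and the variable names “x” and “y” are assumptions chosen to match the wording of Step 7. It relies on local macros, so it should be run as one block from a do-file and not typed line by line.

```stata
* Hypothetical axis limits read off the quick scatter; replace with your own
local ymin 0
local ymax 50
local xmin 0
local xmax 10

* Step 3: the three calculations
* Equation 1: number of integer values on the y-axis
local ny = `ymax' - `ymin' + 1
* Equation 2: number of integer values on the x-axis
local nx = `xmax' - `xmin' + 1
* Equation 3: number of x-y pairs in the full cross
local ntotal = `ny' * `nx'

* Step 4: add observations only if the dataset is too short
if _N < `ntotal' {
    set obs `ntotal'
}

* Step 5: y cycles through its whole range over and over, while
* x holds each value for one full cycle of y
egen y = seq(), from(`ymin') to(`ymax')
egen x = seq(), from(`xmin') to(`xmax') block(`ny')
```

Because y completes one full cycle for every single value of x, each x value ends up paired with each y value exactly once across the first ntotal observations.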
Using the auto dataset that comes with Stata, I want to classify car size based on the length of the car and its trunk size. I am going to do this graphically, using the method outlined above. First, I generate the scatter of length against trunk by typing the following into the command pane:
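A minimal sketch of that first scatter, assuming the shipped auto dataset is loaded with sysuse and that length goes on the y-axis and trunk on the x-axis (which matches the axis ranges used in the calculations that follow):

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Quick scatter: length on the y-axis, trunk on the x-axis
twoway scatter length trunk
```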
From this I get the following graph:
I then calculate the three equations as follows:
Equation 1: 240-140+1=101
Equation 2: 25-5+1=21
Equation 3: 101*21=2121
From here, I set the number of observations as the result of equation 3, and then generate two new variables. In the command pane, I type the following:
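One way to do this is sketched below. The variable names “y” and “x” are my assumption, and egen's seq() function is just one of several ways to build the two sequences, so treat this as an illustration, not the exact commands originally used.

```stata
* Increase the dataset from 74 to 2121 observations (result of equation 3)
set obs 2121

* y: 140, 141, ..., 240, then starting again from 140
egen y = seq(), from(140) to(240)

* x: each of 5, 6, ..., 25 held for 101 consecutive rows (result of equation 1)
egen x = seq(), from(5) to(25) block(101)
```

With 101 values of y and 21 values of x, the 2121 rows contain every length and trunk combination on the graph exactly once.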
Now I create my final variable, called “label”. This assigns an integer label to each group of cross-sections. For this analysis, I have decided on four groups, numbered 1, 2, 3 and 4. The groups are nested squares that share the bottom-left corner of the graph. Group 1 includes any length of 180 or less that intersects with any trunk of 10 or less. Group 2 extends this to any length of 200 or less that intersects with any trunk of 15 or less. Group 3 extends it again to any length of 220 or less that intersects with any trunk of 20 or less. Group 4 is everything left over, that is, any length of 221 and over or any trunk of 21 and over. I create the “label” variable to define these groups, and in the command pane I type the following:
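A sketch of these commands, assuming the two grid variables are named “y” (length values) and “x” (trunk values). Everything starts as group 4 and is then overwritten from the largest square down to the smallest, so the order of the replace commands matters.

```stata
* Start with every grid point in the top group
generate label = 4

* Work down through the groups; each square overwrites part of the previous one
replace label = 3 if y <= 220 & x <= 20
replace label = 2 if y <= 200 & x <= 15
replace label = 1 if y <= 180 & x <= 10
```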
I have now finished creating the variables needed for the contour graph. All that is left to do is overlay the contour with the scatter graph. To do this, in the command pane I type the following:
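The overlay could look something like the sketch below. The grid variable names “y” and “x” and all of the styling options are my assumptions and are not taken from the original graph. The ccuts() values place a cut between each pair of adjacent group numbers, crule(intensity) with ecolor(red) shades the levels from light to dark red, and interp(none) plots the grid as it is, without interpolation. The /// line continuation only works in a do-file; in the command pane the command has to be typed on one line.

```stata
* Contour of the group labels over the grid, with the scatter drawn on top
twoway (contour label y x, ccuts(1.5 2.5 3.5) crule(intensity) ecolor(red) interp(none)) ///
       (scatter length trunk, mcolor(black)), ///
       ytitle("Length (in.)") xtitle("Trunk space (cu. ft.)")
```

The extra observations added for the grid have missing values for length and trunk, so the scatter plots only the original 74 cars.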
This generates the following graph:
Finally, if you want to separate the cars in the scatter based on a categorical variable you can do that too. For example, I want to show the difference between domestic and foreign cars on the graph above. To do that I generate a new graph by typing the following into the command pane:
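One possible version is sketched here, using the foreign variable from the auto dataset (0 for domestic, 1 for foreign) and one scatter per category. The grid variable names, marker choices and colours are assumptions for illustration, and the legend is left at its defaults; it can be relabelled with the legend() option. As above, the /// continuation requires a do-file.

```stata
* Same contour background, with one scatter per category of foreign
twoway (contour label y x, ccuts(1.5 2.5 3.5) crule(intensity) ecolor(red) interp(none)) ///
       (scatter length trunk if foreign == 0, mcolor(black) msymbol(O)) ///
       (scatter length trunk if foreign == 1, mcolor(blue) msymbol(T))
```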
This generates the following graph, which is also shown at the start: