Producing an Infographic in Stata – Part 3

An infographic uses pictures, sometimes with a few words, to convey a piece of information. You can make infographics in Stata by using symbol fonts. These fonts assign an image to each character, so typing an “A” in such a font might display a picture of a car, for example. Many of these fonts are available for free from font websites. You can also create your own if you know how.


Here I am going to create a series of infographics using the Stata example dataset nlsw88.dta. This dataset contains an excerpt from the 1988 National Longitudinal Survey of Young Women and Mature Women (NLSW). I will use the font Silhous from fontspace.com. You can download this font to follow along here: https://www.fontspace.com/silhous-font-f3542. Make sure to open and install the font before continuing.


How to Use:

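To apply a font to text in a Stata graph you use the {fontface} text tag. Because the font name is itself enclosed in double quotes, the whole tag must be wrapped in compound double quotes (`" and "'). The general form is as follows: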

`"{fontface "font name":chosen_character(s)}"'

To show these characters on a graph you can either add the tag to a text box using the Graph Editor, or create a value label in the above format so that labelled values are drawn in the chosen font. In this example we are going to use value labels.
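Before building value labels, it can help to confirm that the font is picked up at all. The following is a minimal sketch that places one Silhous character in a graph title. It assumes the Silhous font is installed, and the title text is made up for illustration.

```stata
* Minimal sketch: draw one Silhous glyph in a graph title.
* Assumes the Silhous font is installed; "T" is one of the characters used later in this post.
sysuse nlsw88.dta, clear
scatter wage age, title(`"{fontface "Silhous":T} Wage by age"')
```

If the title shows a plain letter T rather than a picture, the font is probably not installed or its name does not match.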


Worked Example:

In this example I am going to create the following graphic:

This graphic compares the average wage and the average age of women who are single (never married), married, and divorced. The "divorced" group is defined below as everyone who is neither currently married nor never married. To create this graphic we first need to calculate the average wage and the average age for each of these groups. In the Command window:

sysuse nlsw88.dta, clear
egen swage = mean(wage) if never_married == 1
egen mwage = mean(wage) if married == 1
egen dwage = mean(wage) if never_married == 0 & married == 0
label variable swage "Wage ($ per hour)"
label variable mwage "Wage ($ per hour)"
label variable dwage "Wage ($ per hour)"
egen sage = mean(age) if never_married == 1
egen mage = mean(age) if married == 1
egen dage = mean(age) if never_married == 0 & married == 0
label variable sage "Age (Average)"
label variable mage "Age (Average)"
label variable dage "Age (Average)"
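A quick sanity check on these variables, as a sketch that assumes the commands above have already been run in the same session: each new variable should hold a single value (its group mean) and be missing for everyone outside the group.

```stata
* Assumes swage, mwage, dwage, sage, mage and dage were created as above.
* Each variable is constant within its group, so Min and Max should be identical.
summarize swage mwage dwage sage mage dage

* Cross-check one group against the raw data.
summarize wage age if never_married == 1
```

The Mean reported for wage and age in the second command should match swage and sage from the first.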

Now that we have our average wages and ages, we need to create a variable that codes each observation as married, single, or divorced, and attach a value label that displays each code in our chosen font. In the Command window:

generate marsindiv = 1
replace marsindiv = 0 if never_married == 0 & married == 0
replace marsindiv = 2 if never_married == 1
label define marsinlbl 0 `"{fontface "Silhous":-.}"' 1 `"{fontface "Silhous":T}"' 2 `"{fontface "Silhous":E}"'
label values marsindiv marsinlbl
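To confirm the coding and the label definitions, the following sketch can be run next. It assumes marsindiv and marsinlbl were created with the commands above.

```stata
* Assumes marsindiv and marsinlbl exist from the commands above.
* Show the text stored for each labelled value.
label list marsinlbl

* Count observations per code (0 = divorced, 1 = married, 2 = single), ignoring the labels.
tabulate marsindiv, nolabel
```

The label text appears as raw {fontface ...} tags here. The tags are only rendered as pictures when the labels are drawn on a graph.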

With our labels all set up we can now create our graph. I use the following graph command:

twoway (scatter swage sage, mlabsize(*20) mlabcolor(red) mlabel(marsindiv) msymbol(i) legend(off) ylabel(7(.4)9, nogrid) xlabel(38(.5)40)) (scatter mwage mage, mlabsize(*5) mlabcolor(purple) mlabel(marsindiv) msymbol(i)) (scatter dwage dage, mlabsize(*10) mlabcolor(blue) mlabel(marsindiv) msymbol(i))

In this command, I am superimposing three separate scatter plots, one for each group. The three separate graphs are as follows:


Graph One

scatter swage sage, mlabsize(*20) mlabcolor(red) mlabel(marsindiv) msymbol(i) legend(off) ylabel(7(.4)9, nogrid) xlabel(38(.5)40)

Graph Two

scatter mwage mage, mlabsize(*5) mlabcolor(purple) mlabel(marsindiv) msymbol(i)

Graph Three

scatter dwage dage, mlabsize(*10) mlabcolor(blue) mlabel(marsindiv) msymbol(i)

The mlabsize() option makes the symbols different sizes to indicate where on the wage scale each category sits. The *# form of mlabsize() sets the size relative to the default, so *2 is twice the default size and *.5 is half of it. The mlabcolor() option specifies the colour of the label.


The mlabel() option is how we get our font characters to appear as labels on the graph. The value label marsinlbl holds the font tag for each category, and we attached it to the variable marsindiv. When we specify mlabel(marsindiv), Stata draws each point's value label, rather than the underlying number, as the marker label.


The msymbol(i) option specifies which marker symbol we want. Here we specify i because we want our markers to be invisible – we only want to see the labels. The legend(off) option hides the legend. The ylabel() and xlabel() options set the labelled values on the Y-axis and X-axis respectively, and the nogrid suboption removes the grid lines that would otherwise be drawn at the Y-axis labels.
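As a possible variation (not the approach used above), the data can first be reduced to one row per group with collapse, so that each picture is drawn once rather than once per observation. This sketch reloads the dataset, so any unsaved changes in memory are lost. It uses /// line continuation, so it needs to be run from a do-file rather than typed into the Command window. The mlabposition(0) option is an addition here: it centres each label on its point instead of placing it to the right.

```stata
* Sketch of a variation: plot three group means, one row per group.
sysuse nlsw88.dta, clear

* Same coding as above: 0 = divorced, 1 = married, 2 = single.
generate marsindiv = 1
replace marsindiv = 0 if never_married == 0 & married == 0
replace marsindiv = 2 if never_married == 1

* Reduce the data to the mean wage and mean age of each group.
collapse (mean) wage age, by(marsindiv)

label define marsinlbl 0 `"{fontface "Silhous":-.}"' 1 `"{fontface "Silhous":T}"' 2 `"{fontface "Silhous":E}"'
label values marsindiv marsinlbl

* mlabposition(0) centres each glyph on its (age, wage) coordinate.
twoway (scatter wage age if marsindiv == 2, mlabel(marsindiv) mlabsize(*20) mlabcolor(red) msymbol(i) mlabposition(0)) ///
       (scatter wage age if marsindiv == 1, mlabel(marsindiv) mlabsize(*5) mlabcolor(purple) msymbol(i) mlabposition(0)) ///
       (scatter wage age if marsindiv == 0, mlabel(marsindiv) mlabsize(*10) mlabcolor(blue) msymbol(i) mlabposition(0)), ///
       legend(off) ylabel(7(.4)9, nogrid) xlabel(38(.5)40) ///
       ytitle("Wage ($ per hour)") xtitle("Age (Average)")
```

Because the labels are centred here, the pictures sit slightly to the left of where they appear in the original graph.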

Check out our Graph Gallery for more examples of complex and interesting graphs made in Stata. You may also be interested in the book A Visual Guide to Stata Graphics, 3rd Edition, which is available for purchase through our website.
