Importing, Loading and Saving Data in Stata
Stata allows you to import data from other file types using the import command. The save command is used to save your data in the Stata .dta format, and the use command is used to load data stored as .dta files into Stata. The clear command will clear any currently loaded dataset without saving it. Theses commands can also all be accessed from the Stata menus.
The most commonly imported files are MS Excel (.xls/.xslx) and Comma Separated Values (.csv) files, however you can also import: fixed format or unformatted text data (.txt); database files (.dbf); Database ODBC from the source; SAS XPORT data from version 5 or 8(.xpt/.v8xpt); datasets directly from the Federal Reserve Economic Database (from Stata v15 and higher); datasets directly from SPSS (.sav, from Stata v16 and higher); datasets directly from SAS (.sas7bdat, from Stata v16 and higher). When importing data I recommend first saving a separate copy of your dataset untouched, as some analysis in Stata can permanently change the dataset. If you don’t keep a separate copy you may lose information.
How to Use:
To import an MS excel file:
To import a comma separated values file:
To save a file in Stata .dta format:
To load a previously saved Stata .dta dataset:
To clear a dataset from memory without saving it:
In the examples above the sheet(sheetname) option for excel files is not needed if there is only one sheet in your excel file. The firstrow option tells Stata that the first row of data in your excel spreadsheet are all variable (column) names.
In this example I have two datasets that I want to import into Stata and save in .dta format. I have downloaded both these files from the data.gov.au website so they are located in my downloads folder. I will include the links to these datasets at the end of this example should you wish to have a look at them.
One file is an excel file showing the daily AUD exchange rate from January 1989 to June 2013, and the other is a .csv file containing the Congestion Levy Rates for 2015. I am going to import and save both datasets. In the command pane I type the following:
This produces the following in the results pane in Stata:
A few things to note here. You can save and load files without having to specify the whole path if you are saving or loading within your current working directory. See our tech tip on working directories to learn how to navigate files and folders using Stata. I have put quotation marks around all of my file names and paths, however this is really only necessary if there is a file name or folder name that contains spaces.
When importing the excel file I have added two options, cellrange() and case(). The cellrange option was added because when I opened the excel file I noted the actual data did not start until row 12, and the variable names were on row 11. So I added the cellrange() option to start importing at A11, and then the firstrow option to import that first row at A11 as variable names. I could also have removed the firstrow option and just put cellrange(A12).
Finally, at the end when I re-loaded the previously saved exchange rate data I did not first clear the data from memory. This is because when you are loading a dataset over the top of one already in memory you can add clear as an option to the use command. This will clear the data in memory and load your specified file all at once.
For more information about these commands simply type help command into the command pane in Stata, replacing command with the appropriate command name.