Last update: April 6, 2004 Stata: Programming Class Notes
Stata: Programming
Daver C. Kahvecioglu
ITS High Performance Computing Group
UNC Chapel Hill
research@unc.edu
Do-files
Rather than typing commands at the keyboard, you can create a disk file containing commands and
instruct Stata to execute the commands stored in that file. Such files are called do-files, since the
command that causes them to be executed is do.
A do-file is a standard ASCII text file.
A do-file is executed by Stata when you type do filename.
You can use any text editor to create do-files, or you can use the built-in do-file editor by typing
doedit or by clicking the do-file editor icon on the toolbar.
By default, if any line in the do-file contains an error, Stata stops immediately and does not attempt
to execute the rest of the commands. If you want Stata to keep going even though something may
be wrong with the do-file, add the nostop option:
do file, nostop
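As a minimal sketch of the idea, the lines below could be saved in a do-file and run with do. The file name myfirst.do and the choice of commands are assumptions made for illustration; the auto data set is the example data set shipped with Stata.

```stata
* myfirst.do (hypothetical file name); run it by typing:  do myfirst
sysuse auto, clear        // load the example data set shipped with Stata
summarize mpg weight      // summary statistics for two variables
regress mpg weight        // a simple regression of mpg on weight
```

Typing do myfirst, nostop instead would keep Stata going past any line that fails.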
Executing Stata in Background (batch) Mode in Windows
Open a DOS window and type
c:\stata\wsestata /b do myjob
assuming that Stata-SE is installed in the folder c:\stata and you have a do-file named myjob in the
current folder. When the do-file completes, the Stata icon in the taskbar will flash. You can then
click on it to close Stata. If you want to stop the do-file before it completes, click on the Stata icon
in the taskbar, and Stata will ask you if you want to cancel the job.
/b will make Stata open an ASCII text log named myjob.log
If you do not know where Stata is installed, you can right click on the Stata icon on the desktop and
click on Properties. You can also click on Start, then Programs, then right click on Stata to see the
location. For example, in ATN labs we should type
J:\.isis.unc.edu\pc-pkg\stata-80\program\wsestata /b do myjob
Macros
Macros are the variables of Stata programs. A macro is a string of characters, called the
macroname, that stands for another string of characters, called the macro contents.
There are two types of macros: Local and global. Local macros are private to the program in which
they are defined. They cannot be accessed from outside that program. Global macros, on the other
hand, are public, and are accessible in any program.
sysuse auto
local set1 " weight foreign "
global set1 " weight foreign "
Enclosing a local macro name between a left single quote (`) and a right single quote (') exposes
what it contains. The content of a global macro is revealed when we prefix $ to the macroname.
regress mpg `set1'
regress mpg $set1
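The difference in scope between local and global macros can be seen in a short sketch. The program name showset1 is hypothetical and exists only for this illustration; the two macros are defined as above.

```stata
local  set1 "weight foreign"
global set1 "weight foreign"

capture program drop showset1       // hypothetical demo program
program showset1
    display "global inside the program: $set1"
    display "local inside the program:  `set1'"
end

showset1
```

Inside showset1 the global macro still expands to weight foreign, while the local macro set1 that was defined outside the program is not visible there and expands to nothing.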
Try the following:
display `set1'
display $set1
display "`set1'"
display "$set1"
display `set1' resolves to display weight foreign. display displays strings and values of scalar
expressions.
display "What to display?"
Let's define a scalar pi:
scalar pi = 3.14
di pi
_pi is a system variable that has the value of pi in it.
di _pi
display weight foreign displays the most reasonable scalars that Stata can get out of
weight foreign, that is, weight[_n] and foreign[_n], which are weight[1] and
foreign[1] because the current observation is observation 1.
If we enclose the content of a macro in double quotes (for example, "`set1'" or "$set1"),
then the content of the macro is treated as nothing but a string.
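The same distinction between an expression and a string can be sketched with a macro that holds arithmetic. The macro name expr is an assumption made for this illustration.

```stata
local expr "2+2"
display `expr'        // expands to: display 2+2, an expression, so 4 is shown
display "`expr'"      // expands to: display "2+2", a string, so 2+2 is shown
```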
Argument passing with do-files
Let's say you want to write a do-file that gets a data set name (technically, we are passing an
argument, datasetname, to the do-file here) from the user and produces the means of all the
variables in that data set. So, the user will type something like
do dofilename datasetname
and the result will be the summary statistics for the variables in the datasetname.
Our do-file should be something like this:
use `1'
summarize
Or,
args datasetname
use `datasetname'
summarize
1 and datasetname are the names of local macros.
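A do-file can take more than one argument in the same way. The following is only a sketch: the file name summ2.do, the macro names dataset and var, and the example names mydata and income are all hypothetical.

```stata
* summ2.do (hypothetical); run it by typing:  do summ2 mydata income
args dataset var          // first argument goes to local macro dataset, second to var
use `dataset', clear      // clear discards any data already in memory
summarize `var'
```

With this do-file, do summ2 mydata income would load mydata.dta from the current directory and summarize only the variable income.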
Programs
When you type something, Stata first checks if it is a built-in command. If it is, Stata executes what
you typed. Otherwise, Stata checks if it is a defined program. If it is, the program is executed.
Otherwise, Stata looks in certain directories (the names of these directories can be seen by typing
sysdir) for a file that has the name you typed and the extension .ado. If the search is not
successful, we get the "unrecognized command" error. In this section we will briefly discuss
programs, which are the second kind of object Stata checks what we typed against. In the next
section we will briefly discuss ado-files, which are the last.
Here is a sample program you define interactively:
program hello
Stata now prompts with line numbers, and you type the lines of your program named hello:
1. display "Hello World"
2. end
Typing end on line 2 ends the program definition.
Now program hello is loaded into memory and it can be executed by typing
hello
If you type your program in an editor and save it as a do-file, then you can load it by "do"ing your
do-file. If you want to run your program by "do"ing it, add hello as the last line of your do-file.
A program is not much different than a do-file.
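Because a program stays in memory once it is defined, defining it a second time (for example, by running the same do-file again) produces an "already defined" error. A common way around this, sketched below, is to drop the program first; capture suppresses the error when no such program exists yet. The name hello2 and its argument are assumptions made for this illustration.

```stata
capture program drop hello2     // avoid the "already defined" error on a second run
program hello2
    args name                   // first word typed after the command name
    display "Hello `name'"
end

hello2 World                    // displays: Hello World
```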
Ado-files
To run a program you have to load it first. If you save your program with the .ado extension and
put it in certain directories, you do not have to load it. Stata will treat your program as a Stata
command once it finds it in one of those designated directories.
Some of Stata's own commands are written as ado-files. The rest of Stata's commands are built-in
commands. You can tell whether a command is built-in by typing which command.
. which table
C:\PROGRAM FILES\STATA8-SE\ado\base\t\table.ado
*! version 5.3.0 09oct2001
. which tabulate
built-in command: tabulate
You can read the contents of table.ado by opening it in a text editor (they are ASCII text files),
or by typing
type "C:\PROGRAM FILES\STATA8-SE\ado\base\t\table.ado"
(the double quotes are needed because the path contains a space).
You can add your own commands by creating your own ado-files. An ado-file defines a Stata
command, even though there are some Stata commands which are not ado-files.
Stata looks for ado-files in seven places, which can be categorized in three ways:
I. the official ado-directories, meaning
1. (UPDATES), the official updates directory
2. (BASE), the official base directory
II. your personal ado directories, meaning
3. (SITE), the directory for ado-files your site might have installed,
4. (PLUS), the directory for ado-files you personally might have installed,
5. (PERSONAL), the directory for ado-files you personally might have written, and
6. (OLDPLACE), the directory where Stata users used to save their personally written ado-
files; and
III. the current directory, meaning
7. (.), the ado-files you have written just this instant or for just this project.
. sysdir
STATA: /usr/local/stata8/
UPDATES: /usr/local/stata8/ado/updates/
BASE: /usr/local/stata8/ado/base/
SITE: /usr/local/ado/
PLUS: ~/ado/plus/
PERSONAL: ~/ado/personal/
OLDPLACE: ~/ado/
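The commands below are a small sketch of how to inspect these directories from within Stata. The output depends on the installation, so none is shown here.

```stata
sysdir                 // list the system directories
adopath                // list the directories in the order Stata searches them
which table            // report where the file for the table command was found
```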
Stata has a range command that is shipped with it. range generates a numerical range, which is
useful for evaluating and graphing functions. Here is the syntax for it:
range varname #first #last [#obs]
Let's create our own version of this range command: rangeours
program rangeours // arguments are n a b
drop _all
args n a b
set obs `n'
gen x = (_n-1)/(_N-1)*(`b'-`a') + `a'
end
Save this as the file rangeours.ado in the current directory. Note that the program begins with
drop _all, so it discards any data in memory.
Now, type
rangeours 100 1 2
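Because rangeours starts with drop _all, it discards whatever data are in memory. A variant that adds the new variable to the existing data instead might look like the sketch below. The name rangeours2, its argument order, and the use of confirm are assumptions made for this illustration; they are not part of the program above.

```stata
capture program drop rangeours2
program rangeours2                  // arguments are newvar a b
    args newvar a b
    confirm new variable `newvar'   // stop with an error if the name is already taken
    gen `newvar' = (_n-1)/(_N-1)*(`b'-`a') + `a'
end

clear
set obs 100
rangeours2 x 1 2
summarize x                         // 100 equally spaced values from 1 to 2
```

Here the number of points is taken from the data already in memory (set obs 100) rather than from an argument.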
Accessing results calculated by commands/programs
You can access the results of Stata commands after they are executed. In terms of the way their
results are accessed, there are 4 types of commands in Stata:
r-class commands (most commands)
e-class commands (estimation commands)
s-class and n-class commands (which are rarely used)
After running an r-class command, say summarize, type return list to get the list of saved
results such as the mean, variance, maximum, and minimum of the variable. After running the
regress command, which is an e-class command, type ereturn list to get the list of saved
estimation results, such as the number of observations, degrees of freedom, R-squared, coefficient
estimates, etc. If you type creturn list, you get the list of system values, such as today's
date, current time, current directory, current system settings, etc. You can also save your own
programs' results. See Section 21.10 of the User's Guide for how to do that.
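As a brief sketch of how saved results can be used after they are listed, the lines below pick individual values out of r() and e(). The choice of variables from the auto data set and the macro name mpgmean are assumptions made for this illustration.

```stata
sysuse auto, clear
summarize mpg
return list                          // shows r(N), r(mean), r(sd), r(min), r(max), ...
display "mean of mpg = " r(mean)
local mpgmean = r(mean)              // keep a copy: the next r-class command overwrites r()

regress mpg weight foreign
ereturn list                         // shows e(N), e(r2), e(b), ...
display "R-squared = " e(r2)
display "N = " e(N)
```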
Some Examples
Example 1:
Let's write the following in the do-file editor.
version 8 // This tells Stata the version under which this do-file is written
set more off // Now Stata does not pause every time the screen is full
log using /* unless you specify an explicit path, myjob.log is saved in the current
directory */ myjob, replace text
use http://www.stata-press.com/data/r8/census /// the command continues on the next line
, clear
log close
Some string processing:
upper(A): changes string A to uppercase A
lower(A): changes string A to lowercase A
word(A,n): returns the nth word in string A
substr(A,m,n): returns the substring of A that starts at the mth character and is n characters long
index(A,B): returns the position in string A at which string B is first found (0 if B is not found)
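The functions listed above can be tried directly with display. The literal strings below are arbitrary choices made for this illustration.

```stata
display upper("stata")            // STATA
display lower("STATA")            // stata
display word("6 Apr 2004", 2)     // Apr
display substr("abcdef", 2, 3)    // bcd
display index("abcdef", "cd")     // 3
```

In later Stata releases index() is documented under the name strpos(), which takes the same arguments.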
Example 2:
local logfilename1 = ///
upper( word(c(current_date),1) + ///
word(c(current_date),2) + ///
word(c(current_date),3) )
log using `logfilename1', text replace
Example 3:
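Let's say we want to create a log file and name it after the data set in memory. c(filename)
gives us the name of the data set together with the whole path to it. We are interested only in the
name of the data set. The following do-file first trims the extension off the name, and then trims the
path all the way up to the name of the data set. It scans the name backwards for as long as the
upper- and lowercase versions of a character differ, making use of the fact that the upper and
lower cases of characters such as "." and "/" are the same. This works only when the data set name
consists of letters alone and the path contains no period other than the one before the extension.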
local logfilename2 = ///
substr( "`c(filename)'", 1 , index("`c(filename)'", ".") - 1)
di "logfilename2 = " "`logfilename2'"
local i 0
while ///
upper(substr(reverse("`logfilename2'"),`i'+1,1)) != ///
lower(substr(reverse("`logfilename2'"),`i'+1,1)) {
di "i = " `i'
local ++i // equivalently, local i = `i' + 1
}
di "final i = " `i'
local logfilename2 = substr("`logfilename2'",-`i',.)
log using `logfilename2', replace text
Example 4:
forvalues repeatedly sets a local macro (x in the example below) to each element of a range and
executes the commands enclosed in braces.
forvalues x = 1/10 {
if mod(`x',2) {
display "`x' is odd"
continue
}
display "`x' is even"
}
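Two further uses of forvalues are sketched here: accumulating a value in a local macro across iterations, and a range that counts down. The macro names total and k are assumptions made for this illustration.

```stata
local total = 0
forvalues x = 1/10 {
    local total = `total' + `x'      // add the current value to the running sum
}
display "sum of 1 to 10 = `total'"   // 55

forvalues k = 10(-2)2 {
    display `k'                      // 10 8 6 4 2, one per line
}
```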
foreach repeatedly sets a local macro (var and num in the examples below) to each element of a list
and executes the commands enclosed in braces. In Example 5 the list is a "newlist" (we are creating
new variables) and in Example 6 the list is a "numlist" (we are doing things for each number in the
number list).
Example 5:
foreach var of newlist z1-z20 {
gen `var' = uniform()
}
su
Example 6:
foreach num of numlist 1(1)4 6(2)13 {
if mod(`num',2) {
display "`num' is odd"
continue
}
display "`num' is even"
}
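Besides newlist and numlist, foreach also accepts a list of existing variables (varlist) or any list of words (in). The sketch below shows both forms; the variables come from the auto data set and the words are arbitrary, both chosen only for illustration.

```stata
sysuse auto, clear
foreach v of varlist mpg weight length {
    quietly summarize `v'
    display "`v': mean = " r(mean)   // r(mean) is left behind by summarize
}

foreach w in alpha beta gamma {
    display "`w'"                    // each word of the list in turn
}
```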
Example 7:
clear
set obs 100
*Generate 10 uniform random variables named x1, x2, ..., x10.
set seed 12345
forvalues i = 1(1)10 { // equivalently 1/10, or 1 2 to 10
generate x`i' = uniform()
qui cou if x`i' < .1
display " % of x`i' < 1/10 = " round(100*r(N)/_N,.01)
gen x`i'ltdec = x`i' < .1
}
sum x*dec
Repeat the exercise with 1,000 and with 1,000,000 observations.
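One possible way to run the comparison across sample sizes in a single pass is to wrap the steps in a foreach loop over the numbers of observations. This is only a sketch, simplified to a single variable x; resetting the seed for each sample size is an assumption made here, not something the exercise prescribes.

```stata
foreach n of numlist 100 1000 1000000 {
    clear
    set obs `n'
    set seed 12345
    generate x = uniform()
    quietly count if x < .1
    display "obs = `n': % of x < 1/10 = " round(100*r(N)/_N, .01)
}
```

As the number of observations grows, the displayed percentage should settle closer to 10.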
Example 8:
We can do exactly the same thing as in Example 7 by using a while loop:
clear
set obs 100
set seed 12345
local i = 1
while `i' <11 {
generate x`i' = uniform()
qui cou if x`i' < .1
display " % of x`i' < 1/10 = " round(100*r(N)/_N,.01)
gen x`i'ltdec = x`i' < .1
local i = `i' + 1
}
sum x*dec
SAS data sets as inputs and outputs to Stata on SUNNY
savas
When you are on sunny, you can convert a SAS dataset (say, filename.sas7bdat) into a Stata data
set, by simply typing:
savas filename.sas7bdat
Then, sunny will create filename.dta for you.
You can also convert a Stata data set (say, filename.dta) into a SAS data set, by simply typing
savas filename.dta
Then, sunny will create filename.sas7bdat for you.
usesas & savasas
These are two user-written Stata programs installed on sunny. In other words, they are two user-
written commands that can be run within Stata running on sunny.
usesas allows you to read a SAS data set directly into Stata:
usesas using filename.sas7bdat
savasas allows you to save the current data set in Stata's memory as a SAS data set:
savasas using filename.sas7bdat
savas, usesas, and savasas are all written by Dan Blanchette of UNC's Carolina Population
Center. You can install these programs on your personal copies of Stata as well. For more
information please see the following Stata Resources section.
sas2stata
When you are on sunny, there is another simple way you can convert a SAS dataset (say,
filename.sas7bdat) into a Stata data set:
sas2stata filename.sas7bdat
Then, sunny will create filename.dta for you.
sas2stata is a Unix utility, written by the RAND Corporation, which converts SAS data sets into
Stata format. For more information on sas2stata on sunny, visit:
http://www.unc.edu/atn/hpc/applications/index.shtml?id=4208
Stata Resources
There are several excellent Stata resources available on the Internet. They include program
databases, discussion forums, and task-specific user web sites. Here I will list only a few of them:
• Stata Corporation (www.stata.com)
This is a good place to find a wealth of information about Stata, including
new features of Stata, a large frequently asked questions database, an extensive selection of
procedures, as well as links to other Stata-related sites. Some of the useful pages within this site
are:
FAQ page: http://www.stata.com/support/faqs/
A large and very useful database for frequently asked questions about statistics, data
management, graphics, programming, etc.
A list of resources, such as tutorials and FAQs, for learning Stata:
http://www.stata.com/links/resources1.html
• UCLA Academic Technology Services: Resources to help you learn and use Stata:
http://www.ats.ucla.edu/stat/stata
Hosted by UCLA, this site provides an extensive resource of Stata information including FAQs,
learning modules, quick reference guide, annotated output, textbook examples, and more.
• Statalist: http://www.stata.com/support/statalist/faq/
Statalist is an active group of users who exchange information via email about using Stata. This is
where you can ask questions about Stata and statistics, and get some help and guidance.
• Carolina Population Center's (CPC) Stata learning resources:
Stata Tutorial: http://www.cpc.unc.edu/services/computer/presentations/statatutorial/
A SAS User's Guide to Stata:
http://www.cpc.unc.edu/services/computer/presentations/sas_to_stata/
In the above page, you will also find links to CPC's programs that convert between SAS and
Stata data sets such as savastata, savasas, usesas, and savas.
• The Odum Institute at UNC-Chapel Hill offers free Stata classes at Manning Hall at the UNC-CH
campus. Visit http://www2.irss.unc.edu/irss/shortcourses/shortcourse.asp