Stataquest Tutorial Programming Guide: Henrik Schmiediche June 1997
Henrik Schmiediche
June 1997
3 The Menu
4 Help Files
5 Graphics
5.1 Introduction and Overview
5.2 Initializing Graphics
5.3 Lines and Line Styles
5.4 Text and Fonts
5.5 Multiple Lines and Text Strings
5.6 Points
5.7 Boxes
5.8 Arcs
5.9 Snooze
5.10 Interfacing with StataQuest Graphics
In general, it is not necessary to read the manuscript cover to cover before starting
on a StataQuest programming project. The best way to learn is simply to start
programming. The beginning of each section provides sufficient information to
familiarize the user with the basic concepts and gives the complete set of features
available. When specific information is needed, the programmer can look up the
relevant section. A complete StataQuest program—consisting of a dialog box, a
help screen and a graphics program—is presented at the end of this manuscript.
The example programs and updates to this tutorial will be available on the web at
http://stat.tamu.edu/StataQuest.
Additional Resources
To do any kind of meaningful programming in StataQuest, the programmer needs
to know Stata 4—the programming language on which StataQuest is based. Users
should consult the Stata manuals for more information on programming in Stata 4.
Information about Stata 4 and Stata in general can be found at the corporate web site:
http://www.stata.com. The NetCourses Stata offers are an excellent introduction to
programming the Stata language.
Other material that a reader might wish to consult includes the StataQuest 4 software
documentation written by J. Theodore Anagnoson and Richard E. DeLeon, Statistics
with Stata 5 by Lawrence C. Hamilton, and StatConcepts: A Visual Tour of Statistical
Ideas by H. Joseph Newton and Jane L. Harvill. All three books are
published by Duxbury Press. In particular, the StatConcepts book is a great example
of what is possible with StataQuest. The book and software consist of a series
of 28 StataQuest laboratories covering most aspects of introductory statistics. Any
user who wants to program in StataQuest is strongly encouraged to examine
the laboratories presented in StatConcepts: A Visual Tour of Statistical Ideas. More
information on StatConcepts and the other Duxbury books listed above can be found
at Duxbury’s web site at http://www.thomson.com/duxbury.html.
Acknowledgments
I want to thank James Hardin of Stata Corporation for patiently answering all my
StataQuest questions and for his numerous corrections and suggestions, Alan Riley of
Stata Corporation for his detailed proofreading of the manuscript, and Joseph Newton,
who shared his experiences in writing StatConcepts: A Visual Tour of Statistical Ideas
and provided invaluable assistance to me in learning and understanding StataQuest.
Thanks to Stan Loll, Curt Hinrichs and Cynthia Mazow at Duxbury Press for making
StataQuest and this tutorial possible. Finally, I would like to thank my wife Cindee,
a constant source of encouragement.
1 StataQuest Programming
Even though we do not intend to teach Stata programming in this tutorial (see
the preface for a list of good resources on learning Stata), we will review three aspects
of Stata programming that make StataQuest programming different from programming
in many conventional languages. These are: data matrices, scalar values and
looping.
clear
local x = 1
local x = x + 1
[x not found]
does not produce the expected result. What is happening here is that first we assign
the value one to our macro x, and then we want to increment the value contained
within the macro x by one. Yet, what we are actually doing with the statement “local
x=x+1” is incrementing the variable x by one. The variable x is not defined—it would
be a column in our data matrix, if we had one. So we receive an error message. To
actually increment our macro x by one, we must refer to the contents of the macro.
The correct code to increment the macro x by one is:
clear
local x = 1
local x = `x' + 1
display `x'
The “`” in front of the x is the accent key in the upper left corner of most PC
keyboards. The “'” after the x is the single quote character.
Internally, Stata expands the macro in the statement “local x=`x'+1” so that it
reads “local x=1+1” and assigns the result 2 to the macro x. The most common
mistake beginners make is not referring to the contents of the macros in arithmetic
expressions. For global macros, we use the dollar sign to refer to the contents of the
macro:
clear
global x = 1
global x = $x + 1
display $x
Note that it is possible to have a local and global macro with the same name. Each
contains its own value, and the values do not affect one another.
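The expansion step described above can be mimicked outside Stata. The following Python sketch is an illustration only, not StataQuest code: the function name expand_macros and the two dictionaries are hypothetical, and real Stata macro expansion covers more cases than this simple text substitution. It shows why a bare x is left untouched (and is later read as a variable), and that a local and a global macro named x keep separate values.

```python
import re

def expand_macros(command, local_macros, global_macros):
    """Replace `name' (local) and $name (global) references by their contents."""
    # Local macro reference: backquote, macro name, single quote.
    command = re.sub(r"`(\w+)'", lambda m: local_macros[m.group(1)], command)
    # Global macro reference: dollar sign followed by the macro name.
    command = re.sub(r"\$(\w+)", lambda m: global_macros[m.group(1)], command)
    return command

# A local and a global macro may share a name; each keeps its own value.
local_macros = {"x": "1"}
global_macros = {"x": "10"}

print(expand_macros("local x = `x' + 1", local_macros, global_macros))
print(expand_macros("global x = $x + 1", local_macros, global_macros))
# No quotes and no dollar sign: nothing is expanded, so x would be
# interpreted as a variable name rather than as the macro.
print(expand_macros("local x = x + 1", local_macros, global_macros))
```

Only after this substitution is the arithmetic expression evaluated, which is why the quoted form yields 2 while the unquoted form fails with "x not found".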
1.3 Looping
Looping is only possible in StataQuest by using a while expression-is-true do {} loop.
The while command is only legal in programs—it cannot be entered interactively.
The syntax is:
while {expression-is-true} {
{commands}
}
The loop exits when the expression becomes false (zero). Stata has no built–in
for or until looping commands, though these types of loops can be simulated. A for
i=1 to 100 do {} loop would look like this:
local i = 1
while `i' <= 100 {
{commands}
local i = `i' + 1
}
A do {} until expression-is-true loop would look like this:
local i = 1
while `i' {
{commands}
if {expression-is-true} {
local i = 0
}
}
Any type of looping can be constructed using the while command. There is no goto
statement in StataQuest.
2 The Dialog Box
In most StataQuest programs, the dialog box is the central interface to the user. The
dialog box displays instructions, receives user input, provides help, etc. This section
presents the StataQuest commands to create dialog boxes. We encourage the reader
to enter the StataQuest commands on the command line as the tutorial presents
them.
In this case, the wdlg command creates a dialog box titled Empty StataQuest Dialog
Box of size 200 (width) by 100 (height) units whose upper left–hand corner is located
10 units down and to the right of the upper left–hand corner of the main StataQuest
window (see Figure 1).
Before we continue, we need to define the dialog box coordinate grid. A unit
coordinate is based on the size of the standard StataQuest dialog box font. The font
is defined to be 8 units high and approximately 4 units wide. So, a StataQuest dialog
box of size 200 by 100 would hold approximately 50 (i.e. 200/4) characters per line
and 12.5 (i.e. 100/8) lines of text. The width of a character is only approximately 4
units since the standard font is proportionally spaced. That is, narrow characters like
“l” take up less space than wide characters like “M”. The height of the font is exactly
8 units, and that never varies. The current version of StataQuest does not allow the
default font to be changed. It is important to note that all coordinates in StataQuest
are absolute and not relative to the size of the StataQuest window. Resizing the
StataQuest window or a dialog box increases the visible area of the coordinate grid,
but the unit size of the coordinate grid is not altered.
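The unit arithmetic above is easy to check with a few lines of Python. This is a sketch for planning layouts, not StataQuest code; the names text_capacity, CHAR_WIDTH_UNITS and CHAR_HEIGHT_UNITS are hypothetical, and the character count is only an estimate because the font is proportionally spaced.

```python
CHAR_WIDTH_UNITS = 4   # approximate: the dialog font is proportionally spaced
CHAR_HEIGHT_UNITS = 8  # exact: the font height never varies

def text_capacity(xsize, ysize):
    """Estimate how much text fits in an area of xsize by ysize dialog units."""
    chars_per_line = xsize / CHAR_WIDTH_UNITS
    lines = ysize / CHAR_HEIGHT_UNITS
    return chars_per_line, lines

# The 200 by 100 dialog box from the example.
print(text_capacity(200, 100))  # (50.0, 12.5)
```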
All StataQuest dialog box commands describe their location and size coordinates
using four numbers: xoffset yoffset xsize ysize . In our example above xoffset = 10,
yoffset = 10, xsize = 200 and ysize = 100. The xoffset and yoffset coordinate pair specify
the offset from the upper left–hand corner of the StataQuest or dialog box window.
The wdlg command that creates the dialog box itself defines (0, 0) as the upper left–
hand corner of the StataQuest window. All the other dialog box commands described
below define (0, 0) to be the upper left–hand corner of the dialog box itself. The xsize
ysize coordinate pair expresses the size of the window or other dialog box object in
the x (horizontal) direction and y (vertical) direction respectively.
Like all StataQuest windows, once the dialog box is displayed, users can move
the dialog box with the mouse to any place they desire within the main StataQuest
window. Currently StataQuest cannot have more than one dialog box open at a time.
To close the dialog box, click on the Windows “close window” button (located in the
upper right–hand corner of the dialog box window frame in Windows 95, or in the
upper left–hand corner of the dialog box window frame in Windows 3.1).
The second line wdctl static mesg... places the text contained within the global
macro mesg into the dialog box. The coordinates 10 10 180 80 tell StataQuest that
the text is to be placed in an invisible box of size 180 by 80 with the upper left–hand
corner of that box offset 10 units from the upper left–hand corner of the dialog box.
StataQuest will format the text to fit within this invisible static text box. So, for
example:
Note that it may be simpler to type the global mesg macro as one long line instead
of splitting it up into five lines. StataQuest formats the text in the macro mesg by
breaking the text at white space characters—it will not split a word in the middle.
The text is filled into the static text box from left to right, top to bottom. Should
the static box be too small to hold all the text, the text is clipped to fit inside the
box. Under no circumstance will text be displayed outside the bounds of the static
box (10 10 180 80 in our example above).
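The formatting rules just described (break only at white space, fill left to right and top to bottom, clip whatever does not fit) can be approximated in Python to estimate whether a message will fit before choosing box sizes. This is a rough sketch under stated assumptions, not StataQuest's actual algorithm: layout_static_text is a hypothetical helper, and it treats every character as exactly 4 units wide although the real font is proportional.

```python
import textwrap

def layout_static_text(text, xsize, ysize):
    """Approximate how text is wrapped and clipped inside a static text box."""
    chars_per_line = xsize // 4   # assumes roughly 4 units per character
    max_lines = ysize // 8        # each line of text is 8 units high
    lines = textwrap.wrap(
        text,
        width=chars_per_line,
        break_long_words=False,   # never split a word in the middle
        break_on_hyphens=False,   # break only at white space
    )
    return lines[:max_lines]      # text beyond the box is clipped

message = "This is the first line of text! " * 3
for line in layout_static_text(message, 180, 80):
    print(line)
```

If the returned list is shorter than what an unclipped wrap would produce, the message is too long for the box and either the box or the text should be adjusted.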
By default, all text formatted within the static box is left–justified. To right–
justify text append the right keyword at the end of the wdctl static command.
To center the text within the static box use the center keyword. For example, to
center–justify the text above within the macro mesg type:
wdctl static mesg 10 10 180 80 center
The keyword for left–justified text is left, but it can be omitted since left–justification
is the default.
Multiple wdctl static commands can be issued prior to creating the dialog box
with wdlg. For example:
global mesg1 "This is the first line of text!"
global mesg2 "This is the second line of text!"
wdctl static mesg1 10 10 180 80
wdctl static mesg2 10 45 180 80
wdlg "A StataQuest Window" 20 20 200 100
will result in:
To place the two lines of text mesg1 and mesg2 next to each other (instead of one on
top of the other), simply change the wdctl command coordinates above to 10 10 60
80 and 100 10 60 80 as shown below:
The text for the wdctl static command must be contained within a global macro.
It is not possible to specify the string directly (e.g. wdctl static "text"... is not
allowed). Should the contents of the global macro change, then this change will be
reflected in the dialog box the next time it is redrawn. The dialog box always displays
the current contents of the macros—not the contents of the macros when the dialog
box is created. The wdupdate (windows-dialog-update) command can be issued to
force the dialog box to be updated (redrawn) immediately.
2.3 Edit Boxes
To get generic user input from within a dialog window, we use an Edit Box. An Edit
Box will create an input field within the dialog window. Any text entered into this
field by the user will be accessible by the StataQuest program. For example, to create
an input field or Edit Box as StataQuest calls them, you could type:
The user can move between the edit boxes using the TAB key. Standard editing
commands are available to modify the contents of the edit boxes: left and right arrow,
backspace, delete, etc. The data entered by the user are stored in the associated global
macros input1 and input2. The content of the edit boxes can be changed at any time,
and the global macros always contain exactly what is in the edit box.
The size and location argument syntax is the same as that of the wdctl static
text boxes. Remember that these arguments are expressed in terms of the font size.
An edit box height of 8 is natural since that is the height of a character. The length
of 100 in our example would hold about 25 characters since each character is about
4 units wide. A single edit box cannot handle multiple lines of input.
The size arguments of the wdctl edit command restrict the size of the edit box
but do not restrict the length of the text since the input will scroll inside the edit box
to accommodate any amount of input. To restrict the input length, add the keywords
maxlen # to the end of the wdctl edit command where “#” is the maximum allowable
input length in characters. For example, to restrict the two edit boxes in our example
above to 10 and 15 characters of input, use the following StataQuest commands:
2.4 Check Boxes
A Check Box is an on/off switch (boolean switch). The on/off switch derives its
name from the fact that it gathers information from the user by indicating one of
two possible states (i.e. on or off). For example, if we allow the users to select three
options to our code, we place three check boxes on the dialog window, and the users
check which of the three options they desire (each option is selected independently of
the others). For example, the following code:
global cbox3 1
wdctl check "Check Box 1" 10 10 48 8 cbox1
wdctl check "Check Box 2" 10 20 48 8 cbox2
wdctl check "Check Box 3" 10 30 48 8 cbox3
wdlg "A StataQuest Window" 20 20 200 50
To toggle a check box, move the mouse pointer over the check box and click the left
button. Note that we created our three check boxes with three calls to wdctl check.
The third argument ("Check Box 1") is the text associated with the check box. The
last argument (cbox1) is the global macro that StataQuest will set to 0 or 1 depending
on whether that particular check box is checked. Since we set the macro for Check
Box 3 to “1” before creating the dialog window with wdlg, that check box was turned
on by default. The size arguments were set to 48 8 since the height of a check box
is 8 (the height of a character) and the width of the check box in our example is 12
characters (11 characters of text plus the check box itself, for a total of 12 characters, or
a 48–unit check box width). The optional ending arguments right or left indicate
the check button location: to the right or left of the text string.
global vradio 3
wdctl radbegin "Button 1" 10 10 36 8 vradio
wdctl radio "Button 2" 10 20 36 8 vradio
wdctl radio "Button 3" 10 30 36 8 vradio
wdctl radend "Button 4" 10 40 36 8 vradio
wdlg "A StataQuest Window" 20 20 200 50
would produce this dialog window with a four–way radio button setup:
To select a radio button, move the mouse pointer over the radio button of choice and
click on the left mouse button. Three different wdctl commands are used to create
a radio button group. A radio button group is started with wdctl radbegin, and
the group is ended with the wdctl radend command. All other intermediary buttons
are created using the wdctl radio command. As with the Check Boxes, the third
argument ("Button 1") is the text placed next to a particular radio button. The
size and location arguments, as with all other wdctl commands, describe the offset
and size of the radio button within the dialog window. Using similar reasoning as
with the Check Boxes above, the length argument of 36 was chosen since we have
8 characters in the description string (“Button 1”) and the radio button itself
occupies one character position. Thus, we have a 9–character, or a 36–unit width, for
this radio button. As before, the height of 8 units reflects the height of a character
in the StataQuest dialog window. The global macro vradio will contain the number
of the radio button that is checked by the user (1 through 4 in our example above).
Note that you must use the same global macro name for all of the radio buttons in
a group. The optional ending arguments of right or left indicate the radio button
location—to the right or left of the text string.
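The sizing rule used for both check boxes and radio buttons (one character position for the control itself plus the label, at roughly 4 units per character and 8 units of height) can be written as a small Python helper. This is a planning sketch, not StataQuest code; control_size is a hypothetical name, and the width is an estimate because the font is proportionally spaced.

```python
def control_size(label):
    """Estimate the xsize and ysize for a check box or radio button label."""
    width = (len(label) + 1) * 4  # label characters plus one for the control
    height = 8                    # one line of text
    return width, height

print(control_size("Check Box 1"))  # (48, 8), as used for the check boxes
print(control_size("Button 1"))     # (36, 8), as used for the radio buttons
```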
2.6 Buttons
Buttons are crucial in StataQuest; they direct the flow of the program. A button, if
pressed, calls a StataQuest program. Most StataQuest programs need at least one
button, but three is more common: Run, Exit and Help. In general, the flow of the
StataQuest program is set up so that when the Run button is pressed, the information
gathered in the dialog box is processed, and the results are displayed. When Exit is
pressed, the dialog box exits back to the main prompt; and when Help is pressed, we
get a help screen describing the StataQuest program or the current dialog window.
For example:
would produce this dialog window with the three user buttons described above:
In this example, we create three buttons at locations (10,10), (40,10) and (70,10).
The button text (argument 3) is centered within the button box. The last argument
is a macro containing the StataQuest routine to call when that button is pressed.
In our example we call exit 1234 when the Exit button is selected (this command
simply closes the dialog box). When Help is pressed, we call the help function which
will complain "help for help not found." The Run button will display a stopbox
(see below). The last required argument in wdctl button contains the StataQuest
command executed when the corresponding button is pressed. The command must
be contained within a macro; you cannot specify the command string directly. For
example, the Exit button above could not be specified as:
since "exit 1234" is not a macro. We must first assign the StataQuest command
"exit 1234" to a macro (mbut2), and then we use that macro in our "wdctl button"
command.
Note the optional keyword help on the third wdctl button command. The help
keyword binds the function key “F1” to the button macro (in this case mbut3). This
way, a user can press “F1” and receive the help screen that corresponds to the current
dialog box. In other words, pressing “F1” is now the equivalent of clicking on the
help button.
2.7 Stop Boxes
Stop Boxes are a convenient way to display error or critical informational messages
to the user. In the example above, a stop box informs the user that the Run command
has not yet been implemented. Another example would be:
sstopbox stop "Sorry, this has not yet" " been implemented!"
Note that we can place up to 4 lines of text in the stop box. A stop box always
returns with an error code of 1 (the result code).
2.8 Frames
Frames are used in a StataQuest program to separate the main window into sections.
Essentially, a frame command draws a black rectangular frame in the dialog window.
To draw a frame, we use the same wdctl static command as we do when displaying
text; except, we add the keyword blackframe as the last argument. For example, to
draw a black frame around text one could use the commands:
The third argument (mesg) is a dummy argument that does nothing in the first wdctl
command. It is needed, however, as a placeholder for the third argument. The
location and size arguments are specified in the same manner as all wdctl commands.
Since in our example we are drawing the frame around some text, we make the
box 2 units taller and 2 units longer in each direction so that the text box can be
comfortably displayed within the frame. Drawing the frame erases the contents of
the entire frame box so it is important that the frame be drawn first. Switching the
two wdctl static commands above would result in a frame without any text—the
text is erased when the frame is drawn.
To select a list box choice, simply left click on that list box element. Use the scroll
bar to scroll through all the available choices. Users can also type the preferred choice
(or anything) in the edit box at the top of the list box. Selecting one choice erases
the previous choice in the list box. See below for an alternative to this behavior.
The syntax of List Boxes is straightforward. The third argument of the wdctl
ssimple command (vresult) is a macro name which contains both the initial default
value for the list box and the result. The fourth argument (vlist) is a macro contain-
ing the list of choices for the list box. The size and location parameters (10 10 40 36)
are interpreted as usual. Our example height of 36 guarantees a list box where three
list box choices are visible to the user at a time. The rest of the choices are accessed
through the scrollbar on the right. The 36 is calculated as follows: 8 for the edit box
containing the list box choice, plus 8 times the number of list box choices that are
to be displayed, plus 4 overhead. In our case that is 36 (8+3*8+4). A height of 35
would result in StataQuest only displaying 2 list box choices, while a size of at least
44 would be required for a list box displaying 4 choices.
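The height rule above (8 for the edit box, 8 per visible choice, plus 4 overhead) and its inverse can be checked with a short Python sketch. The function names are hypothetical and the code only restates the arithmetic given in the text.

```python
def list_box_height(visible_choices):
    """Height needed to show the given number of list box choices."""
    return 8 + 8 * visible_choices + 4

def visible_choices(height):
    """Number of whole choices that fit in a list box of the given height."""
    return max((height - 12) // 8, 0)

print(list_box_height(3))   # 36
print(visible_choices(35))  # 2
print(visible_choices(44))  # 4
```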
The list of selections contained in the fourth argument is separated by spaces.
Should the need arise to have a selection contain a space, it is necessary to specify a
different list separating character by adding the keyword parse(;) to the end of the
wdctl ssimple command. In the following example:
the three list choices are separated by a “;”, and each list choice contains spaces.
The list separation character can be any valid character. Note that no white space
may separate the parse keyword from the “(;)”: “parse (;)” is not valid syntax.
A variation on the Simple List Box above is the Multiple List Box specified by
the command wdctl msimple. The multiple list box is identical to the simple list box,
except that multiple selections can be chosen. Each selection is concatenated to the
contents of the list box as the user clicks on the list. If we wanted to allow the user
to select multiple options in the list box above, we could use the command:
...
wdctl msimple vresult vlist 10 10 40 36
...
(The rest is the same as in the simple list box example above.) Now the user can
select all the options desired from the list box. Each option selected will be appended
to the existing option. Standard Windows editing controls can be used to edit the
selected choices in the edit box. As with wdctl ssimple, the parse() keyword can
also be used as an option in the wdctl msimple command.
which would display in one of the following two ways:
The left image above is the combo box as it is originally displayed. Should the user
choose to expand the combo box by left–clicking on the downward pointing arrow we
would see the right image above. See the section on List Boxes for a discussion on
the size parameters of the list box which also apply to the combo box.
As with the List Boxes, Combo Boxes may be one of two types: simple and
multiple. The distinction is the same as with the List Boxes in that the Simple
Combo Box causes a user selection to erase any old selection, while a Multiple Combo
Box adds a user’s selection (possibly multiple selections) to the current selection. The
multiple combo box is specified using the wdctl mcombo command. For example, to
change the simple combo box above to a multiple combo box, we would change the
command:
to
Everything else would stay the same. The parse() keyword is valid for both the wdctl
scombo and wdctl mcombo commands. See the section on List Boxes for a description
of parse().
3 The Menu
The menu bar at the top of the StataQuest window is used as a launching pad to
many StataQuest programs, options and features. This menu bar can be customized
to suit a programmer’s needs. It is possible to add menu items to existing menu
bars or to change or even eliminate the menu bar. In this section we discuss how to
program the menu bar.
Internally, the default menu bar looks like this:
We see this menu bar only if we start StataQuest without calling the standard
sqmenu.ado file, which is normally called on startup (it is part of the command line
that launches StataQuest). The ADO file sqmenu.ado defines the standard StataQuest
menu bar to be:
Should the default internal menu bar be active instead of this one, simply type sqmenu
to load and activate the standard StataQuest menu bar. Every menu bar has a
name associated with it. The internal menu bar is called sysmenu, while the default
StataQuest menu bar is called StataQuest. To make the sysmenu menu bar active, use
the command:
The StataQuest menu bar is defined in the sqmenu ADO file located at
$HOME\sqado\s\sqmenu.ado where $HOME is the StataQuest home directory (by default this
will be c:\wstataq). This file should be examined for a detailed example of creating
a menu bar.
To create a new menu bar we use the command:
which creates the empty menu bar MyMenuBar. No changes are made to the current
menu bar until the wmenu set command is issued. The name of the menu bar can
be surrounded by quotes. It must be surrounded by quotes if the name contains
spaces (e.g. wmenu popout "My Menu Bar"). Activating an empty menu bar has the
interesting effect of eliminating the menu bar altogether. To try this, type:
wmenu set MyMenuBar
To add one or more items to the top of the menu bar use the wmenu append popout
command. For example, to add three items called Lab 1, Lab 2 and Help, type:
The last line activates the menu bar. The menu bar should now look something like
this:
Note that the ‘1’ and ‘2’ and the ‘H’ of Help are underlined; they represent shortcut
keys to the menu item. For example, if we type Alt-1, the pull down menu associated
with Lab 1 is displayed. (Currently, there is no pull down menu defined, so Alt-1 has
no effect.) The optional shortcut key is defined by the ‘&’ in the menu item name.
The character that follows the ‘&’ is the shortcut key to that menu or item.
Each popout menu item generally has further menu items attached to it. After all,
the purpose of the popout keyword is to create a new pulldown or flyout menu with
more menu items. The alternate keyword string is used to associate a StataQuest
action with the item itself. For example, to add a pulldown menu to the Lab 2 menu
bar item above, one could say:
Clicking on the Lab 2 menu bar item creates a pulldown menu consisting of three
items. The first two are flyout menus Menu 1 and Menu 2 which will contain further
menu items. The third entry, Item 3, will execute the StataQuest command command1.
This could be any StataQuest command, but it generally will be an ADO file. Finally,
note the third keyword separator that can be used with wmenu append. As the name
suggests, it creates a separator in the pulldown or flyout menu that can be used to
group menu items into logical units. The only purpose of the separator is aesthetic.
The syntax of the wmenu append commands is always as follows: the fourth argument
(e.g. Lab 2) is the popout menu to which this item is attached or appended. The
fifth argument is the name of the menu item (e.g. Menu 1), and the sixth argument
is the StataQuest command associated with the menu item.
A pulldown menu is created when one appends a menu to a menu bar item. Thus,
Lab 2 is a pulldown menu. A flyout menu is a menu that displays to the right of
a pulldown or prior flyout menu. (Note the arrows indicating a flyout menu in the
example above.) In other words, a first level popout menu item is added to the menu
bar, the second level popout menu is a pulldown menu and the third and following
levels are all flyout menus.
The flyout menus can be nested to multiple levels. For example, to add two items
to the Item 1 flyout menu, which itself contains another flyout menu, one could
write the following:
To get the above display one would, of course, have to open all the appropriate menus.
4 Help Files
StataQuest has a hypertext–based help system that is user extensible. Help files have
the extension .hlp, and they must be located in the StataQuest search path in the
same manner as ADO files. A StataQuest help file is opened with the command
whelp. For example, the command:
whelp sqhelp
will display the introductory StataQuest help file sqhelp.hlp as seen in Figure 2. The
first few lines of the actual sqhelp.hlp file (located at $HOME\sqado\s\sqhelp.hlp) are
displayed in Figure 3.
There are four components that make up the help file. First, we have the title:
.-
help for ^StataQuest^
.-
.-
help for ^StataQuest^
.-
In this help file -- or any other help file -- clicking on a green word or
green phrase takes you to a help file for that topic.
Try it. The phrase "click me" is in green below. Click on it.
Then click on the ^Back^ button at the top of the screen to come back to
this help file.
@click_me!click me@
they indicate that the text is to be highlighted (printed in blue). Any amount of text
can be highlighted, but the highlight command must be issued for each line. In other
words, the ^ cannot appear in one line and be closed in another.
The last component is the hypertext link which StataQuest highlights in green.
To place a hypertext link into the help file we use the syntax:
where help file is the name of the StataQuest help file and the Hypertext Link
Description is the corresponding text that StataQuest will display. So, for example,
@click_me!click me@ will display click me in green and will display the help file
click_me.hlp when and if the user clicks on that link. Note that the mouse cursor
changes from an arrow to a hand when a hypertext link is available. Move the arrow
over some green text to see this effect.
Two final notes: to display a @ or a ^ in a StataQuest help file, use the syntax
@@ and ^^ respectively. Also, the hypertext description that follows the ! above is
optional. If it is not present, then StataQuest will simply print the name of the help
file as the hyperlinked text.
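The markup rules described in this section (^...^ for highlighting within one line, @file!description@ for links with an optional description, and @@ and ^^ for the literal characters) can be summarized with a small Python parser. This is a sketch for checking help files outside StataQuest, not the parser StataQuest itself uses; parse_help_line and the tuple layout are hypothetical.

```python
import re

# One token: an escaped @ or ^, a link (@file@ or @file!text@), or a
# ^highlight^ that opens and closes on the same line.
TOKEN = re.compile(r"@@|\^\^|@([^@!]+)(?:!([^@]*))?@|\^([^^]+)\^")

def parse_help_line(line):
    """Split one help file line into (kind, text, target) tuples."""
    parts = []
    pos = 0
    for m in TOKEN.finditer(line):
        if m.start() > pos:
            parts.append(("plain", line[pos:m.start()], None))
        token = m.group(0)
        if token == "@@":
            parts.append(("plain", "@", None))
        elif token == "^^":
            parts.append(("plain", "^", None))
        elif token.startswith("@"):
            target = m.group(1)
            # Without a description, the help file name itself is shown.
            text = m.group(2) if m.group(2) is not None else target
            parts.append(("link", text, target))
        else:
            parts.append(("highlight", m.group(3), None))
        pos = m.end()
    if pos < len(line):
        parts.append(("plain", line[pos:], None))
    return parts

print(parse_help_line("help for ^StataQuest^"))
print(parse_help_line("@click_me!click me@"))
```

Because the parser works on one line at a time, a ^ opened on one line and closed on the next is not recognized as a highlight, which mirrors the restriction stated above.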
5 Graphics
StataQuest can generate a rich variety of graphs and plots. In this section, we will
describe how to extend StataQuest’s graphics commands. We will assume the reader
is familiar with the StataQuest graph command which generates scatter plots, line
plots, histograms, bar plots, etc. A basic understanding of Stata programming will
be necessary to continue since we will concentrate on demonstrating the graphics
commands and not on the Stata programming language itself.
program define sq
gph open
gph close
end
sq /* run the program */
Figure 4: Some graphics drawn using low level StataQuest gph commands.
program define g1
version 4.0
preserve /* preserve current data */
clear /* clear data */
* gph open, saving(c:\ado\test, replace) /* alternative open */
Stata programmers will be familiar with the basic layout of this ADO program. We
define the ADO function g1 and state explicitly that version 4.0 of Stata is needed to
run this ADO file. (Remember that StataQuest is based on version 4.0 of Stata.) The
preserve command preserves the current dataset in memory so that we can restore
it with restore at the end of the ADO program. The clear command clears the
current dataset from memory so that we are free to create our own. We bracket our
graphics commands with gph open and gph close. The gph close does not destroy
the graphics window itself. Note the commented out gph open command preceded by
a ‘*’. This alternate version of gph open will save the graphics window into a Stata
graphics (.gph format) file instead of displaying it on the screen. StataQuest will not
overwrite an existing .gph file unless the replace option is specified.
StataQuest’s low-level graphics coordinate system defines (0, 0) to be the upper
left-hand corner and (23000, 32000) to be the lower right-hand corner. Furthermore,
coordinates are always specified in (y, x) order. Thus, the StataQuest graphics
window is always 32000 units wide and 23000 units high. It is on this grid that all
graphics commands operate. Interactively resizing the graphics window has no effect
on the underlying coordinate grid. Working on a 23000 by 32000 grid can be tedious
at times when trying to position text or graphics on the window. The commented out
grid() function above draws a grid, with user-defined grid element size and pen, on
the graphics window. (In our example the grid size is 2000 by 1000 and the pen is
8.) The grid function is not part of StataQuest, but writing grid() is simple (see
Figure 5).
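Because every gph command takes absolute grid units in (y, x) order, it can help to think in fractions of the window and convert. The following is a minimal sketch in Python, used here only to illustrate the arithmetic; it is not StataQuest code. The function name to_grid and its fractional arguments are hypothetical. The only facts taken from the text are the 23000 by 32000 grid and the (y, x) ordering.

```python
GRID_HEIGHT = 23000  # grid units from the top edge (y = 0) to the bottom edge
GRID_WIDTH = 32000   # grid units from the left edge (x = 0) to the right edge


def to_grid(frac_down, frac_across):
    """Convert fractions of the window (0.0 to 1.0) into grid units.

    Returns (y, x): row first, then column, which is the argument
    order used by the gph commands described in this section.
    """
    if not (0.0 <= frac_down <= 1.0 and 0.0 <= frac_across <= 1.0):
        raise ValueError("fractions must lie between 0 and 1")
    return round(frac_down * GRID_HEIGHT), round(frac_across * GRID_WIDTH)


# The centre of the window, e.g. as an anchor for centred text.
y, x = to_grid(0.5, 0.5)
print(f"gph text {y} {x} 0 0 Centre")
```

The printed line has the form of a gph text command and would still have to be issued from inside a Stata program between gph open and gph close.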
program define grid /* program to draw a grid */
version 4.0
The lines are drawn using line style 1, or pen 1 as StataQuest calls it. Note that all
9 pens can be defined by the StataQuest user to be of any color or thickness desired
(pulldown menu Prefs-Graph-Colors for colors and Prefs-Graph-Line Thickness for
pen thickness). Furthermore, there is no gph command to allow the programmer
to set the color or line style of a pen. The implication of this programming design
(having to do with programming plotters) should be obvious: there is no way the
programmer can know what a given pen style will actually draw on the screen! The
best way to approach this problem is to assume StataQuest defaults when creating
graphics. The default line width is 1 unit. The default colors for the 9 pens are:
StataQuest also defines a background color which is black by default. The background
color is treated in a special manner in StataQuest. Even though the default is black,
when printing, the background color is not printed unless the user specifies otherwise
(pulldown menu: Preferences-Graph).
...
gph font 2000 500
gph text 15000 29000 0 1 Arc’s
gph font 1000 500
...
gph text 14500 7500 0 0 "Point Styles"
The gph font command is a misnomer since only one font exists in StataQuest graphics.
The font command changes the size of the font to any arbitrary size. In our first
gph font command, we set the font size to 1000 units vertically and 500 units
horizontally.
Text is displayed at coordinates (y,x) either horizontally or vertically (argument
#3 is 0 or 1 for horizontal and vertical text respectively). The text can be centered
on those coordinates as well as left- or right-justified (argument #4 is 0, −1 or 1 for
center-, left- or right-justified respectively). The text string to be displayed should
not be in quotes, unless there are spaces in the text string. Adding quotes to a text
string without spaces will result in the quotes being printed.
We display the graph generated by the above code segment in the upper left
quadrant of Figure 4. The vline and vtext commands operate on StataQuest variables.
In our example, we begin by setting the data set size to 28 data points and
by initializing the seed of the random number generator to assure identical results
every time the graph is drawn. We continue by generating 28 points in our x and
y variable columns. The x variable simply steps from 1500 to 15000 in increments
of 500, while the y variable holds 28 random normals scaled by 2000 and shifted by
8000. Finally, we generate the variable z which contains the strings “1” through
“28”. We draw the connected line segments using gph vline and plot the text strings
at the data points using gph vtext.
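The data preparation described above can be sketched outside Stata to check that the generated values really fall on the graphics grid. The Python below mirrors the generation statements in the Appendix A.1 listing (x = _n*500+1000, y = normal*2000+8000, z = string(_n)). The function name make_series is hypothetical, and Python's random number generator is not Stata's, so the y values will not match those in Figure 4.

```python
import random


def make_series(n=28, seed=1234):
    """Return (x, y, labels) lists already expressed in grid units."""
    rng = random.Random(seed)                        # fixed seed: same values on every run
    x = [i * 500 + 1000 for i in range(1, n + 1)]    # 1500, 2000, ..., 15000
    y = [rng.gauss(0, 1) * 2000 + 8000 for _ in x]   # scattered around row 8000
    labels = [str(i) for i in range(1, n + 1)]       # "1" through "28"
    return x, y, labels


x, y, labels = make_series()
print(x[0], x[-1], labels[0], labels[-1])
```

Note that the variables hold grid units directly; no further scaling is applied before they are handed to gph vline and gph vtext.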
There are two other commands to draw multiple lines in StataQuest. They are
gph vseg and gph vpoly. The vseg command draws unconnected line segments. The
syntax is:
where the beginning of a segment is located at (yvar1,xvar1), and the end of the line
segment is at (yvar2,xvar2). The gph vpoly command has the following syntax:
which will draw multiple p-sided unfilled polygons or partial polygons. Each polygon
is defined by the points (y1,x1), (y2,x2),..., (yp,xp). For the polygon to be
complete, the last data point needs to be the same as the first.
5.6 Points
There are six plotting symbols available in StataQuest. They are shown in the lower
left corner of Figure 4. We displayed them using the following code:
local style = 1
while `style' <= 6 {
    local offset = 1000 + `style' * 2000
    gph point 17000 `offset' 1000 `style'
    gph text 20000 `offset' 0 0 `style'
    local style = `style' + 1
}
The gph point command will plot the desired plotting symbol or point at location
(y, x). The third argument is the size of the symbol, and the fourth argument is the
symbol type to be plotted. Figure 4 displays the 6 symbols available. In our example,
all 6 symbols are drawn in the same size of 1000 units. When drawn in default size,
the last 3 symbols are about 50% smaller than the first three.
Like gph vline, multiple points can be plotted using the command gph vpoint.
The syntax is:
where the points are defined by the variables (yvar,xvar). The sizes of the points are
contained in the variable svar and the point styles are in pvar.
5.7 Boxes
Boxes can be generated in StataQuest with the gph box command:
The result is displayed in the upper right quadrant of Figure 4. The first gph box
command above draws a box from the upper left corner at (1000, 20000) to the lower
right corner at (10000, 30000). Our five boxes above get progressively smaller and
are each contained within the previous box. The first box is shaded in style #0. The
shading is darkened with each box until we reach style #4 (which is black). The pen
color determines the shading colors for all styles except #4.
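The five nested boxes described above follow a simple pattern: each box is inset from the previous one and uses the next shading style. As a rough illustration, the Python sketch below generates the corresponding gph box command lines. The function name nested_boxes and the inset of 1000 units per side are assumptions; the inset is inferred from the box coordinates in the Appendix A.1 listing. The first box is the one given in the text.

```python
def nested_boxes(top, left, bottom, right, step=1000, count=5):
    """Return gph box command strings for `count` nested boxes.

    Each box is inset by `step` grid units on every side relative to
    the previous one, and the shading style runs from 0 upward.
    Coordinates are in (y, x) order: upper left corner, then lower right.
    """
    commands = []
    for shade in range(count):
        inset = shade * step
        commands.append(
            f"gph box {top + inset} {left + inset} "
            f"{bottom - inset} {right - inset} {shade}"
        )
    return commands


for line in nested_boxes(1000, 20000, 10000, 30000):
    print(line)
```

The last generated line uses shading style 4, the black box discussed in the next paragraph.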
The last and smallest box above is shaded black and thus it will appear as if it is
the same color as the background color, assuming standard StataQuest colors are in
effect. It is important to remember, however, that the box is actually black and not
the background color. If the above boxes are printed they will appear progressively
darker, and the last box will print as black. The implication here is that drawing
a box shaded black is not the correct method to clear a section of the graph for
further drawing: even though it might look acceptable on the screen, it will
print incorrectly. To clear a section, we need to explicitly use the gph clear command.
This will clear a rectangular area of the graph by setting it to the background color,
allowing further graphics to be drawn on and about the cleared area. If this seems
confusing, we suggest that the background color in StataQuest be set to a different
color than black. This will make the difference between gph box ... 4 and gph clear
obvious. Remember that the background color will, by default, not print on the
printer. Stata 4.0 for DOS users should note that gph clear is not implemented in the
Stata programs gphpen and gphdot, which convert Stata GPH files to device-dependent
output (PostScript, HP PCL, etc.). These two utilities are not supplied with
StataQuest; they are only available in the full version of Stata.
5.8 Arcs
StataQuest can draw pie shaped wedges using the gph arc command:
local arc = 0
local line = 0
local radius = 2000
while `arc' <= 32767 {
    local arc2 = `arc' + 2048 /* each wedge spans 2048 arc units */
    gph arc 17000 24000 `radius' `arc' `arc2' `line'
    local arc = `arc2'
    if `line' == 3 { /* after shading style 3, start over */
        local line = 0
        local radius = `radius' + 1000
    }
    else {
        local line = `line' + 1
    }
}
gph arc 17000 24000 1000 0 24576 4 /* center pie wedge */
The result of the above code segment can be observed in the lower right quadrant
of Figure 4. The gph arc command draws an arc of (almost) any size and radius.
Examine the last gph arc command in the code segment above. There we draw a
pie-shaped wedge whose center is at (17000, 24000) with a radius of 1000 units. The
center pie wedge is drawn from angle 0 to 24576 using shading style #4. The shading
styles are the same as they are for boxes. Note the strange angle designation: 0
to 24576. The angle argument in gph arc is measured clockwise from 0 to 32767.
Thus, every 45 degrees is equivalent to 4096 units and every degree is about 91.0222
units, so 24576 units corresponds to 270 degrees. The gph arc command has a strange
quirk. The angle specified must be between 0 and 32767 inclusive. It is not possible
to draw an entire circle using the arc specification of 0 to 32768. Below is a table
to help convert degrees to StataQuest arc coordinates:
Degrees  Units    Degrees  Units    Degrees  Units    Degrees  Units
   0°        0      90°     8192     180°    16384     270°    24576
  15°     1365     105°     9557     195°    17749     285°    25941
  30°     2731     120°    10923     210°    19115     300°    27307
  45°     4096     135°    12288     225°    20480     315°    28672
  60°     5461     150°    13653     240°    21845     330°    30037
  75°     6827     165°    15019     255°    23211     345°    31403
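For angles not listed in the table, the conversion is a single multiplication. The Python sketch below illustrates it, on the assumption that a full turn corresponds to 32768 units, which is what the table implies (90 degrees is 8192 units). The function names are hypothetical, and the code only computes numbers; it does not draw anything.

```python
ARC_UNITS_PER_CIRCLE = 32768  # implied by the table: 90 degrees = 8192 units


def degrees_to_arc(degrees):
    """Convert an angle in degrees to arc units, rounded to the nearest unit."""
    if not 0 <= degrees < 360:
        raise ValueError("angle must satisfy 0 <= degrees < 360")
    return round(degrees * ARC_UNITS_PER_CIRCLE / 360)


def arc_to_degrees(units):
    """Convert arc units back to degrees (as a float)."""
    return units * 360 / ARC_UNITS_PER_CIRCLE


for deg in (0, 15, 45, 270, 345):
    print(deg, degrees_to_arc(deg))
```

Rounding to the nearest unit reproduces the table entries, for example 1365 for 15 degrees and 31403 for 345 degrees.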
5.9 Snooze
The snooze command is not really graphics related, but since it is often used when
drawing graphics, we present it here. The snooze command will simply sleep the
computer for the indicated number of milliseconds. So, for example,
snooze 2500
will suspend program execution for 2.5 seconds. This command is often used to create
artificial delays in program presentations.
5.10 Interfacing with StataQuest Graphics
Nonetheless, most of the time, a user does not desire to start from scratch when
designing a data plot of some kind. For example, StataQuest can generate a his-
togram using the graph command. If we desired to create a new histogram that
adds some new features to the standard histogram, we would want to build on the
StataQuest histogram graphing routine instead of starting from scratch. Fortunately,
StataQuest does provide for this ability.
Technically the process of adding gph commands to an existing graph is simple.
The following command sequence within a program will do this:
program define g1
...
gph open /* We open the graphics window. */
graph ... /* We graph data to that window. */
...
gph ... /* We add our own graphics elements. */
gph ...
...
gph close /* We close the graphics window. */
...
end
Note the following few items. First, we must open the graphics window with gph
open prior to the first graph command because gph open clears the graphics window.
The graph command draws the axes and plots the data. Subsequent gph commands
can be used to annotate or otherwise modify the output of the graph command.
Finally, we close the graphics window with gph close. We must do all this within a
program since we cannot call gph commands from the command line directly.
The StataQuest graph command plots the data on the same device-independent
grid as the gph commands. This presents the following problem: if we desire to
write a program to add data-dependent features to the output of a StataQuest graph
command, we need to be able to map the data to the device-independent grid. For
example, examine the output of Figure 6 where we visually summarize a variable
by plotting it against the observation number on the x axis and drawing lines to
indicate the mean and standard deviation of the data. We also highlight the 5 largest
and smallest data points by circling them. To write such a program we need to know
how to map a given data point (x, y) to the device-independent grid so that we can
circle it.
StataQuest solves this problem by placing the conversion factors it uses in the
_result() vector and the plotting area in two global S_ macros. Stata programmers
will be familiar with the _result() vector; it is a vector returned by many StataQuest
commands containing the results of the just completed Stata command. In the case
of graph, the _result() vector contains the following 8 elements:
Figure 6: Sample output of a Visual Summarize Program
These values can be used to map any data point (x, y) to the device-independent
graphics window (x_d, y_d) using the formulas:
x_d = a_x × x + b_x
y_d = a_y × y + b_y
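The two formulas amount to one linear map per axis. The Python sketch below applies them to the data point and coefficients used in the worked example of this section (observation 56, y = 2.622). The function name to_device is hypothetical, and the symbol size 500 and style 1 in the printed line are copied from the vsum listing in the appendix, not prescribed by the formulas.

```python
def to_device(x, y, ax, bx, ay, by):
    """Map a data point (x, y) to device-independent grid units.

    Implements x_d = a_x * x + b_x and y_d = a_y * y + b_y and returns
    the pair in (y, x) order, the order the gph commands expect.
    """
    xd = ax * x + bx
    yd = ay * y + by
    return yd, xd


# Coefficients and data point taken from the worked example in this section.
yd, xd = to_device(x=56, y=2.622, ax=257.778, bx=5167.22,
                   ay=-3095.154, by=9171.17)
print(f"gph point {yd:.0f} {xd:.0f} 500 1")
```

The slope for y is negative because grid rows grow downward from the top of the window while data values grow upward on the plot.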
size graph that will always be a height of 923 and width of 444). The rotate value
is either 0 or 1 (a 1 would indicate the string is rotated by 90 degrees). Since the
contents of the two S_ macros are strings, we will need to parse the strings before we
can access individual elements in the macro:
This will display the graph on which Figure 6 is based. Note the result vector: it tells
us that the y axis of the graph is plotted from −3.42 to 2.62, while the x axis ranges
from 1 to 100. If we wanted to circle observation number 56 where y is 2.622218 (the
maximum), we would calculate its location on the device-independent grid by:
x_d = a_x × x + b_x = 257.778 × 56 + 5167.22 = 19602.78
y_d = a_y × y + b_y = −3095.154 × 2.622 + 9171.17 = 1055.68
Thus, if done within a program define, the command:
would circle the 56th data point on the graph. Note the contents of the S_2 macro:
S_2: 1055,5425,19771,30945,443,213,0
We can use this information to clear the plotting area of the graph without disturbing
the axes:
6 Complete Example Program
In Appendix A.3 we present the StataQuest ADO program vsum that plots a univariate
data set and annotates it by indicating the location of the mean, the standard
deviation (relative to the mean) and by highlighting the five smallest and largest data
points. We use this program as a basis for a simple StataQuest lab. We want to use
vsum to repeatedly display randomly generated data from a few different distributions.
To do this, we write a dialog box that prompts the user for the distribution of the data
and the sample size using simple list boxes. We allow the user (with check boxes)
to choose if they want to annotate the display with the mean, standard deviation
and extreme data points. Appendix A.2 lists the ADO dialog box program simv that
serves as our front end to vsum. For completeness, we also have the small help file
listed in Appendix A.4. Figure 7 displays the sample output of our program.
Our complete program consists of the three files listed in Appendices A.2, A.3
and A.4. They should be saved in files called simv.ado, vsum.ado and simv.hlp
respectively. All three files can be placed in the C:\ADO directory which is intended to
hold user supplied ADO files. Once the files are in place, to execute the program we
type simv from the StataQuest command line. We can repeatedly generate new plots
by pressing the Run button.
A Appendix: Example Programs
A.1 Sample StataQuest Graphics
/*
* Program to demonstrate various StataQuest graphics commands.
*/
program define g1
version 4.0 /* StataQuest is based on Stata Version 4.0 */
/*** We save the current data set and clear all data from memory. */
preserve
clear
/*** We open the graphics window. The alternate 'gph open' command
will create a Stata type GPH graphics file 'g1.gph' */
gph open
* gph open, saving(c:\ado\g1, replace)
gph pen 1
gph line 11600 0 11600 32000 0
gph line 0 16000 23000 16000 0
gph box 2000 21000 9000 29000 1
gph box 3000 22000 8000 28000 2
gph box 4000 23000 7000 27000 3
gph box 5000 24000 6000 26000 4
gph pen 1
gph text 3250 27500 0 0 clear
gph pen 2
gph font 2000 500
gph text 15000 29000 0 1 Arc’s
gph font 1000 500
local arc = 0
local line = 0
local radius = 2000
while `arc' <= 32767 {
    local arc2 = `arc' + 2048
    gph arc 17000 24000 `radius' `arc' `arc2' `line'
    local arc = `arc2'
    if `line' == 3 {
        local line = 0
        local radius = `radius' + 1000
    }
    else {
        local line = `line' + 1
    }
}
gph arc 17000 24000 1000 0 24576 4 /* center pie wedge */
gph pen 3
gph text 14500 7500 0 0 Point Styles
local style = 1
while `style' <= 6 { /* all six plotting symbols */
    local offset = 1000 + `style' * 2000
    gph point 17000 `offset' 1000 `style'
    gph text 20000 `offset' 0 0 `style'
    local style = `style' + 1
}
set obs 28
set seed 1234
gen x = _n*500+1000
gen y = invnorm(uniform())*2000+8000
gen str3 z = string(_n)
gph pen 7
gph line 8000 500 8000 15500 0
gph pen 5
gph vline y x
gph pen 6
gph font 600 300
gph vtext y x z
/*** We close the graph. If we are using the 'gph open, saving...'
command above, the 'gph close' statement will dump the graphics
to the StataQuest file 'c:\ado\g1.gph'. */
gph close
restore
end
A.2 StataQuest Dialog Box Program: simv
/*
* A simple interface to vsum that generates random data from a
* user selectable distribution and displays it using vsum.ado.
* The first program 'simv' sets up the dialog box, the second
* program 'runv' parses the output of 'simv', generates the
* data and calls 'vsum'.
*/
/*** Introduction */
global mbut3 "whelp simv"
restore
end
local dist=0
quietly set obs $sizepick
if "$distpick" == "Normal" {
generate y = invnorm(uniform())
local dist=1
}
if "$distpick" == "Uniform" {
generate y = uniform()
local dist=1
}
if "$distpick" == "Exponential" {
generate y = -log(uniform())
local dist=1
}
if "$distpick" == "Gamma" {
generate y = invgammap(1,uniform())
local dist=1
}
if "$distpick" == "Cauchy" {
generate y = invt(1,uniform())
local dist=1
}
if `dist' == 0 {
sstopbox stop "Unknown distribution."
exit
}
/*** First we preserve the current data set so that we can restore
it. Note that we do not want to clear the data since we will be
plotting it! */
preserve
local y = "`1'"
if "`draw1'" == "" {
    local draw1 = 1
}
if "`draw2'" == "" {
    local draw2 = 1
}
if "`draw3'" == "" {
    local draw3 = 1
}
gen x = _n
gph open
graph `y' x
local maxx = _result(4)
local ay = _result(5)
local by = _result(6)
local ax = _result(7)
local bx = _result(8)
summarize ‘y’
/*** We parse the S_2 macro which holds the plotting area values
that ’graph’ uses. */
/*** We draw a line for the mean value and annotate it! */
gph pen 8
if `draw1' != 0 {
local yg = `ay'*`mean' + `by'
/*** We draw two lines for the std. deviation and annotate it! */
if `draw2' != 0 {
local yg1 = `ay'*(`mean'+`stdd') + `by'
local yg2 = `ay'*(`mean'-`stdd') + `by'
/*** Circle the 5 largest and smallest points. */
if `draw3' != 0 {
    sort `y' /* `y' holds the name of the plotted variable */
    local i = 1
    while `i' <= 5 {
        local yg = `ay'*`y'[`i'] + `by'
        local xg = `ax'*x[`i'] + `bx'
        gph point `yg' `xg' 500 1
        local yg = `ay'*`y'[_N+1-`i'] + `by'
        local xg = `ax'*x[_N+1-`i'] + `bx'
        gph point `yg' `xg' 500 1
        local i = `i' + 1
    }
}
gph close
restore
end
SIMV is a front end for the VSUM program. It allows the user to
select a distribution from which to simulate random data.