P A R T I
Tcl Basics
Part I introduces the basics of Tcl. Everyone should read Chapter 1, which
describes the fundamental properties of the language. Tcl is really quite simple,
so beginners can pick it up quickly. The experienced programmer should review
Chapter 1 to eliminate any misconceptions that come from using other lan-
guages.
Chapter 2 is a short introduction to running Tcl and Tk on UNIX, Windows,
and Macintosh systems. You may want to look at this chapter first so you can try
out the examples as you read Chapter 1.
Chapter 3 presents a sample application, a CGI script, that implements a
guestbook for a Web site. The example uses several facilities that are described
in detail in later chapters. The goal is to provide a working example that illus-
trates the power of Tcl.
The rest of Part I covers basic programming with Tcl. Simple string processing
is covered in Chapter 4. Tcl lists, which share the syntax rules of Tcl commands,
are explained in Chapter 5. Control structures like loops and if
statements are described in Chapter 6. Chapter 7 describes Tcl procedures,
which are new commands that you write in Tcl. Chapter 8 discusses Tcl arrays.
Arrays are the most flexible and useful data structure in Tcl. Chapter 9 describes
file I/O and running other programs. These facilities let you build Tcl scripts that
glue together other programs and process data in files.
After reading Part I you will know enough Tcl to read and understand other
Tcl programs, and to write simple programs yourself.
I. Tcl Basics
C H A P T E R
Tcl Fundamentals 1
This chapter describes the basic syntax rules for the Tcl scripting language. It
describes the basic mechanisms used by the Tcl interpreter: substitution
and grouping. It touches lightly on the following Tcl commands: puts,
format, set, expr, string, while, incr, and proc.
Tcl Commands
Tcl stands for Tool Command Language. A command does something for you, like
output a string, compute a math expression, or display a widget on the screen.
Tcl casts everything into the mold of a command, even programming constructs
like variable assignment and procedure definition. Tcl adds a tiny amount of
syntax needed to properly invoke commands, and then it leaves all the hard work
up to the command implementation.
The basic syntax for a Tcl command is:
command arg1 arg2 arg3 ...
The command is either the name of a built-in command or a Tcl procedure.
White space (i.e., spaces or tabs) is used to separate the command name and its
arguments, and a newline (i.e., the end of line character) or semicolon is used to
terminate a command. Tcl does not interpret the arguments to the commands
except to perform grouping, which allows multiple words in one argument, and
substitution, which is used with programming variables and nested command
calls. The behavior of the Tcl command processor can be summarized in three
basic steps:
• Argument grouping.
• Value substitution of nested commands, variables, and backslash escapes.
• Command invocation. It is up to the command to interpret its arguments.
This model is described in detail in this chapter.
Hello, World!
Example 1–1 The “Hello, World!” example.
In this example, the command is puts, which takes two arguments: an I/O
stream identifier and a string. puts writes the string to the I/O stream along
with a trailing newline character. There are two points to emphasize:
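A minimal sketch of the command described above, assuming the same form as the tclsh script shown in Chapter 2. Two things are worth noticing in it: puts, not the interpreter, decides what its arguments mean, and the curly braces group the two words of the greeting into a single argument.

```tcl
# Write a string to the standard output stream.
# stdout is the I/O stream identifier; the braces group
# "Hello, World!" into one argument for puts.
puts stdout {Hello, World!}
```

Without the braces, puts would receive three arguments instead of two and would complain about its usage.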
Variables
The set command is used to assign a value to a variable. It takes two arguments:
The first is the name of the variable, and the second is the value. Variable names
can be any length, and case is significant. In fact, you can use any character in a
variable name.
It is not necessary to declare Tcl variables before you use them.
The interpreter will create the variable when it is first assigned a value.
The value of a variable is obtained later with the dollar-sign syntax, illustrated
in Example 1–2:
set var 5
=> 5
set b $var
=> 5
The second set command assigns to variable b the value of variable var.
The use of the dollar sign is our first example of substitution. You can imagine
that the second set command gets rewritten by substituting the value of var for
$var to obtain a new command:
set b 5
The actual implementation of substitution is more efficient, which is important
when the value is large.
Command Substitution
The second form of substitution is command substitution. A nested command is
delimited by square brackets, [ ]. The Tcl interpreter takes everything between
the brackets and evaluates it as a command. It rewrites the outer command by
replacing the square brackets and everything between them with the result of
the nested command. This is similar to the use of backquotes in other shells,
except that it has the additional advantage of supporting arbitrary nesting of
commands.
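The rewriting described above can be sketched with the string command. The variable names len and upper are only illustrative; string length, string range, and string toupper are standard subcommands.

```tcl
# The nested command [string length foobar] runs first. Its result (6)
# replaces the bracketed text, so the outer command becomes: set len 6
set len [string length foobar]
puts $len

# Nested commands can themselves contain nested commands.
# The innermost one yields "foo", which string toupper then converts.
set upper [string toupper [string range foobar 0 2]]
puts $upper
```

The first puts prints 6 and the second prints FOO.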
Math Expressions
The Tcl interpreter itself does not evaluate math expressions. Tcl just does
grouping, substitutions and command invocations. The expr command is used to
parse and evaluate math expressions.
expr 7.2 / 4
=> 1.8
The math syntax supported by expr is the same as the C expression syntax.
The expr command deals with integer, floating point, and boolean values. Logical
operations return either 0 (false) or 1 (true). Integer values are promoted to float-
ing point values as needed. Octal values are indicated by a leading zero (e.g., 033
is 27 decimal). Hexadecimal values are indicated by a leading 0x. Scientific nota-
tion for floating point numbers is supported. A summary of the operator prece-
dence is given on page 20.
You can include variable references and nested commands in math expres-
sions. The following example uses expr to add the value of x to the length of the
string foobar. As a result of the innermost command substitution, the expr com-
mand sees 6 + 7, and len gets the value 13:
set x 7
set len [expr [string length foobar] + $x]
=> 13
The implementation of expr is careful to preserve accurate numeric values
and avoid conversions between numbers and strings. However, you can make
expr operate more efficiently by grouping the entire expression in curly braces.
The explanation has to do with the byte code compiler that Tcl uses internally,
and its effects are explained in more detail on page 15. For now, you should be
aware that these expressions are all valid and run a bit faster than the examples
shown above:
expr {7.2 / 4}
set len [expr {[string length foobar] + $x}]
set pi [expr {2*asin(1.0)}]
Backslash Substitution
The final type of substitution done by the Tcl interpreter is backslash substitu-
tion. This is used to quote characters that have special meaning to the inter-
preter. For example, you can specify a literal dollar sign, brace, or bracket by
quoting it with a backslash. As a rule, however, if you find yourself using lots of
backslashes, there is probably a simpler way to achieve the effect you are striv-
ing for. In particular, the list command described on page 61 will do quoting for
you automatically. In Example 1–8 backslash is used to get a literal $:
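A short sketch of quoting a dollar sign with a backslash. The variable names dollar and x are illustrative; the point is that no variable named foo is ever looked up.

```tcl
# \$ yields a literal dollar sign, so the value stored is the
# four characters $foo and no variable named foo is consulted.
set dollar \$foo
puts $dollar

# The value of dollar is substituted once and is not scanned again,
# so x also ends up holding the literal text $foo.
set x $dollar
puts $x
```

Both puts commands print $foo.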
There are two fine points to escaping newlines. First, if you are grouping an
argument as described in the next section, then you do not need to escape new-
lines; the newlines are automatically part of the group and do not terminate the
command. Second, a backslash as the last character in a line is converted into a
space, and all the white space at the beginning of the next line is replaced by this
substitution. In other words, the backslash-newline sequence also consumes all
the leading white space on the next line.
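Both fine points can be sketched as follows. The variable names are illustrative only.

```tcl
set one hello
set two world

# The backslash-newline and the indentation on the next line collapse
# into a single space, so the nested expr command continues normally.
set totalLength [expr [string length $one] + \
    [string length $two]]
puts $totalLength

# Inside a quoted group no backslash is needed: the newline is simply
# part of the value and does not terminate the set command.
set text "line one
line two"
puts [llength [split $text \n]]
```

The first puts prints 10, and the second prints 2 because the value contains one embedded newline.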
Grouping with Braces and Double Quotes
set s Hello
=> Hello
puts stdout "The length of $s is [string length $s]."
=> The length of Hello is 5.
puts stdout {The length of $s is [string length $s].}
=> The length of $s is [string length $s].
In the second command of Example 1–10, the Tcl interpreter does variable
and command substitution on the second argument to puts. In the third com-
mand, substitutions are prevented, so the string is printed as is.
In practice, grouping with curly braces is used when substitutions on the
argument must be delayed until a later time (or never done at all). Examples
include loops, conditional statements, and procedure declarations. Double quotes
are useful in simple cases like the puts command previously shown.
Another common use of quotes is with the format command. This is similar
to the C printf function. The first argument to format is a format specifier that
often includes special characters like newlines, tabs, and spaces. The easiest way
to specify these characters is with backslash sequences (e.g., \n for newline and
\t for tab). The backslashes must be substituted before the format command is
called, so you need to use quotes to group the format specifier.
puts [format "Item: %s\t%5.3f" $name $value]
Here format is used to align a name and a value with a tab. The %s and
%5.3f indicate how the remaining arguments to format are to be formatted. Note
that the trailing \n usually found in a C printf call is not needed because puts
provides one for us. For more information about the format command, see page
52.
set x 7; set y 9
puts stdout $x+$y=[expr $x + $y]
=> 7+9=16
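By the time the interpreter reaches the left bracket, it has already substituted
$x and $y to obtain:
7+9=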
When the left bracket is encountered, the interpreter calls itself recursively
to evaluate the nested command. Again, the $x and $y are substituted before
calling expr. Finally, the result of expr is substituted for everything from the left
bracket to the right bracket. The puts command gets the following as its second
argument:
7+9=16
Grouping before substitution.
The point of this example is that the grouping decision about puts’s second
argument is made before the command substitution is done. Even if the result of
the nested command contained spaces or other special characters, they would be
ignored for the purposes of grouping the arguments to the outer command.
Grouping and variable substitution interact the same as grouping and command
substitution. Spaces or special characters in variable values do not affect group-
ing decisions because these decisions are made before the variable values are
substituted.
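A small sketch of this rule, using a value that contains a space. The names s, copy, and joined are illustrative.

```tcl
set s "two words"

# $s is a single argument to set even though its value contains a
# space, because grouping was decided before the substitution.
set copy $s
puts [string length $copy]

# The same holds for a nested command whose result contains spaces:
# the whole result becomes the second argument to set.
set joined [format "%s and %s" a b]
puts $joined
```

If substitution came first, both set commands would see too many arguments and raise an error. Instead they print 9 and "a and b".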
If you want the output to look nicer in the example, with spaces around the
+ and =, then you must use double quotes to explicitly group the argument to
puts:
puts stdout "$x + $y = [expr $x + $y]"
The double quotes are used for grouping in this case to allow the variable and
command substitution on the argument to puts.
Procedures
Tcl uses the proc command to define procedures. Once defined, a Tcl procedure
is used just like any of the other built-in Tcl commands. The basic syntax to
define a procedure is:
proc name arglist body
The first argument is the name of the procedure being defined. The second
argument is a list of parameters to the procedure. The third argument is a com-
mand body that is one or more Tcl commands.
The procedure name is case sensitive, and in fact it can contain any charac-
ters. Procedure names and variable names do not conflict with each other. As a
convention, this book begins procedure names with uppercase letters and it
begins variable names with lowercase letters. Good programming style is impor-
tant as your Tcl scripts get larger. Tcl coding style is discussed in Chapter 12.
proc Diag {a b} {
set c [expr sqrt($a * $a + $b * $b)]
return $c
}
puts "The diagonal of a 3, 4 right triangle is [Diag 3 4]"
=> The diagonal of a 3, 4 right triangle is 5.0
The Diag procedure defined in the example computes the length of the diag-
onal side of a right triangle given the lengths of the other two sides. The sqrt
function is one of many math functions supported by the expr command. The
variable c is local to the procedure; it is defined only during execution of Diag.
Variable scope is discussed further in Chapter 7. It is not really necessary to use
the variable c in this example. The procedure can also be written as:
proc Diag {a b} {
return [expr sqrt($a * $a + $b * $b)]
}
The return command is used to return the result of the procedure. The
return command is optional in this example because the Tcl interpreter returns
the value of the last command in the body as the value of the procedure. So, the
procedure could be reduced to:
proc Diag {a b} {
expr sqrt($a * $a + $b * $b)
}
Note the stylized use of curly braces in the example. The curly brace at the
end of the first line starts the third argument to proc, which is the command
body. In this case, the Tcl interpreter sees the opening left brace, causing it to
ignore newline characters and scan the text until a matching right brace is
found. Double quotes have the same property. They group characters, including
newlines, until another double quote is found. The result of the grouping is that
the third argument to proc is a sequence of commands. When they are evaluated
later, the embedded newlines will terminate each command.
The other crucial effect of the curly braces around the procedure body is to
delay any substitutions in the body until the time the procedure is called. For
example, the variables a, b, and c are not defined until the procedure is called, so
we do not want to do variable substitution at the time Diag is defined.
The proc command supports additional features such as having variable
numbers of arguments and default values for arguments. These are described in
detail in Chapter 7.
A Factorial Example
To reinforce what we have learned so far, below is a longer example that uses a
while loop to compute the factorial function:
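A sketch of such a procedure, written to match the discussion that follows. The procedure name Factorial is an assumption; the variables i, product, and x are the ones the text refers to.

```tcl
proc Factorial {x} {
    # Two commands on one line, separated by a semicolon.
    set i 1; set product 1
    # Multiply product by every integer from 1 up to x.
    while {$i <= $x} {
        set product [expr {$product * $i}]
        incr i
    }
    return $product
}
puts [Factorial 10]
```

Calling Factorial 10 prints 3628800.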
The semicolon is used on the first line to remind you that it is a command
terminator just like the newline character. The while loop is used to multiply all
the numbers from one up to the value of x. The first argument to while is a bool-
ean expression, and its second argument is a command body to execute. The
while command and other control structures are described in Chapter 6.
The same math expression evaluator used by the expr command is used by
while to evaluate the boolean expression. There is no need to explicitly use the
expr command in the first argument to while, even if you have a much more
complex expression.
The loop body and the procedure body are grouped with curly braces in the
same way. The opening curly brace must be on the same line as proc and while.
If you like to put opening curly braces on the line after a while or if statement,
you must escape the newline with a backslash:
while {$i < $x} \
{
set product ...
}
Always group expressions and command bodies with curly braces.
Curly braces around the boolean expression are crucial because they delay
variable substitution until the while command implementation tests the expres-
sion. The following example is an infinite loop:
set i 1; while $i<=10 {incr i}
The loop will run indefinitely.* The reason is that the Tcl interpreter will
substitute for $i before while is called, so while gets a constant expression 1<=10
that will always be true. You can avoid these kinds of errors by adopting a consis-
tent coding style that groups expressions with curly braces:
set i 1; while {$i<=10} {incr i}
The incr command is used to increment the value of the loop variable i.
This is a handy command that saves us from the longer command:
set i [expr $i + 1]
The incr command can take an additional argument, a positive or negative
integer by which to change the value of the variable. Using this form, it is possi-
ble to eliminate the loop variable i and just modify the parameter x. The loop
body can be written like this:
while {$x > 1} {
set product [expr $product * $x]
incr x -1
}
Example 1–14 shows factorial again, this time using a recursive definition.
A recursive function is one that calls itself to complete its work. Each recursive
call decrements x by one, and when x is one, then the recursion stops.
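A sketch of the recursive form, assuming the same procedure name Factorial used for illustration elsewhere in this chapter.

```tcl
proc Factorial {x} {
    if {$x <= 1} {
        # Base case: the recursion stops here.
        return 1
    } else {
        # Recursive case: x times the factorial of x-1.
        return [expr {$x * [Factorial [expr {$x - 1}]]}]
    }
}
puts [Factorial 10]
```

The result is the same as the loop version, 3628800 for an argument of 10. The test uses <= rather than == so that an argument of zero also terminates.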
* Ironically, Tcl 8.0 introduced a byte-code compiler, and the first releases of Tcl 8.0 had a bug in the
compiler that caused this loop to terminate! This bug is fixed in the 8.0.5 patch release.
This is a somewhat tricky example. In the last command, $name gets substituted
with var. Then, the set command returns the value of var, which is the
string "the value of var". Nested set commands provide another way to achieve a
level of indirection. The last set command above can be written as follows:
set [set name]
=> the value of var
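The indirection described above can be sketched end to end. The variable names var and name come from the text; the sketch assumes var was given the string shown as the result.

```tcl
set var {the value of var}
set name var

# $name is replaced by var, so this runs "set var",
# which returns the value of var.
puts [set $name]

# The nested form: [set name] returns var, and the outer set reads it.
puts [set [set name]]
```

Both puts commands print the same line, "the value of var".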
Using a variable to store the name of another variable may seem overly
complex. However, there are some times when it is very useful. There is even a
special command, upvar, that makes this sort of trick easier. The upvar command
is described in detail in Chapter 7.
The unset Command
You can delete a variable with the unset command:
unset varName varName2 ...
Any number of variable names can be passed to the unset command. How-
ever, unset will raise an error if a variable is not already defined.
Example 7–6 on page 86 implements a new version of incr which handles this
case.
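One way to avoid the error is to test for the variable first. The sketch below uses the built-in info exists command, and shows catch as an alternative. The variable name counter is illustrative.

```tcl
set counter 1

# info exists returns 1 if the variable is defined and 0 otherwise,
# so unset is attempted only when it cannot fail.
if {[info exists counter]} {
    unset counter
}
puts [info exists counter]

# Alternatively, catch traps the error raised for an undefined variable.
if {[catch {unset counter} msg]} {
    puts "unset failed: $msg"
}
```

The first puts prints 0 because counter is gone. The second unset then fails, and catch reports the error message instead of aborting the script.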
expr 1 / 3
=> 0
expr 1 / 3.0
=> 0.333333333333
set tcl_precision 17
=> 17
expr 1 / 3.0
# The trailing 1 is the IEEE rounding digit
=> 0.33333333333333331
Comments
Tcl uses the pound character, #, for comments. Unlike in many other languages,
the # must occur at the beginning of a command. A # that occurs elsewhere is not
treated specially. An easy trick to append a comment to the end of a command is
to precede the # with a semicolon to terminate the previous command:
# Here are some parameters
set rate 7.0 ;# The interest rate
set months 60 ;# The loan term
One subtle effect to watch for is that a backslash effectively continues a
comment line onto the next line of the script. In addition, a semicolon inside a
comment is not significant. Only a newline terminates comments:
# Here is the start of a Tcl comment \
and some more of it; still in the comment
The behavior of a backslash in comments is pretty obscure, but it can be
exploited as shown in Example 2–3 on page 27.
A surprising property of Tcl comments is that curly braces inside comments
are still counted for the purposes of finding matching braces. I think the
motivation for this mis-feature was to keep the original Tcl parser simpler.
However, it means that the following will not work as expected to comment out
an alternate version of an if expression:
# if {boolean expression1} {
if {boolean expression2} {
some commands
}
The previous sequence results in an extra left curly brace, and probably a
complaint about a missing close brace at the end of your script! A technique I use
to comment out large chunks of code is to put the code inside an if block that
will never execute:
if {0} {
unused code here
}
Fine Points
• A common error is to forget a space between arguments when grouping with
braces or quotes. This is because white space is used as the separator, while
the braces or quotes only provide grouping. If you forget the space, you will
get syntax errors about unexpected characters after the closing brace or
quote. The following is an error because of the missing space between } and
{:
if {$x > 1}{puts "x = $x"}
• A double quote is only used for grouping when it comes after white space.
This means you can include a double quote in the middle of a group without
quoting it with a backslash. This requires that curly braces or white space
delimit the group. I do not recommend using this obscure feature, but this
is what it looks like:
set silly a"b
• When double quotes are used for grouping, the special effect of curly braces
is turned off. Substitutions occur everywhere inside a group formed with
double quotes. In the next command, the variables are still substituted:
set x xvalue
set y "foo {$x} bar"
=> foo {xvalue} bar
• When double quotes are used for grouping and a nested command is encoun-
tered, the nested command can use double quotes for grouping, too.
puts "results [format "%f %f" $x $y]"
• Spaces are not required around the square brackets used for command sub-
stitution. For the purposes of grouping, the interpreter considers everything
between the square brackets as part of the current group. The following
sets x to the concatenation of two command results because there is no
space between ] and [.
set x [cmd1][cmd2]
• Newlines and semicolons are ignored when grouping with braces or double
quotes. They get included in the group of characters just like all the others.
The following sets x to a string that contains newlines:
set x "This is line one.
This is line two.
This is line three."
• During command substitution, newlines and semicolons are significant as
command terminators. If you have a long command that is nested in square
brackets, put a backslash before the newline if you want to continue the
command on another line. This was illustrated in Example 1–9 on page 8.
• A dollar sign followed by something other than a letter, digit, underscore, or
left parenthesis is treated as a literal dollar sign. The following sets x to the
single character $.
set x $
Reference
Backslash Sequences
Arithmetic Operators
Built-in Math Functions
acos(x) Arccosine of x.
asin(x) Arcsine of x.
atan(x) Arctangent of x.
atan2(y,x) Rectangular (x,y) to polar (r,th). atan2 gives th.
ceil(x) Least integral value greater than or equal to x.
cos(x) Cosine of x.
cosh(x) Hyperbolic cosine of x.
exp(x) Exponential, e^x.
floor(x) Greatest integral value less than or equal to x.
fmod(x,y) Floating point remainder of x/y.
hypot(x,y) Returns sqrt(x*x + y*y). r part of polar coordinates.
log(x) Natural log of x.
log10(x) Log base 10 of x.
pow(x,y) x to the y power, x^y.
sin(x) Sine of x.
sinh(x) Hyperbolic sine of x.
sqrt(x) Square root of x.
tan(x) Tangent of x.
tanh(x) Hyperbolic tangent of x.
abs(x) Absolute value of x.
double(x) Promote x to floating point.
int(x) Truncate x to an integer.
round(x) Round x to an integer.
rand() Return a random floating point value between 0.0 and 1.0.
srand(x) Set the seed for the random number generator to the integer x.
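A few of these functions in use, as a quick sketch. The expressions are grouped with curly braces, and the variable names r and th are illustrative.

```tcl
puts [expr {hypot(3, 4)}]    ;# 5.0
puts [expr {sqrt(16)}]       ;# 4.0
puts [expr {int(7.9)}]       ;# 7
puts [expr {round(7.5)}]     ;# 8
puts [expr {fmod(7.5, 2)}]   ;# 1.5

# Rectangular to polar: hypot gives the radius and atan2 the angle
# in radians.
set r  [expr {hypot(3, 4)}]
set th [expr {atan2(4, 3)}]
puts "r = $r, th = $th"
```

Note that int truncates toward zero while round goes to the nearest integer, and that hypot and sqrt return floating point values even for integer arguments.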
Table 1–4 Built-in Tcl commands. (Continued)
C H A P T E R
Getting Started 2
This chapter explains how to run Tcl and Tk on different operating system
platforms: UNIX, Windows, and Macintosh. Tcl commands discussed
are: source, console and info.
#!/usr/local/bin/tclsh
puts stdout {Hello, World!}
#!/usr/local/bin/wish
button .hello -text Hello -command {puts "Hello, World!"}
pack .hello -padx 10 -pady 10
The actual pathnames for tclsh and wish may be different on your system.
If you type the pathname for the interpreter wrong, you receive a confusing
“command not found” error. You can find out the complete pathname of the Tcl
interpreter with the info nameofexecutable command. This is what appears on
my system:
info nameofexecutable
=> /home/welch/install/solaris/bin/tclsh8.2
Watch out for long pathnames.
On most UNIX systems, this special first line is limited to 32 characters,
including the #!. If the pathname is too long, you may end up with /bin/sh try-
ing to interpret your script, giving you syntax errors. You might try using a sym-
bolic link from a short name to the true, long name of the interpreter. However,
watch out for systems like Solaris in which the script interpreter cannot be a
symbolic link. Fortunately, Solaris doesn’t impose a 32-character limit on the
pathname, so you can just use a long pathname.
The next example shows a trick that works around the pathname length
limitation in all cases. The trick comes from a posting to comp.lang.tcl by
Kevin Kenny. It takes advantage of a difference between comments in Tcl and
the Bourne shell. Tcl comments are described on page 16. In the example, the
Bourne shell command that runs the Tcl interpreter is hidden in a comment as
far as Tcl is concerned, but it is visible to /bin/sh:
#!/bin/sh
# The backslash makes the next line a comment in Tcl \
exec /some/very/long/path/to/wish "$0" ${1+"$@"}
# ... Tcl script goes here ...
You do not even have to know the complete pathname of tclsh or wish to use
this trick. You can just do the following:
#!/bin/sh
# Run wish from the user's PATH \
exec wish -f "$0" ${1+"$@"}
Command-Line Arguments
If you run a script from the command line, for example from a UNIX shell, you
can pass the script command-line arguments. You can also specify these argu-
ments in the shortcut command in Windows. For example, under UNIX you can
type this at a shell:
% myscript.tcl arg1 arg2 arg3
In Windows, you can have a shortcut that runs wish on your script and also
passes additional arguments:
"c:\Program Files\TCL82\wish.exe" c:\your\script.tcl arg1
The Tcl shells pass the command-line arguments to the script as the value
of the argv variable. The number of command-line arguments is given by the
argc variable. The name of the program, or script, is not part of argv nor is it
counted by argc. Instead, it is put into the argv0 variable. Table 2–2 lists all the
predefined variables in the Tcl shells. argv is a list, so you can use the lindex
command, which is described on page 59, to extract items from it:
set arg1 [lindex $argv 0]
The following script prints its arguments (foreach is described on page 73):
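A sketch of such a script. It assumes it is run by tclsh or wish from the command line, which is what defines argv0, argc, and argv.

```tcl
# Echo the command-line arguments.
puts "The command name is \"$argv0\""
puts "There are $argc arguments: $argv"

# Print the arguments one per line.
foreach arg $argv {
    puts $arg
}
```

If the script is saved as myscript.tcl and run with the arguments arg1 arg2 arg3, argc is 3 and the loop prints each of the three arguments on its own line.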
Predefined Variables
C H A P T E R
The Guestbook CGI Application 3
This chapter presents a simple Tcl program that computes a Web page. The
chapter provides a brief background to HTML and the CGI interface to
Web servers.
The Tcl scripts described in this chapter use commands and techniques that
are described in more detail in later chapters. The goal of the examples is to dem-
onstrate the power of Tcl without explaining every detail. If the examples in this
chapter raise questions, you can follow the references to examples in other chap-
ters that do go into more depth.
Table 3–1 HTML tags used in the examples. (Continued)
H1 - H6 HTML defines 6 heading levels: H1, H2, H3, H4, H5, H6.
P Start a new paragraph.
BR One blank line.
B Bold text.
I Italic text.
A Used for hypertext links.
IMG Specify an image.
DL Definition list.
DT Term clause in a definition list.
DD Definition clause in a definition list.
UL An unordered list.
LI A bulleted item within a list.
TABLE Create a table.
TR A table row.
TD A cell within a table row.
FORM Defines a data entry form.
INPUT A one-line entry field, checkbox, radio button, or submit button.
TEXTAREA A multiline text field.
The program computes a simple HTML page that has the current time.
Each time a user visits the page they will see the current time on the server. The
server that has the CGI program and the user viewing the page might be on dif-
ferent sides of the planet. The output of the program starts with a Content-Type
line that tells your Web browser what kind of data comes next. This is followed
by a blank line and then the contents of the page.
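A minimal sketch of a program with this shape. The page title and the wording of the message are assumptions; clock seconds and clock format are the standard commands for getting and formatting the time.

```tcl
# The CGI header: a Content-Type line followed by a blank line.
puts "Content-Type: text/html"
puts ""

# The page itself. The double quotes allow the nested clock
# commands to be executed and their result mixed into the HTML.
puts "<TITLE>The Current Time</TITLE>"
puts "The time is <B>[clock format [clock seconds]]</B>"
```

The inner clock seconds returns the current time as an integer, and the outer clock format turns that integer into a readable date string.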
The clock command is used twice: once to get the current time in seconds,
and a second time to format the time into a nice looking string. The clock com-
mand is described in detail on page 173. Fortunately, there is no conflict between
the markup syntax used by HTML and the Tcl syntax for embedded commands,
so we can mix the two in the argument to the puts command. Double quotes are
used to group the argument to puts so that the clock commands will be exe-
cuted. When run, the output of the program will look like this:
Content-Type: text/html
This example is a bit sloppy in its use of HTML, but it should display prop-
erly in most Web browsers. Example 3–3 includes all the required tags for a
proper HTML document.
Example 3–3 The guestbook.cgi script.
#!/bin/sh
# guestbook.cgi
# Implement a simple guestbook page.
# The set of visitors is kept in a simple database.
# The newguest.cgi script will update the database.
# \
exec tclsh "$0" ${1+"$@"}
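# Print the CGI header and the opening HTML boilerplate.
# bodyparams is optional; the empty default shown here is assumed.
proc Cgi_Header {title {bodyparams {}}} {
puts stdout \
"Content-Type: text/html

<HTML>
<HEAD>
<TITLE>$title</TITLE>
</HEAD>
<BODY $bodyparams>
<H1>$title</H1>"
}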
The Cgi_Header procedure takes as arguments the title for the page and
some optional parameters for the HTML <BODY> tag. The guestbook.cgi script
specifies black text on a white background to avoid the standard gray back-
ground of most browsers. The procedure definition uses the syntax for an
optional parameter, so you do not have to pass bodyparams to Cgi_Header.
Default values for procedure parameters are described on page 81.
The Cgi_Header procedure just contains a single puts command that gener-
ates the standard boilerplate that appears at the beginning of the output. Note
that several lines are grouped together with double quotes. Double quotes are
used so that the variable references mixed into the HTML are substituted prop-
erly.
The output begins with the CGI content-type information, a blank line, and
then the HTML. The HTML is divided into a head and a body part. The <TITLE>
tag goes in the head section of an HTML document. Finally, browsers display the
title in a different place than the rest of the page, so I always want to repeat the
title as a level-one heading (i.e., H1) in the body of the page.
if {![file exists $datafile]} {
If the database file does not exist, a different page is displayed to encourage
a registration. The page includes a hypertext link to a registration page. The
newguest.html page will be described in more detail later:
The P command generates the HTML for a paragraph break. This trivial
procedure saves us a few keystrokes:
proc P {} {
puts <P>
}
The Link command formats and returns the HTML for a hypertext link.
Instead of printing the HTML directly, it is returned, so you can include it in-line
with other text you are printing:
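A sketch of what such a procedure might look like. The parameter names text and url are assumptions; the argument order follows the call Link Register newguest.html used later in the script.

```tcl
proc Link {text url} {
    # Backslashes quote the double quotes that belong to the HTML,
    # so they do not end the group.
    return "<A HREF=\"$url\">$text</A>"
}

# Because Link returns its result, it can be nested inside other text.
puts "Be the first [Link {registered guest!} newguest.html]"
```

This prints a line containing the anchor tag for newguest.html with "registered guest!" as the link text.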
Content-Type: text/html
<HTML>
<HEAD>
<TITLE>Brent’s Guestbook</TITLE>
</HEAD>
<BODY BGCOLOR=white TEXT=black>
<H1>Brent’s Guestbook</H1>
<P>
No registered guests.
<P>
Be the first <A HREF="newguest.html">registered guest!</A>
</BODY>
</HTML>
If the database file exists, then the real work begins. We first generate a
link to the registration page, and a level-two header to separate that from the
guest list:
puts [Link Register newguest.html]
H2 Guests
The H2 procedure handles the detail of including the matching close tag:
proc H2 {string} {
puts "<H2>$string</H2>"
}
set markup [lindex $item 1]
We generate the HTML for the guestbook entry as a level-three header that
contains a hypertext link to the guest’s home page. We follow the link with any
HTML markup text that the guest has supplied to embellish his or her entry.
The H3 procedure is similar to the H2 procedure already shown, except it gener-
ates <H3> tags:
H3 [Link $name $homepage]
puts $markup
Sample Output
The last thing the script does is call Cgi_End to output the proper closing
tags. Example 3–7 shows the output of the guestbook.cgi script:
Content-Type: text/html
<HTML>
<HEAD>
<TITLE>Brent’s Guestbook</TITLE>
</HEAD>
<BODY BGCOLOR=white TEXT=black>
<H1>Brent’s Guestbook</H1>
<P>
The following folks have registered in my guestbook.
<P>
<A HREF="newguest.html">Register</A>
<H2>Guests</H2>
<H3><A HREF="http://www.beedub.com/">Brent Welch</A></H3>
<IMG SRC="http://www.beedub.com/welch.gif">
</BODY>
</HTML>
<H1>Register in my Guestbook</H1>
<UL>
<LI>Name <INPUT TYPE="text" NAME="name" SIZE="40">
<LI>URL <INPUT TYPE="text" NAME="url" SIZE="40">
<P>
If you don't have a home page, you can use an email URL like
"mailto:welch@acm.org"
<LI>Additional HTML to include after your link:
<BR>
guestbook">
<LI><INPUT TYPE="submit" NAME="update" VALUE="Update my
guestbook entry">
</UL>
</FORM>
</BODY>
</HTML>
#!/bin/sh
# \
exec tclsh "$0" ${1+"$@"}
# source cgilib.tcl from the same directory as newguest.cgi
Cgi_Parse
puts "
<DL>
<DT>Name
<DD>[Cgi_Value name]
<DT>URL
<DD>[Link [Cgi_Value url] [Cgi_Value url]]
</DL>
[Cgi_Value html]
"
Cgi_End
The main idea of the newguest.cgi script is that it saves the data to a file
as a Tcl command that defines an element of the Guestbook array. This lets the
guestbook.cgi script simply load the data by using the Tcl source command.
This trick of storing data as a Tcl script saves us from the chore of defining a new
file format and writing code to parse it. Instead, we can rely on the well-tuned
Tcl implementation to do the hard work for us efficiently.
The script opens the datafile in append mode so that it can add a new
record to the end. Opening files is described in detail on page 110. The script
uses a catch command to guard against errors. If an error occurs, a page explain-
ing the error is returned to the user. Working with files is one of the most com-
mon sources of errors (permission denied, disk full, file-not-found, and so on), so I
always open the file inside a catch statement:
if [catch {open $datafile a} out] {
# an error occurred
} else {
# open was ok
}
In this command, the variable out gets the result of the open command,
which is either a file descriptor or an error message. This style of using catch is
described in detail in Example 6–14 on page 77.
The script writes the data as a Tcl set command. The list command is
used to format the data properly:
puts $out [list set Guestbook([Cgi_Value name]) \
[list [Cgi_Value url] [Cgi_Value html]]]
There are two lists. First the url and html values are formatted into one
list. This list will be the value of the array element. Then, the whole Tcl com-
mand is formed as a list. In simplified form, the command is generated from this:
list set variable value
Using the list command ensures that the result will always be a valid Tcl
command that sets the variable to the given value. The list command is
described in more detail on page 61.
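To see why list matters here, consider a guest name that contains spaces and square brackets. The following is only a sketch: the values are made up to stand in for the Cgi_Value results, and the generated command is evaluated directly with eval instead of being written to a file and loaded with source:

```tcl
# Hypothetical form values standing in for the Cgi_Value results.
set name {Brent [the author] Welch}
set url  http://www.beedub.com/
set html {<IMG SRC="welch.gif">}

# Build the command the same way newguest.cgi does.
set cmd [list set Guestbook($name) [list $url $html]]

# Evaluating the command defines the array element. Because list
# quoted the values, no substitutions are performed on them.
eval $cmd

set item $Guestbook($name)
puts [lindex $item 0]
puts [lindex $item 1]
```

The brackets and quotes in the data come back unchanged, which is what lets guestbook.cgi trust the data file as a script.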
Next Steps
There are a number of details that can be added to this example. A user may
want to update their entry, for example. They could do that now, but they would
have to retype everything. They might also like a chance to check the results of
their registration and make changes before committing them. This requires
another page that displays their guest entry as it would appear on a page, and
also has the fields that let them update the data.
The details of how a CGI script is hooked up with a Web server vary from
server to server. You should ask your local Webmaster for help if you want to try
this out on your local Web site. The Tcl Web Server comes with this guestbook
example already set up, plus it has a number of other very interesting ways to
generate pages. My own taste in Web page generation has shifted from CGI to a
template-based approach supported by the Tcl Web Server. This is the topic of
Chapter 18.
The next few chapters describe basic Tcl commands and data structures.
We return to the CGI example in Chapter 11 on regular expressions.
C H A P T E R
String Processing in Tcl 4
This chapter describes string manipulation and simple pattern matching. Tcl
commands described are: string, append, format, scan, and
binary. The string command is a collection of several useful string
manipulation operations.
This trick of feeding a Tcl command bad arguments to find out its usage is
common across many commands. Table 4–1 summarizes the string command.
Table 4–1 The string command.
string bytelength str
    Returns the number of bytes used to store a string, which may be
    different from the character length returned by string length
    because of UTF-8 encoding. See page 210 of Chapter 15 about
    Unicode and UTF-8.
string compare ?-nocase? ?-length len? str1 str2
    Compares strings lexicographically. Use -nocase for case
    insensitive comparison. Use -length to limit the comparison to
    the first len characters. Returns 0 if equal, -1 if str1 sorts
    before str2, else 1.
string equal ?-nocase? str1 str2
    Compares strings and returns 1 if they are the same. Use -nocase
    for case insensitive comparison.
string first str1 str2
    Returns the index in str2 of the first occurrence of str1, or -1
    if str1 is not found.
string index string index
    Returns the character at the specified index. An index counts
    from zero. Use end for the last character.
string is class ?-strict? ?-failindex varname? string
    Returns 1 if string belongs to class. If -strict, then empty
    strings never match, otherwise they always match. If -failindex
    is specified, then varname is assigned the index of the character
    in string that prevented it from being a member of class. See
    Table 4–3 on page 50 for character class names.
string last str1 str2
    Returns the index in str2 of the last occurrence of str1, or -1
    if str1 is not found.
string length string
    Returns the number of characters in string.
string map ?-nocase? charMap string
    Returns a new string created by mapping characters in string
    according to the input, output list in charMap. See page 51.
string match pattern str
    Returns 1 if str matches the pattern, else 0. Glob-style matching
    is used. See page 48.
string range str i j
    Returns the range of characters in str from i to j.
string repeat str count
    Returns str repeated count times.
string replace str first last ?newstr?
    Returns a new string created by replacing characters first
    through last with newstr, or nothing.
string tolower string ?first? ?last?
    Returns string in lower case. first and last determine the range
    of string on which to operate.
string totitle string ?first? ?last?
    Capitalizes string by replacing its first character with the
    Unicode title case, or upper case, and the rest with lower case.
    first and last determine the range of string on which to operate.
string toupper string ?first? ?last?
    Returns string in upper case. first and last determine the range
    of string on which to operate.
string trim string ?chars?
    Trims the characters in chars from both ends of string. chars
    defaults to whitespace.
string trimleft string ?chars?
    Trims the characters in chars from the beginning of string. chars
    defaults to whitespace.
string trimright string ?chars?
    Trims the characters in chars from the end of string. chars
    defaults to whitespace.
string wordend str ix
    Returns the index in str of the character after the word
    containing the character at index ix.
string wordstart str ix
    Returns the index in str of the first character in the word
    containing the character at index ix.
String Indices
Several of the string operations involve string indices that are positions
within a string. Tcl counts characters in strings starting with zero. The special
index end is used to specify the last character in a string:
string range abcd 2 end
=> cd
Tcl 8.1 added syntax for specifying an index relative to the end. Specify
end-N to get the Nth character before the end. For example, the following command
returns a new string that drops the first and last characters from the original:
string range $string 1 end-1
There are several operations that pick apart strings: first, last,
wordstart, wordend, index, and range. If you find yourself using combinations of
these operations to pick apart data, it will be faster if you can do it with the reg-
ular expression pattern matcher described in Chapter 11.
The string equal command added in Tcl 8.1 makes comparing two strings for
equality simpler than testing the result of string compare against zero.
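As a small illustration of the difference, the following sketch compares two made-up values both ways. The variable names are arbitrary:

```tcl
set a "Hello"
set b "hello"

# string compare returns 0 when the strings match, so the result
# has to be tested against zero.
if {[string compare $a $b] == 0} {
    puts "compare: same"
} else {
    puts "compare: different"
}

# string equal returns 1 when the strings match, so it can be used
# directly as a boolean.
if {[string equal -nocase $a $b]} {
    puts "equal -nocase: same"
}
```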
String Matching
The string match command implements glob-style pattern matching that
is modeled after the file name pattern matching done by various UNIX shells.
The heritage of the word "glob" is rooted in UNIX, and Tcl preserves this histori-
cal oddity in the glob command that does pattern matching on file names. The
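glob command is described on page 115. Table 4–2 shows the three constructs
used in string match patterns:
*        Match any number of any characters.
?        Match any single character.
[chars]  Match any one of chars.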
Any other characters in a pattern are taken as literals that must match the
input exactly. The following example matches all strings that begin with a:
string match a* alpha
=> 1
To match all two-letter strings:
string match ?? XY
=> 1
To match all strings that begin with either a or b:
string match {[ab]*} cello
=> 0
Be careful! Square brackets are also special to the Tcl interpreter, so you
will need to wrap the pattern up in curly braces to prevent it from being inter-
preted as a nested command. Another approach is to put the pattern into a vari-
able:
set pat {[ab]*x}
string match $pat box
=> 1
You can specify a range of characters with the syntax [x-y]. For example,
[a-z] represents the set of all lower-case letters, and [0-9] represents all the
digits. You can include more than one range in a set. Any letter, digit, or the
underscore is matched with:
string match {[a-zA-Z0-9_]} $char
The set matches only a single character. To match more complicated patterns,
such as one or more characters from a set, you need to use regular
expression matching, which is described on page 148.
If you need to include a literal *, ?, or bracket in your pattern, preface it
with a backslash:
string match {*\?} what?
=> 1
In this case the pattern is quoted with curly braces because the Tcl inter-
preter is also doing backslash substitutions. Without the braces, you would have
to use two backslashes. They are replaced with a single backslash by Tcl before
string match is called.
string match *\\? what?
Character Classes
The string is command tests a string to see whether it belongs to a partic-
ular class. This is useful for input validation. For example, to make sure some-
thing is a number, you do:
if {![string is integer $input]} {
error "Invalid input. Please enter a number."
}
Classes are defined in terms of the Unicode character set, which means
they are more general than specifying character sets with ranges over the ASCII
encoding. For example, alpha includes many characters outside the range of [A-
Za-z] because of different characters in other alphabets. The classes are listed in
Table 4–3.
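The -strict and -failindex options from Table 4–1 are handy when you want to tell the user what was wrong with the input. Here is a minimal sketch. CheckInteger is a hypothetical helper name, not a Tcl command:

```tcl
# Empty strings match every class unless -strict is given.
puts [string is integer ""]            ;# 1
puts [string is integer -strict ""]    ;# 0

# CheckInteger is a hypothetical helper, not part of Tcl.
proc CheckInteger {input} {
    if {[string is integer -strict -failindex ix $input]} {
        return "ok"
    }
    # Handle the empty string before looking at the fail index.
    if {[string length $input] == 0} {
        return "empty input"
    }
    return "not an integer (problem at index $ix)"
}
puts [CheckInteger 1234]
puts [CheckInteger 12x4]
puts [CheckInteger ""]
```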
Mapping Strings
The string map command translates a string based on a character map.
The map is in the form of an input, output list. Wherever a string contains an
input sequence, that is replaced with the corresponding output. For example:
string map {f p d l} food
=> pool
The inputs and outputs can be more than one character and do not have to
be the same length:
string map {f p d ll oo u} food
=> pull
Example 4–3 is more practical. It uses string map to replace fancy quotes
and hyphens produced by Microsoft Word with ASCII equivalents. It uses the
open, read, and close file operations that are described in Chapter 9, and the
fconfigure command described on page 223 to ensure that the file format is
UNIX friendly.
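The mapping step on its own can be sketched as follows. This is not the book's example: it leaves out the file I/O, and it assumes the fancy characters arrive as Unicode code points (\u2018 and so on), which may not match how your files are encoded. Asciify and charMap are names made up for the illustration:

```tcl
# Map Unicode "smart" punctuation to ASCII equivalents. The list
# command is used so that Tcl substitutes the \u escapes.
set charMap [list \u2018 ' \u2019 ' \u201c \" \u201d \" \u2013 - \u2014 --]

# Asciify is a hypothetical helper name.
proc Asciify {text} {
    global charMap
    return [string map $charMap $text]
}

puts [Asciify "\u201cIt\u2019s done\u201d \u2014 finally"]
```

Each input in the map is a single character here, but the output for the long dash is two characters, which string map allows.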
• position specifier
• flags
• field width
• precision
• word length
• conversion character
These components are explained by a series of examples. The examples use
double quotes around the format specification. This is because often the format
contains white space, so grouping is required, as well as backslash substitutions
like \t or \n, and the quotes allow substitution of these special characters. Table
4–4 lists the conversion characters:
Table 4–4 Format conversions.
d       Signed integer.
u       Unsigned integer.
i       Signed integer. The argument may be in hex (0x) or octal (0) format.
o       Unsigned octal.
x or X  Unsigned hexadecimal. ‘x’ gives lowercase results.
c       Map from an integer to the ASCII character it represents.
s       A string.
f       Floating point number in the format a.b.
A position specifier is i$, which means take the value from argument i as
opposed to the normally corresponding argument. The position counts from 1. If
a position is specified for one format keyword, the position must be used for all of
them. If you group the format specification with double quotes, you need to quote
the $ with a backslash:
set lang 2
format "%${lang}\$s" one un uno
=> un
The position specifier is useful for picking a string from a set, such as this
simple language-specific example. The message catalog facility described in
Chapter 15 is a much more sophisticated way to solve this problem. The position
is also useful if the same value is repeated in the formatted string.
The flags in a format are used to specify padding and justification. In the
following examples, the # causes a leading 0x to be printed in the hexadecimal
value. The zero in 08 causes the field to be padded with zeros. Table 4–5 summa-
rizes the format flag characters.
format "%#x" 20
=> 0x14
format "%#08x" 10
=> 0x00000a
After the flags you can specify a minimum field width value. The value is
padded to this width with spaces, or with zeros if the 0 flag is used:
format "%-20s %3d" Label 2
=> Label                  2
You can compute a field width and pass it to format as one of the arguments
by using * as the field width specifier. In this case the next argument is used as
the field width instead of the value, and the argument after that is the value that
gets formatted.
set maxl 8
format "%-*s = %s" $maxl Key Value
=> Key      = Value
The precision comes next, and it is specified with a period and a number.
For %f and %e it indicates how many digits come after the decimal point. For %g it
indicates the total number of significant digits used. For %d and %x it indicates
how many digits will be printed, padding with zeros if necessary.
format "%6.2f %6.2d" 1 1
=>   1.00     01
The word length part comes last, but it is rarely useful because Tcl maintains
all floating point values in double precision and all integers as long words.
This section describes the binary command that provides conversions
between strings and packed binary data representations. The binary format
command takes values and packs them according to a template. For example,
this can be used to format a floating point vector in memory suitable for passing
to Fortran. The resulting binary value is returned:
binary format template value ?value ...?
The binary scan command extracts values from a binary string according
to a similar template. For example, this is useful for extracting data stored in
binary format. It assigns values to a set of Tcl variables:
binary scan value template variable ?variable ...?
Format Templates
The template consists of type keys and counts. The types are summarized
in Table 4–6. In the table, count is the optional count following the type letter.
The count is interpreted differently depending on the type. For types like
integer (i) and double (d), the count is a repetition count (e.g., i3 means three
integers).
Examples
When you experiment with binary format and binary scan, remember that
Tcl treats things as strings by default. A "6", for example, is the character 6 with
character code 54 or 0x36. The c type returns these character codes:
set input 6
binary scan $input "c" 6val
set 6val
=> 54
You can scan several character codes at a time:
binary scan abc "c3" list
=> 1
set list
=> 97 98 99
The previous example uses a single type key, so binary scan sets one corre-
sponding Tcl variable. If you want each character code in a separate variable, use
separate type keys:
binary scan abc "ccc" x y z
=> 3
set z
=> 99
Use the H format to get hexadecimal values:
binary scan 6 "H2" 6val
set 6val
=> 36
Use the a and A formats to extract fixed width fields. Here the * count is
used to get all the rest of the string. Note that A trims trailing spaces:
binary scan "hello world " a3x2A* first second
puts "\"$first\" \"$second\""
=> "hel" " world"
Use the @ key to seek to a particular offset in a value. The following com-
mand gets the second double-precision number from a vector. Assume the vector
is read from a binary data file:
binary scan $vector "@8d" double
With binary format, the a and A types create fixed width fields. A pads its
field with spaces, if necessary. The value is truncated if the string is too long:
binary format "A9A3" hello world
=> hello    wor
An array of floating point values can be created with this command:
binary format "f*" 1.2 3.45 7.43 -45.67 1.03e4
Remember that floating point values are always in native format, so you
have to read them on the same type of machine on which they were created. With
integer data you specify either big-endian or little-endian formats. The
tcl_platform variable described on page 182 can tell you the byte order of the
current platform.
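The difference between the two integer byte orders is easy to see if you pack the same value both ways and dump the bytes in hexadecimal. This sketch assumes the I (big-endian) and i (little-endian) 32-bit integer type keys:

```tcl
# Pack the 32-bit integer 1 in both byte orders.
set big    [binary format I 1]   ;# big-endian:    00 00 00 01
set little [binary format i 1]   ;# little-endian: 01 00 00 00

# Show the bytes as hexadecimal digits.
binary scan $big    H* bigHex
binary scan $little H* littleHex
puts "big-endian:    $bigHex"
puts "little-endian: $littleHex"

# Scanning with the matching type key recovers the value.
binary scan $big I value
puts "value: $value"

# The native byte order is bigEndian or littleEndian.
puts "native order: $tcl_platform(byteOrder)"
```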
Related Chapters
• To learn more about manipulating data in Tcl, read about lists in Chapter 5
and arrays in Chapter 8.
• For more about pattern matching, read about regular expressions in Chap-
ter 11.
• For more about file I/O, see Chapter 9.
• For information on Unicode and other internationalization issues, see
Chapter 15.
C H A P T E R
Tcl Lists 5
This chapter describes Tcl lists. Tcl commands described are: list, lindex,
llength, lrange, lappend, linsert, lreplace, lsearch, lsort,
concat, join, and split.
Tcl Lists
A Tcl list is a sequence of values. When you write out a list, it has the same syn-
tax as a Tcl command. A list has its elements separated by white space. Braces
or quotes can be used to group words with white space into a single list element.
Because of the relationship between lists and commands, the list-related com-
mands described in this chapter are used often when constructing Tcl com-
mands.
Big lists were often slow before Tcl 8.0.
Unlike list data structures in other languages, Tcl lists are just strings with
a special interpretation. The string representation must be parsed on each list
access, so be careful when you use large lists. A list with a few elements will not
slow down your code much. A list with hundreds or thousands of elements can be
very slow. If you find yourself maintaining large lists that must be frequently
accessed, consider changing your code to use arrays instead.
The performance of lists was improved by the Tcl compiler added in Tcl 8.0.
The compiler stores lists in an internal format that requires constant time to
access. Accessing the first element costs the same as accessing any other element
in the list. Before Tcl 8.0, the cost of accessing an element was proportional to
the number of elements before it in the list. The internal format also records the
number of list elements, so getting the length of a list is cheap. Before Tcl 8.0,
computing the length required reading the whole list.
Table 5–1 briefly describes the Tcl commands related to lists.
list arg1 arg2 ...
    Creates a list out of all its arguments.
lindex list i
    Returns the ith element from list.
llength list
    Returns the number of elements in list.
lrange list i j
    Returns the ith through jth elements from list.
lappend listVar arg arg ...
    Appends elements to the value of listVar.
linsert list index arg arg ...
    Inserts elements into list before the element at position index.
    Returns a new list.
lreplace list i j arg arg ...
    Replaces elements i through j of list with the args. Returns a
    new list.
lsearch ?mode? list value
    Returns the index of the element in list that matches the value
    according to the mode, which is -exact, -glob, or -regexp. -glob
    is the default. Returns -1 if not found.
lsort ?switches? list
    Sorts elements of the list according to the switches: -ascii,
    -integer, -real, -dictionary, -increasing, -decreasing,
    -index ix, -command command. Returns a new list.
concat list list ...
    Joins multiple lists together into one list.
join list joinString
    Merges the elements of a list together by separating them with
    joinString.
split string splitChars
    Splits a string up into list elements, using the characters in
    splitChars as boundaries between list elements.
Constructing Lists
Constructing a list can be tricky because you must maintain proper list syntax.
In simple cases, you can do this by hand. In more complex cases, however, you
should use Tcl commands that take care of quoting so that the syntax comes out
right.
set x {1 2}
=> 1 2
set y foo
=> foo
set l1 [list $x "a b" $y]
=> {1 2} {a b} foo
set l2 "\{$x\} {a b} $y"
=> {1 2} {a b} foo
lappend new 1 2
=> 1 2
lappend new 3 "4 5"
=> 1 2 3 {4 5}
set new
=> 1 2 3 {4 5}
set x {4 5 6}
set y {2 3}
set z 1
concat $z $y $x
=> 1 2 3 4 5 6
Double quotes behave much like the concat command. In simple cases, dou-
ble quotes behave exactly like concat. However, the concat command trims
extra white space from the end of its arguments before joining them together
with a single separating space character. Example 5–4 compares the use of list,
concat, and double quotes:
Example 5–4 Double quotes compared to the concat and list commands.
set x {1 2}
=> 1 2
set y "$x 3"
=> 1 2 3
set y [concat $x 3]
=> 1 2 3
set s { 2 }
=> 2
set y "1 $s 3"
=> 1  2  3
set y [concat 1 $s 3]
=> 1 2 3
set z [list $x $s 3]
=> {1 2} { 2 } 3
The distinction between list and concat becomes important when Tcl com-
mands are built dynamically. The basic rule is that list and lappend preserve
list structure, while concat (or double quotes) eliminates one level of list struc-
ture. The distinction can be subtle because there are examples where list and
concat return the same results. Unfortunately, this can lead to data-dependent
bugs. Throughout the examples of this book, you will see the list command used
to safely construct lists. This issue is discussed more in Chapter 10.
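Here is a small sketch of such a data-dependent bug. Show is a hypothetical procedure that reports how many arguments it received. The two ways of building the command agree until the value contains a space:

```tcl
# Show is a hypothetical procedure used only for illustration.
proc Show {args} {
    return "[llength $args] argument(s): $args"
}

set label "hello"
puts [eval [concat Show $label]]   ;# 1 argument(s): hello
puts [eval [list Show $label]]     ;# 1 argument(s): hello

set label "hello world"
puts [eval [concat Show $label]]   ;# 2 argument(s): hello world
puts [eval [list Show $label]]     ;# 1 argument(s): {hello world}
```

With concat, the value is split into two arguments as soon as it contains white space. With list, it stays a single argument no matter what it contains.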
Sorting Lists: lsort
You can sort a list in a variety of ways with lsort. The list is not sorted in place.
Instead, a new list value is returned. The basic types of sorts are specified with
the -ascii, -dictionary, -integer, or -real options. The -increasing or
-decreasing option indicates the sorting order. The default option set is -ascii
-increasing. An ASCII sort uses character codes, and a dictionary sort folds
together case and treats digits like numbers. For example:
lsort -ascii {a Z n2 n100}
=> Z a n100 n2
lsort -dictionary {a Z n2 n100}
=> a n2 n100 Z
You can provide your own sorting function for special-purpose sorting. For
example, suppose you have a list of names, where each element is itself a list con-
taining the person’s first name, middle name (if any), and last name. The default
sorts by everyone’s first name. If you want to sort by their last name, you need to
supply a sorting command.
proc NameCompare {a b} {
set alast [lindex $a end]
set blast [lindex $b end]
set res [string compare $alast $blast]
if {$res != 0} {
return $res
} else {
return [string compare $a $b]
}
}
set list {{Brent B. Welch} {John Ousterhout} {Miles Davis}}
=> {Brent B. Welch} {John Ousterhout} {Miles Davis}
lsort -command NameCompare $list
=> {Miles Davis} {John Ousterhout} {Brent B. Welch}
The NameCompare procedure extracts the last element from each of its argu-
ments and compares those. If they are equal, then it just compares the whole of
each argument.
Tcl 8.0 added a -index option to lsort that can be used to sort lists on an
index. Instead of using NameCompare, you could do this:
lsort -index end $list
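The -index option combines with the other switches. As a sketch, suppose each element is a made-up name and score pair and you want to sort numerically on the score:

```tcl
# Hypothetical data: each element is {name score}.
set scores {{alice 42} {bob 7} {carol 100}}

# Sort on the second element of each sublist, as integers.
puts [lsort -integer -index 1 $scores]
# => {bob 7} {alice 42} {carol 100}

puts [lsort -integer -decreasing -index 1 $scores]
# => {carol 100} {alice 42} {bob 7}
```

Without -integer the scores would be compared as strings, and 100 would sort before 42.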
Example 5–8 Use split to turn input data into Tcl lists.
The default separator character for split is white space, which includes
spaces, tabs, and newlines. If there are multiple separator characters in a row,
these result in empty list elements; the separators are not collapsed. The follow-
ing command splits on commas, periods, spaces, and tabs. The backslash–space
sequence is used to include a space in the set of characters. You could also group
the argument to split with double quotes:
set line "\tHello, world."
split $line \ ,.\t
=> {} Hello {} world {}
A trick that splits each character into a list element is to specify an empty
string as the split character. This lets you get at individual characters with list
operations:
split abc {}
=> a b c
However, if you write scripts that process data one character at a time, they
may run slowly. Read Chapter 11 about regular expressions for hints on really
efficient string processing.
The join Command
The join command is the inverse of split. It takes a list value and reformats it
with specified characters separating the list elements. In doing so, it removes
any curly braces from the string representation of the list that are used to group
the top-level elements. For example:
join {1 {2 3} {4 5 6}} :
=> 1:2 3:4 5 6
If the treatment of braces is puzzling, remember that the first value is
parsed into a list. The braces around element values disappear in the process.
Example 5–9 shows a way to implement join in a Tcl procedure, which may help
to understand the process:
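One way to write such a procedure is sketched below. It is named MyJoin here, a made-up name, so that it does not replace the built-in join command:

```tcl
# MyJoin is a hypothetical name chosen to avoid replacing the
# built-in join command.
proc MyJoin {list sep} {
    set s {}        ;# separator to emit before the next element
    set result {}
    foreach x $list {
        append result $s $x
        set s $sep  ;# every element after the first gets a separator
    }
    return $result
}

puts [MyJoin {1 {2 3} {4 5 6}} :]
# => 1:2 3:4 5 6
```

The foreach command parses the list, so each value of x is an element with its grouping braces already removed. The append command then treats those values as plain strings.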
Related Chapters
• Arrays are the other main data structure in Tcl. They are described in
Chapter 8.
• List operations are used when generating Tcl code dynamically. Chapter 10
describes these techniques when using the eval command.
• The foreach command loops over the values in a list. It is described on page
73 in Chapter 6.
C H A P T E R
Control Structure Commands 6
This chapter describes the Tcl commands that implement control structures:
if, switch, foreach, while, for, break, continue, catch, error,
and return.
If Then Else
The if command is the basic conditional command. If an expression is true, then
execute one command body; otherwise, execute another command body. The sec-
ond command body (the else clause) is optional. The syntax of the command is:
if expression ?then? body1 ?else? ?body2?
The then and else keywords are optional. In practice, I omit then but use
else as illustrated in the next example. I always use braces around the com-
mand bodies, even in the simplest cases:
if {$x == 0} {
puts stderr "Divide by zero!"
} else {
set slope [expr $y/$x]
}
You can create chained conditionals by using the elseif keyword. Again,
note the careful placement of curly braces that create a single if command:
if {$key < 0} {
incr range 1
} elseif {$key == 0} {
return $range
} else {
incr range -1
}
Switch
The switch command is used to branch to one of many command bodies depend-
ing on the value of an expression. The choice can be made on the basis of pattern
matching as well as simple comparisons. Pattern matching is discussed in more
detail in Chapter 4 and Chapter 11. The general form of the command is:
switch flags value pat1 body1 pat2 body2 ...
Any number of pattern-body pairs can be specified. If multiple patterns
match, only the body of the first matching pattern is evaluated. You can also
group all the pattern-body pairs into one argument:
switch flags value { pat1 body1 pat2 body2 ... }
The first form allows substitutions on the patterns but will require back-
slashes to continue the command onto multiple lines. This is shown in Example
6–4 on page 72. The second form groups all the patterns and bodies into one
argument. This makes it easy to group the whole command without worrying
about newlines, but it suppresses any substitutions on the patterns. This is
shown in Example 6–3. In either case, you should always group the command
bodies with curly braces so that substitution occurs only on the body with the
pattern that matches the value.
There are four possible flags that determine how value is matched.
-exact   Matches the value exactly to one of the patterns. This is the default.
-glob    Uses glob-style pattern matching. See page 48.
-regexp  Uses regular expression pattern matching. See page 134.
--       No flag (or end of flags). Necessary when value can begin with -.
The switch command raises an error if any other flag is specified or if the
value begins with -. In practice I always use the -- flag before value so that I
don’t have to worry about that problem.
If the pattern associated with the last body is default, then this command
body is executed if no other patterns match. The default keyword works only on
the last pattern-body pair. If you use the default pattern on an earlier body, it
will be treated as a pattern to match the literal string default:
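The following sketch shows both roles of default in one switch command. Classify is a hypothetical procedure, and the patterns are made up:

```tcl
# Classify is a hypothetical procedure used only for illustration.
proc Classify {value} {
    switch -exact -- $value {
        default { return "the literal string default" }
        foo     { return "matched foo" }
        default { return "no other pattern matched" }
    }
}

puts [Classify foo]       ;# matched foo
puts [Classify default]   ;# the literal string default
puts [Classify bar]       ;# no other pattern matched
```

Only the last default acts as a catch-all. The first one is an ordinary pattern that matches the string default and nothing else.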
In Example 6–4, the first and second patterns have substitutions performed
to replace $key with its value and \t with a tab character. The third pattern is
quoted with curly braces to prevent command substitution; square brackets are
part of the regular expression syntax, too. (See Chapter 11.)
If the body associated with a pattern is just a dash, -, then the switch com-
mand “falls through” to the body associated with the next pattern. You can tie
together any number of patterns in this manner.
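For instance, the next sketch ties five patterns to one body. Kind is a hypothetical procedure name:

```tcl
# Kind is a hypothetical procedure used only for illustration.
proc Kind {ch} {
    switch -- $ch {
        a - e - i - o - u { return vowel }
        default           { return other }
    }
}

puts [Kind e]   ;# vowel
puts [Kind x]   ;# other
```

Each dash is the body for the pattern before it, so a, e, i, and o all fall through to the body that follows u.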
Example 6–6 Comments in switch commands.
switch -- $value {
# this comment confuses switch
pattern { # this comment is ok }
}
While
The while command takes two arguments, a test and a command body:
while booleanExpr body
The while command repeatedly tests the boolean expression and then exe-
cutes the body if the expression is true (nonzero). Because the test expression is
evaluated again before each iteration of the loop, it is crucial to protect the
expression from any substitutions before the while command is invoked. The fol-
lowing is an infinite loop (see also Example 1–13 on page 12):
set i 0 ; while $i<10 {incr i}
The following behaves as expected:
set i 0 ; while {$i<10} {incr i}
It is also possible to put nested commands in the boolean expression. The
following example uses gets to read standard input. The gets command returns
the number of characters read, returning -1 upon end of file. Each time through
the loop, the variable line contains the next line in the file:
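A loop of this kind might be written as in the following sketch. It is wrapped in a hypothetical procedure, CountLines, that takes the channel as an argument, so that the same loop works on stdin or on a file you have opened:

```tcl
# CountLines is a hypothetical helper. It returns a list of the
# number of lines and the number of characters read from chan.
proc CountLines {chan} {
    set lines 0
    set chars 0
    # gets returns -1 at end of file, and 0 for an empty line.
    while {[gets $chan line] >= 0} {
        incr lines
        incr chars [string length $line]
    }
    return [list $lines $chars]
}

# Typical use, reading standard input until end of file:
#   puts [CountLines stdin]
```

The test is grouped with braces, so gets is called again before each iteration. The comparison is >= 0 rather than > 0 so that an empty line does not end the loop.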
Foreach
The foreach command loops over a command body assigning one or more loop
variables to each of the values in one or more lists. Multiple loop variables were
introduced in Tcl 7.5. The syntax for the simple case of a single variable and a
single list is:
foreach loopVar valueList commandBody
The first argument is the name of a variable, and the command body is exe-
cuted once for each element in the list with the loop variable taking on successive
values in the list. The list can be entered explicitly, as in the next example:
set i 1
foreach value {1 3 5 7 11 13 17 19 23} {
set i [expr $i*$value]
}
set i
=> 111546435
The loop uses the state variable to keep track of what is expected next,
which in this example is either a flag or the integer value for -max. The -- flag to
switch is required in this example because the switch command complains
about a bad flag if the pattern begins with a - character. The -glob option lets
the user abbreviate the -force and -verbose options.
If the list of values is to contain variable values or command results, then
the list command should be used to form the list. Avoid double quotes because if
any values or command results contain spaces or braces, the list structure will be
reparsed, which can lead to errors or unexpected results.
Example 6–10 Using list with foreach.
The loop variable x will take on the value of a, the value of b, and the result
of the foo command, regardless of any special characters or whitespace in those
values.
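A loop of the kind just described might look like the following sketch. The values of a and b and the foo procedure are made up so that the result can be seen:

```tcl
# foo is a hypothetical command standing in for any command result.
proc foo {} { return "result of foo" }
set a "two words"
set b {[square] $dollar}

set seen {}
foreach x [list $a $b [foo]] {
    puts "x = $x"
    lappend seen $x
}
puts [llength $seen]   ;# 3
```

The loop body runs three times, once per value, even though two of the values contain spaces and one contains characters that are special to Tcl.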
If you have a command that returns a short list of values, then you can
abuse the foreach command to assign the results of the command to several
variables all at once. For example, suppose the command MinMax returns two
values as a list: the minimum and maximum values. Here is one way to get the
values:
set result [MinMax $list]
set min [lindex $result 0]
set max [lindex $result 1]
The foreach command lets us do this much more compactly:
foreach {min max} [MinMax $list] {break}
The break in the body of the foreach loop guards against the case where
the command returns more values than we expected. This trick is encapsulated
into the lassign procedure in Example 10–4 on page 131.
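Here is the trick in runnable form. The body of MinMax is an assumption made up for this sketch; only its name and its two-element result come from the text.

```tcl
# MinMax is a hypothetical helper that returns {minimum maximum}.
proc MinMax {list} {
    set sorted [lsort -real $list]
    return [list [lindex $sorted 0] [lindex $sorted end]]
}
foreach {min max} [MinMax {4 9 -2 7}] {break}
puts "min=$min max=$max"
```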
foreach {k1 k2} {orange blue red green black} value {55 72 24} {
    puts "$k1 $k2: $value"
}
orange blue: 55
red green: 72
black : 24
For
The for command is similar to the C for statement. It takes four arguments:
for initial test final body
The first argument is a command to initialize the loop. The second argu-
ment is a boolean expression that determines whether the loop body will execute.
The third argument is a command to execute after the loop body:
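A small loop shows the four arguments at work. The variable names are arbitrary choices for this sketch.

```tcl
set aList {}
for {set i 0} {$i < 10} {incr i 3} {
    lappend aList $i
}
puts $aList
```

The loop collects 0, 3, 6, and 9. The test fails once i reaches 12, so the body does not run a fifth time.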
You could use for to iterate over a list, but you should really use foreach
instead. Code like the following is slow and cluttered:
for {set i 0} {$i < [llength $list]} {incr i} {
    set value [lindex $list $i]
}
This is the same as:
foreach value $list {
}
Break and Continue
You can control loop execution with the break and continue commands. The
break command causes immediate exit from a loop, while the continue com-
mand causes the loop to continue with the next iteration. There is no goto com-
mand in Tcl.
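As a short illustration (the numbers and the variable names are arbitrary), the following loop skips even values with continue and stops at the first value above 7 with break:

```tcl
set odds {}
foreach n {1 2 3 4 5 6 7 8 9} {
    if {$n % 2 == 0} {
        continue
    }
    if {$n > 7} {
        break
    }
    lappend odds $n
}
puts $odds
```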
Catch
Until now we have ignored the possibility of errors. In practice, however, a com-
mand will raise an error if it is called with the wrong number of arguments, or if
it detects some error condition particular to its implementation. An uncaught
error aborts execution of a script.* The catch command is used to trap such
errors. It takes two arguments:
catch command ?resultVar?
The first argument to catch is a command body. The second argument is
the name of a variable that will contain the result of the command, or an error
message if the command raises an error. catch returns zero if there was no error
caught, or a nonzero error code if it did catch an error.
You should use curly braces to group the command instead of double quotes
because catch invokes the full Tcl interpreter on the command. If double quotes
are used, an extra round of substitutions occurs before catch is even called. The
simplest use of catch looks like the following:
catch { command }
A more careful catch phrase saves the result and prints an error message:
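A minimal sketch of such a phrase, using a division by zero as a stand-in for any command that might fail:

```tcl
# The expr call is an arbitrary command chosen because it raises an error.
if {[catch {expr {1 / 0}} result]} {
    puts stderr "caught: $result"
} else {
    puts "ok: $result"
}
```

In the error case result holds the error message; otherwise it holds the value the command returned.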
* More precisely, the Tcl script unwinds and the current Tcl_Eval procedure in the C runtime library
returns TCL_ERROR. There are three cases. In interactive use, the Tcl shell prints the error message. In Tk, errors
that arise during event handling trigger a call to bgerror, a Tcl procedure you can implement in your application.
In your own C code, you should check the result of Tcl_Eval and take appropriate action in the case of an error.
A more general catch phrase is shown in the next example. Multiple com-
mands are grouped into a command body. The errorInfo variable is set by the
Tcl interpreter after an error to reflect the stack trace from the point of the error:
if {[catch {
    command1
    command2
    command3
} result]} {
    global errorInfo
    puts stderr $result
    puts stderr "*** Tcl TRACE ***"
    puts stderr $errorInfo
} else {
    # command body ok, result of last command is in result
}
These examples have not grouped the call to catch with curly braces. This
is acceptable because catch always returns an integer, so the if command will
parse correctly. However, if we had used while instead of if, then curly braces
would be necessary to ensure that the catch phrase was evaluated repeatedly.
Example 6–16 There are several possible return values from catch.
switch [catch {
    command1
    command2
    ...
} result] {
    0 { # Normal completion }
    1 { # Error case }
    2 { return $result ;# return from procedure}
    3 { break ;# break out of the loop}
    4 { continue ;# continue loop}
    default { # User-defined error codes }
}
Error
The error command raises an error condition that terminates a script unless it
is trapped with the catch command. The command takes up to three arguments:
error message ?info? ?code?
The message becomes the error message stored in the result variable of the
catch command.
If the info argument is provided, then the Tcl interpreter uses this to ini-
tialize the errorInfo global variable. That variable is used to collect a stack
trace from the point of the error. If the info argument is not provided, then the
error command itself is used to initialize the errorInfo trace.
proc foo {} {
    error bogus
}
foo
=> bogus
set errorInfo
=> bogus
    while executing
"error bogus"
    (procedure "foo" line 2)
    invoked from within
"foo"
In the previous example, the error command itself appears in the trace.
One common use of the info argument is to preserve the errorInfo that is avail-
able after a catch. In the next example, the information from the original error is
preserved:
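A sketch of the idea follows. The procedure tryFoo and the outer catch are additions made here so that the script keeps running; foo is the failing procedure from the earlier example, written on one line.

```tcl
proc foo {} {error bogus}
# tryFoo is a hypothetical wrapper around the catch-and-rethrow pattern.
proc tryFoo {} {
    global errorInfo
    if {[catch {foo} result]} {
        set savedInfo $errorInfo
        # Attempt to handle the error here, but cannot...
        error $result $savedInfo
    }
}
catch {tryFoo} msg
puts $::errorInfo
```

Because savedInfo is passed as the info argument, the trace still shows that the error began inside foo rather than at the second error command.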
Return
The return command is used to return from a procedure. It is needed if return is
to occur before the end of the procedure body, or if a constant value needs to be
returned. As a matter of style, I also use return at the end of a procedure, even
though a procedure returns the value of the last command executed in the body.
Exceptional return conditions can be specified with some optional argu-
ments to return. The complete syntax is:
return ?-code c? ?-errorinfo i? ?-errorcode ec? string
The -code option value is one of ok, error, return, break, continue, or an
integer. ok is the default if -code is not specified.
The -code error option makes return behave much like the error com-
mand. The -errorcode option sets the global errorCode variable, and the
-errorinfo option initializes the errorInfo global variable. When you use
return -code error, there is no error command in the stack trace. Compare
Example 6–17 with Example 6–19:
proc bar {} {
    return -code error bogus
}
catch {bar} result
=> 1
set result
=> bogus
set errorInfo
=> bogus
    while executing
"bar"
The return, break, and continue code options take effect in the caller of the
procedure doing the exceptional return. If -code return is specified, then the
calling procedure returns. If -code break is specified, then the calling procedure
breaks out of a loop, and if -code continue is specified, then the calling proce-
dure continues to the next iteration of the loop. These -code options to return
enable the construction of new control structures entirely in Tcl. The following
example implements the break command with a Tcl procedure:
proc break {} {
    return -code break
}
C H A P T E R
Procedures and Scope 7
Procedures can have default parameters so that the caller can leave out
some of the command arguments. A default parameter is specified with its name
and default value, as shown in the next example:
proc P2 {a {b 7} {c -2} } {
    expr {$a / $b + $c}
}
P2 6 3
=> 0
Here the procedure P2 can be called with one, two, or three arguments. If it
is called with only one argument, then the parameters b and c take on the values
specified in the proc command. If two arguments are provided, then only c gets
the default value, and the arguments are assigned to a and b. At least one argu-
ment and no more than three arguments can be passed to P2.
A procedure can take a variable number of arguments by specifying the
args keyword as the last parameter. When the procedure is called, the args
parameter is a list that contains all the remaining values:
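The following sketch shows how arguments are distributed. ArgTest and the variables x, y, and z are names chosen for this illustration.

```tcl
# ArgTest is hypothetical; it returns its parameters as a list so they can be inspected.
proc ArgTest {a {b foo} args} {
    return [list a $a b $b args $args]
}
set x one
set y {two things}
set z {[special$}
puts [ArgTest $x]
puts [ArgTest $x $y]
puts [ArgTest $x $y $z $x]
```

In the first call b takes its default and args is empty. In the last call args is a two-element list, and list quoting is added around the value of z so that it remains a single element.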
passed as the value of parameter b, its value comes through to the procedure
unchanged. When $z is part of the optional parameters, quoting is automatically
added to create a valid Tcl list as the value of args. Example 10–3 on page 127
illustrates a technique that uses eval to undo the effect of the added list struc-
ture.
Scope
By default there is a single, global scope for procedure names. This means that
you can use a procedure anywhere in your script. Variables defined outside any
procedure are global variables. However, as described below, global variables are
not automatically visible inside procedures. There is a different namespace for
variables and procedures, so you could have a procedure and a global variable
with the same name without conflict. You can use the namespace facility
described in Chapter 14 to manage procedures and global variables.
Each procedure has a local scope for variables. That is, variables introduced
in the procedure live only for the duration of the procedure call. After the proce-
dure returns, those variables are undefined. Variables defined outside the proce-
dure are not visible to a procedure unless the upvar or global scope commands
are used. You can also use qualified names to name variables in a namespace
scope. The global and upvar commands are described later in this chapter. Qual-
ified names are described on page 198. If the same variable name exists in an
outer scope, it is unaffected by the use of that variable name inside a procedure.
In Example 7–3, the variable a in the global scope is different from the
parameter a to P1. Similarly, the global variable b is different from the variable b
inside P1:
set a 5
set b -8
proc P1 {a} {
    set b 42
    if {$a < 0} {
        return $b
    } else {
        return $a
    }
}
P1 $b
=> 42
P1 [expr $a*2]
=> 10
Example 7–4 A random number generator.*
* Adapted from Exploring Expect by Don Libes, O’Reilly & Associates, Inc., 1995, and from Numerical
Recipes in C by Press et al., Cambridge University Press, 1988.
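A generator of this kind can be sketched as below. The procedure names and the constants (a common linear congruential choice) are assumptions for this sketch. The point is that the seed lives in a global variable that each procedure declares with global.

```tcl
# Constants and names are assumed; any linear congruential parameters would do.
proc RandomInit {seed} {
    global randomSeed
    set randomSeed $seed
}
proc Random {} {
    global randomSeed
    set randomSeed [expr {($randomSeed * 9301 + 49297) % 233280}]
    return [expr {$randomSeed / double(233280)}]
}
proc RandomRange {range} {
    expr {int([Random] * $range)}
}
RandomInit 1
puts [RandomRange 100]
```

Without the global command, randomSeed inside Random would be a new local variable and the state would be lost between calls.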
You can use upvar to fix the incr command. One drawback of the built-in
incr is that it raises an error if the variable does not exist. We can define a new
version of incr that initializes the variable if it does not already exist:
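One way to write it is sketched below. The procedure is named Incr here, an assumption made so that the built-in command stays untouched while you experiment.

```tcl
# Incr is a hypothetical name; defining it as incr would replace the built-in.
proc Incr {varName {amount 1}} {
    upvar 1 $varName var
    if {[info exists var]} {
        set var [expr {$var + $amount}]
    } else {
        set var $amount
    }
    return $var
}
Incr counter
Incr counter 5
puts $counter
```

The first call creates counter in the caller's scope with the value 1, and the second call adds 5 to it.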
upvar #0 state$name state
Your code can pass name around as a handle on an object, then use upvar to
get access to the data associated with the object. Your code is just written to use
the state variable, which is an alias to the state variable for the current object.
This technique is illustrated in Example 17–7 on page 232.
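As a rough sketch of the handle idea, with CounterCreate and CounterBump as invented names:

```tcl
# Each object's data lives in a global array named state<name>.
proc CounterCreate {name} {
    upvar #0 state$name state
    set state(count) 0
    return $name
}
proc CounterBump {name} {
    upvar #0 state$name state
    incr state(count)
}
set c [CounterCreate alpha]
CounterBump $c
CounterBump $c
puts $::statealpha(count)
```

Both procedures refer only to state, yet each call operates on the global array that belongs to the handle it was given.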
C H A P T E R
Tcl Arrays 8
This chapter describes Tcl arrays, which provide a flexible mechanism to build
many other data structures in Tcl. The Tcl command described is: array.
Array Syntax
The index of an array is delimited by parentheses. The index can have any string
value, and it can be the result of variable or command substitution. Array ele-
ments are defined with set:
set arr(index) value
The value of an array element is obtained with $ substitution:
set foo $arr(index)
Example 8–1 uses the loop variable value $i as an array index. It sets
arr(x) to the product of 1 * 2 * ... * x:
set arr(0) 1
for {set i 1} {$i <= 10} {incr i} {
    set arr($i) [expr {$i * $arr([expr {$i - 1}])}]
}
Complex Indices
An array index can be any string, like orange, 5, 3.1415, or foo,bar. The
examples in this chapter, and in this book, often use indices that are pretty com-
plex strings to create flexible data structures. As a rule of thumb, you can use
any string for an index, but avoid using a string that contains spaces.
Parentheses are not a grouping mechanism.
The main Tcl parser does not know about array syntax. All the rules about
grouping and substitution described in Chapter 1 are still the same in spite of
the array syntax described here. Parentheses do not group like curly braces or
quotes, which is why a space causes problems. If you have complex indices, use a
comma to separate different parts of the index. If you use a space in an index
instead, then you have a quoting problem. The space in the index needs to be
quoted with a backslash, or the whole variable reference needs to be grouped:
set {arr(I’m asking for trouble)} {I told you so.}
set arr(I’m\ asking\ for\ trouble) {I told you so.}
If the array index is stored in a variable, then there is no problem with
spaces in the variable’s value. The following works well:
set index {I’m asking for trouble}
set arr($index) {I told you so.}
Array Variables
You can use an array element as you would a simple variable. For example,
you can test for its existence with info exists, increment its value with incr,
and append elements to it with lappend:
if {[info exists stats($event)]} {incr stats($event)}
You can delete an entire array, or just a single array element with unset.
Using unset on an array is a convenient way to clear out a big data structure.
It is an error to use a variable as both an array and a normal variable. The
following is an error:
set arr(0) 1
set arr 3
=> can't set "arr": variable is array
The name of the array can be the result of a substitution. This is a tricky
situation, as shown in Example 8–2:
Example 8–2 Referencing an array indirectly.
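A sketch of what goes wrong and what works, using the illustrative names name and TheArray:

```tcl
set name TheArray
set ${name}(xyz) {some value}
# Direct reference by the real array name works as usual.
set x $TheArray(xyz)
# This only builds the string "TheArray(xyz)"; it does not read the element.
set y ${name}(xyz)
# A nested set performs the second round of substitution.
set z [set ${name}(xyz)]
```

The $ syntax substitutes the name but does not then treat the result as an array reference, which is why the one-argument form of set is needed to fetch the value.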
A better way to deal with this situation is to use the upvar command, which
is introduced on page 79. The previous example is much cleaner when upvar is
used:
Example 8–3 Referencing an array indirectly using upvar.
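With upvar the indirection is set up once and ordinary array syntax works afterward. The names below are illustrative.

```tcl
set name TheArray
# Level 0 makes a an alias for the named array in the current scope.
upvar 0 $name a
set a(xyz) {some value}
set x $TheArray(xyz)
```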
Another way to loop through the contents of an array is to use array get
and the two-variable form of the foreach command.
foreach {key value} [array get fruit] {
# key is ok, best, or worst
# value is some fruit
}
array. As with array names, you can specify a pattern to array get to limit what
part of the array is returned. This example uses upvar because the array names
are passed into the ArrayInvert procedure. The inverse array does not need to
exist before you call ArrayInvert.
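A procedure along these lines might look like the sketch below. The parameter names, the optional pattern, and the contents of the fruit array are assumptions.

```tcl
proc ArrayInvert {arrName inverseName {pattern *}} {
    upvar $arrName array $inverseName inverse
    foreach {index value} [array get array $pattern] {
        set inverse($value) $index
    }
}
array set fruit {best kiwi worst peach ok banana}
ArrayInvert fruit fruitByName
puts $fruitByName(kiwi)
```

If two indices share a value, the inverse keeps only one of them, so this is most useful when the values are unique.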
Simple Records
Suppose we have a database of information about people. One approach
uses a different array for each class of information. The name of the person is the
index into each array:
global employeeManager
return $employeeManager($name)
}
Simple procedures are defined to return fields of the record, which hides the
implementation so that you can change it more easily. The employeeName array
provides a secondary key. It maps from the employee ID to the name so that the
other information can be obtained if you have an ID instead of a name. Another
way to implement the same little database is to use a single array with more
complex indices:
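A sketch of the single-array version follows. The procedure names and the sample record are hypothetical. The index is built from a field name and a key separated by a comma.

```tcl
# Emp_AddRecord and Emp_Manager are assumed names for this sketch.
proc Emp_AddRecord {id name manager phone} {
    global employee
    set employee(id,$name) $id
    set employee(manager,$name) $manager
    set employee(phone,$name) $phone
    set employee(name,$id) $name
}
proc Emp_Manager {name} {
    global employee
    return $employee(manager,$name)
}
Emp_AddRecord 17 alice bob 555-0100
puts [Emp_Manager alice]
```

Callers still go through the accessor procedures, so switching between the two representations does not affect them.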
A Stack
A stack can be implemented with either a list or an array. If you use a list,
then the push and pop operations have a runtime cost that is proportional to the
size of the stack. If the stack has a few elements this is fine. If there are a lot of
items in a stack, you may wish to use arrays instead.
In these examples, the name of the stack is a parameter, and upvar is used
to convert that into the data used for the stack. The variable is a list in Example
8–7 and an array in Example 8–8. The user of the stack module does not have to
know.
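The list version can be sketched as follows. This is an illustration written to match the description, not necessarily the exact code of the example.

```tcl
proc Push {stack value} {
    upvar $stack list
    lappend list $value
}
proc Pop {stack} {
    upvar $stack list
    set value [lindex $list end]
    # Keep everything except the last element.
    set list [lrange $list 0 [expr {[llength $list] - 2}]]
    return $value
}
Push s a
Push s b
puts [Pop s]
```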
The array implementation of a stack uses one array element to record the
number of items in the stack. The other elements of the array have the stack val-
ues. The Push and Pop procedures both guard against a nonexistent array with
the info exists command. When the first assignment to S(top) is done by Push,
the array variable is created in the caller’s scope. The example uses array indices
in two ways. The top index records the depth of the stack. The other indices are
numbers, so the construct $S($S(top)) is used to reference the top of the stack.
Example 8–8 Using an array to implement a stack.
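The following sketch follows the description above. Details such as returning an empty string from an empty stack are assumptions.

```tcl
proc Push {stack value} {
    upvar $stack S
    if {![info exists S(top)]} {
        set S(top) 0
    }
    set S($S(top)) $value
    incr S(top)
}
proc Pop {stack} {
    upvar $stack S
    if {![info exists S(top)]} {
        return {}
    }
    if {$S(top) == 0} {
        return {}
    } else {
        incr S(top) -1
        set x $S($S(top))
        unset S($S(top))
        return $x
    }
}
Push jobs first
Push jobs second
puts [Pop jobs]
```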
A List of Arrays
Suppose you have many arrays, each of which stores some data, and you
want to maintain an overall ordering among the data sets. One approach is to
keep a Tcl list with the name of each array in order. Example 8–9 defines
RecordInsert to add an array to the list, and an iterator function, RecordIterate, that
applies a script to each array in order. The iterator uses upvar to make data an
alias for the current array. The script is executed with eval, which is described in
detail in Chapter 10. The Tcl commands in script can reference the arrays with
the name data:
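A rough sketch of the two procedures is shown here. It assumes that the arrays are global variables and that RecordInsert takes an optional position; both are assumptions of this sketch.

```tcl
proc RecordInsert {listName arrayName {index end}} {
    upvar $listName list
    if {![info exists list]} {
        set list {}
    }
    set list [linsert $list $index $arrayName]
}
proc RecordIterate {listName script} {
    upvar $listName list
    foreach arrayName $list {
        # Re-alias data to the next global array on each pass.
        upvar #0 $arrayName data
        eval $script
    }
}
set ::rec1(title) first
set ::rec2(title) second
RecordInsert order rec2
RecordInsert order rec1 0
set ::titles {}
RecordIterate order {lappend ::titles $data(title)}
puts $::titles
```

The script runs inside RecordIterate, so it sees data as a local alias. Any other variables it touches, such as titles here, must be named globally.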
foreach key $keylist {
    lappend Db($key) $datablob
}
The problem with this approach is that it duplicates the data chunks under
each key. A better approach is to use two arrays. One stores all the data chunks
under a simple ID that is generated automatically. The other array stores the
association between the keys and the data chunks. Example 8–11, which uses
the namespace syntax described in Chapter 14, illustrates this approach. The
example also shows how you can easily dump data structures by writing array
set commands to a file, and then load them later with a source command:
namespace eval db {
    variable data ;# Array of data blobs
    variable uid 0 ;# Index into data
    variable index ;# Cross references into data
}
proc db::insert {keylist datablob} {
    variable data
    variable uid
    variable index
    set data([incr uid]) $datablob
    foreach key $keylist {
        lappend index($key) $uid
    }
}
proc db::get {key} {
    variable data
    variable index
    set result {}
    if {![info exists index($key)]} {
        return {}
    }
    foreach uid $index($key) {
        lappend result $data($uid)
    }
    return $result
}
proc db::save {filename} {
    variable uid
    set out [open $filename w]
    puts $out [list namespace eval db \
        [list variable uid $uid]]
    puts $out [list array set db::data [array get db::data]]
    puts $out [list array set db::index [array get db::index]]
    close $out
}
proc db::load {filename} {
    source $filename
}
C H A P T E R
Working with Files and Programs 9
This chapter describes how to run programs, examine the file system, and access
environment variables through the env array. Tcl commands described are:
exec, file, open, close, read, puts, gets, flush, seek,
tell, glob, pwd, cd, exit, pid, and registry.
This chapter describes how to run programs and access the file system from Tcl.
These commands were designed for
UNIX. In Tcl 7.5 they were implemented in the Tcl ports to Windows and Macin-
tosh. There are facilities for naming files and manipulating file names in a plat-
form-independent way, so you can write scripts that are portable across systems.
These capabilities enable your Tcl script to be a general-purpose glue that
assembles other programs into a tool that is customized for your needs.
* Unlike other UNIX shell exec commands, the Tcl exec does not replace the current process with the new
one. Instead, the Tcl library forks first and executes the program as a child process.
The exec command supports a full set of I/O redirection and pipeline syn-
tax. Each process normally has three I/O channels associated with it: standard
input, standard output, and standard error. With I/O redirection, you can divert
these I/O channels to files or to I/O channels you have opened with the Tcl open
command. A pipeline is a chain of processes that have the standard output of one
command hooked up to the standard input of the next command in the pipeline.
Any number of programs can be linked together into a pipeline.
Example 9–1 uses exec to run three programs in a pipeline. The first pro-
gram is sort, which takes its input from the file /etc/passwd. The output of
sort is piped into uniq, which suppresses duplicate lines. The output of uniq is
piped into wc, which counts the lines. The error output of the command is
diverted to the null device to suppress any error messages. Table 9–1 provides a
summary of the syntax understood by the exec command.
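The pipeline just described can be written roughly as follows. This is a UNIX-only sketch that assumes sort, uniq, and wc are on the search path. The variable name n is arbitrary.

```tcl
# Count the distinct lines in /etc/passwd, discarding any error output.
set n [exec sort < /etc/passwd | uniq | wc -l 2> /dev/null]
puts "distinct lines: [string trim $n]"
```

The result of exec is the standard output of the last program in the pipeline, here the count printed by wc.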
Table 9–1 Summary of the exec syntax for I/O redirection.
-keepnewline     (First argument.) Do not discard trailing newline from the result.
|                Pipes standard output from one process into another.
|&               Pipes both standard output and standard error output.
< fileName       Takes input from the named file.
<@ fileId        Takes input from the I/O channel identified by fileId.
<< value         Takes input from the given value.
> fileName       Overwrites fileName with standard output.
2> fileName      Overwrites fileName with standard error output.
>& fileName      Overwrites fileName with both standard error and standard out.
>> fileName      Appends standard output to the named file.
2>> fileName     Appends standard error to the named file.
>>& fileName     Appends both standard error and standard output to the named file.
>@ fileId        Directs standard output to the I/O channel identified by fileId.
2>@ fileId       Directs standard error to the I/O channel identified by fileId.
>&@ fileId       Directs both standard error and standard output to the I/O channel.
&                As the last argument, indicates pipeline should run in background.
A trailing & causes the program to run in the background. In this case, the
process identifier is returned by the exec command. Otherwise, the exec com-
mand blocks during execution of the program, and the standard output of the
program is the return value of exec. The trailing newline in the output is
trimmed off, unless you specify -keepnewline as the first argument to exec.
If you look closely at the I/O redirection syntax, you’ll see that it is built up
from a few basic building blocks. The basic idea is that | stands for pipeline, > for
output, and < for input. The standard error is joined to the standard output by &.
Standard error is diverted separately by using 2>. You can use your own I/O
channels by using @.
AppleScript on Macintosh
The exec command is not provided on the Macintosh. Tcl ships with an
AppleScript extension that lets you control other Macintosh applications. You
can find documentation in the AppleScript.html that goes with the distribution.
You must use package require to load the AppleScript command:
Table 9–2 The file command options. (Continued)
file nativename name          Returns the platform-native version of name. (Tcl 8.0)
file owned name               Returns 1 if current user owns the file name, else 0.
file pathtype name            relative, absolute, or volumerelative. (Tcl 7.5)
file readable name            Returns 1 if name has read permission, else 0.
file readlink name            Returns the contents of the symbolic link name.
file rename ?-force? old new  Changes the name of old to new. (Tcl 7.6)
file rootname name            Returns all but the extension of name (i.e., up to but
                              not including the last . in name).
file size name                Returns the number of bytes in name.
file split name               Splits name into its pathname components. (Tcl 7.5)
file stat name var            Places attributes of name into array var. The elements
                              defined for var are listed in Table 9–3.
file tail name                Returns the last pathname component of name.
file type name                Returns type identifier, which is one of: file, directory,
                              characterSpecial, blockSpecial, fifo, link, or socket.
file writable name            Returns 1 if name has write permission, else 0.
The good news is that Tcl provides operations that let you deal with file
pathnames in a platform-independent manner. The file operations described in
this chapter allow either native format or the UNIX naming convention. The
backslash used in Windows pathnames is especially awkward because the back-
slash is special to Tcl. Happily, you can use forward slashes instead:
c:/Program Files/Tcl/lib/Tcl7.6
There are some ambiguous cases that can be specified only with native
pathnames. On my Macintosh, Tcl and Tk are installed in a directory that has a
slash in it. You can name it only with the native Macintosh name:
Disk:Applications:Tcl/Tk 4.2
Another construct to watch out for is a leading // in a file name. This is the
Windows syntax for network names that reference files on other computers. You
can avoid accidentally constructing a network name by using the file join com-
mand described next. Of course, you can use network names to access remote
files.
If you must communicate with external programs, you may need to con-
struct a file name in the native syntax for the current platform. You can con-
struct these names with file join described later. You can also convert a UNIX-
like name to a native name with file nativename.
Several of the file operations operate on pathnames as opposed to return-
ing information about the file itself. You can use the dirname, extension, join,
pathtype, rootname, split, and tail operations on any string; there is no
requirement that the pathnames refer to an existing file.
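For instance, the operations below run happily on a made-up name. The results in the comments are what you get on UNIX.

```tcl
# The path is hypothetical; nothing needs to exist on disk.
set path /no/such/dir/report.final.txt
puts [file dirname $path]     ;# /no/such/dir
puts [file tail $path]        ;# report.final.txt
puts [file extension $path]   ;# .txt
puts [file rootname $path]    ;# /no/such/dir/report.final
puts [file split $path]       ;# / no such dir report.final.txt
puts [file pathtype $path]    ;# absolute
```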
fier. The absolute name overrides the previous relative name:
file join a b:c d
=> b:c:d
The file join operation converts UNIX-style pathnames to native format.
For example, on Macintosh you get this:
file join /usr/local/lib
=> usr:local:lib
these things, except on Macintosh, where cp, rm, mv, mkdir, and rmdir were built
in. These commands are no longer supported on the Macintosh. Your scripts
should use the file command operations described below to manipulate files in
a platform-independent way.
File name patterns are not directly supported by the file operations.
Instead, you can use the glob command described on page 115 to get a list of file
names that match a pattern.
Copying Files
The file copy operation copies files and directories. The following example
copies file1 to file2. If file2 already exists, the operation raises an error
unless the -force option is specified:
file copy ?-force? file1 file2
Several files can be copied into a destination directory. The names of the
source files are preserved. The -force option indicates that files under direc-
tory can be replaced:
file copy ?-force? file1 file2 ... directory
Directories can be recursively copied. The -force option indicates that files
under dir2 can be replaced:
file copy ?-force? dir1 dir2
Creating Directories
The file mkdir operation creates one or more directories:
file mkdir dir dir ...
It is not an error if the directory already exists. Furthermore, intermediate
directories are created if needed. This means that you can always make sure a
directory exists with a single mkdir operation. Suppose /tmp has no subdirecto-
ries at all. The following command creates /tmp/sub1 and /tmp/sub1/sub2:
file mkdir /tmp/sub1/sub2
The -force option is not understood by file mkdir, so the following com-
mand accidentally creates a folder named -force, as well as one named oops.
file mkdir -force oops
Deleting Files
The file delete operation deletes files and directories. It is not an error if
the files do not exist. A non-empty directory is not deleted unless the -force
option is specified, in which case it is recursively deleted:
file delete ?-force? name name ...
To delete a file or directory named -force, you must specify a nonexistent
file before the -force to prevent it from being interpreted as a flag (-force
-force won’t work):
file delete xyzzy -force
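The copy, mkdir, and delete operations can be tried together in a scratch directory. The directory name demo_scratch and the file names below are hypothetical.

```tcl
set dir [file join [pwd] demo_scratch]
# Intermediate directories are created as needed.
file mkdir [file join $dir sub1 sub2]
set src [file join $dir file1]
set out [open $src w]
puts $out "some data"
close $out
file copy $src [file join $dir file2]
# A second copy onto file2 raises an error unless -force is given.
set refused [catch {file copy $src [file join $dir file2]}]
file copy -force $src [file join $dir file2]
# Deleting a file that does not exist is not an error.
file delete [file join $dir no_such_file]
# A non-empty directory is deleted only with -force.
set nonEmpty [catch {file delete $dir}]
file delete -force $dir
```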
File Attributes
There are several file operations that return specific file attributes: atime, exe-
cutable, exists, isdirectory, isfile, mtime, owned, readable, readlink, size
and type. Refer to Table 9–2 on page 102 for their function. The following com-
mand uses file mtime to compare the modify times of two files. If you have ever
resorted to piping the results of ls -l into awk in order to derive this information
in other shell scripts, you will appreciate this example:
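A sketch of such a comparison follows. The procedure name newer and the choice to treat a missing second file as older are assumptions.

```tcl
proc newer {file1 file2} {
    if {![file exists $file2]} {
        return 1
    } else {
        # Assume file1 exists
        expr {[file mtime $file1] > [file mtime $file2]}
    }
}
```

A make-like script could call newer with a source file and its derived file to decide whether the derived file needs to be rebuilt.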
The stat and lstat operations return a collection of file attributes. They
take a third argument that is the name of an array variable, and they initialize
that array with elements that contain the file attributes. If the file is a symbolic
link, then the lstat operation returns information about the link itself and the
stat operation returns information about the target of the link. The array ele-
ments are listed in Table 9–3. All the element values are decimal strings, except
for type, which can have the values returned by the type option. The element
names are based on the UNIX stat system call. Use the file attributes com-
mand described later to get other platform-specific attributes:
Example 9–3 uses the device (dev) and inode (ino) attributes of a file to
determine whether two pathnames reference the same file. The attributes are
UNIX specific; they are not well defined on Windows and Macintosh.
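The comparison can be sketched like this. The procedure name fileeq is an assumption, and the sketch is meaningful only on UNIX.

```tcl
proc fileeq {path1 path2} {
    file stat $path1 stat1
    file stat $path2 stat2
    # Same inode on the same device means the same file.
    expr {$stat1(ino) == $stat2(ino) && $stat1(dev) == $stat2(dev)}
}
```

Two different spellings of a path, or a path and a symbolic link to it, compare equal because stat follows links.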
The file attributes operation was added in Tcl 8.0 to provide access to
platform-specific attributes. The attributes operation lets you set and query
attributes. The interface uses option-value pairs. With no options, all the current
values are returned.
file attributes book.doc
=> -creator FRAM -hidden 0 -readonly 0 -type MAKR
These Macintosh attributes are explained in Table 9–4. The four-character
type codes used on Macintosh are illustrated on page 514. With a single option,
only that value is returned:
file attributes book.doc -readonly
=> 0
The attributes are modified by specifying one or more option–value pairs.
Setting attributes can raise an error if you do not have the right permissions:
file attributes book.doc -readonly 1 -hidden 0
Table 9–4 Platform-specific file attributes.
-permissions mode   File permission bits. mode is a number with bits defined by
                    the chmod system call. (UNIX)
-group ID           The group owner of the file. (UNIX)
-owner ID           The owner of the file. (UNIX)
-archive bool       The archive bit, which is set by backup programs. (Windows)
-hidden bool        If set, then the file does not appear in listings. (Windows, Macintosh)
-readonly bool      If set, then you cannot write the file. (Windows, Macintosh)
-system bool        If set, then you cannot remove the file. (Windows)
-creator type       type is 4-character code of creating application. (Macintosh)
-type type          type is 4-character type code. (Macintosh)
The permissions argument is a value used for the permission bits on a
newly created file. UNIX uses three bits each for the owner, group, and everyone
else. The bits specify read, write, and execute permission. These bits are usually
specified with an octal number, which has a leading zero, so that there is one
octal digit for each set of bits. The default permission bits are 0666, which grant
read/write access to everybody. Example 9–4 specifies 0600 so that the file is
readable and writable only by the owner. 0775 would grant read, write, and exe-
cute permissions to the owner and group, and read and execute permissions to
everyone else. You can set other special properties with additional high-order
bits. Consult the UNIX manual page on the chmod command for more details.
The following example illustrates how to use a list of POSIX access flags to
open a file for reading and writing, creating it if needed, and not truncating it.
This is something you cannot do with the simpler form of the access argument:
set fileId [open /tmp/bar {RDWR CREAT}]
Catch errors from open.
In general, you should check for errors when opening files. The following
example illustrates a catch phrase used to open files. Recall that catch returns 1
if it catches an error; otherwise, it returns zero. It treats its second argument as
the name of a variable. In the error case, it puts the error message into the vari-
able. In the normal case, it puts the result of the command into the variable:
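A minimal sketch of the phrase follows. The path is hypothetical and chosen so that open fails.

```tcl
if {[catch {open /tmp/no/such/data r} fileId]} {
    # In the error case fileId holds the error message.
    puts stderr "Cannot open /tmp/no/such/data: $fileId"
} else {
    # Read and process the file, then...
    close $fileId
}
```

The same variable serves both purposes: it is the channel identifier on success and the error message on failure.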
You can open a pipeline for both read and write by specifying the r+ access
mode. In this case, you need to worry about buffering. After a puts, the data may
still be in a buffer in the Tcl library. Use the flush command to force the data out
to the spawned processes before you try to read any output from the pipeline.
You can also use the fconfigure command described on page 223 to force line
buffering. Remember that read-write pipes will not work at all with Windows 3.1
because pipes are simulated with files. Event-driven I/O is also very useful with
pipes. It means you can do other processing while the pipeline executes, and sim-
ply respond when the pipe generates data. This is described in Chapter 16.
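The buffering issue can be seen with a very small read-write pipeline. This sketch assumes a UNIX system where cat is available; cat simply echoes its input.

```tcl
set pipe [open "|cat" r+]
puts $pipe "hello"
# Without this flush the line could sit in Tcl's buffer and gets would wait forever.
flush $pipe
gets $pipe line
close $pipe
puts $line
```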
Expect
If you are trying to do sophisticated things with an external application,
you will find that the Expect extension provides a much more powerful interface
than a process pipeline. Expect adds Tcl commands that are used to control inter-
active applications. It is extremely useful for automating FTP, Telnet, and pro-
grams under test. It comes as a Tcl shell named expect, and it is also an
extension that you can dynamically load into other Tcl shells. It was created by
Don Libes at the National Institute of Standards and Technology (NIST). Expect
is described in Exploring Expect (Libes, O’Reilly & Associates, Inc., 1995). You
can find the software on the CD and on the web at:
http://expect.nist.gov/
Example 9–7 Prompting for input.
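A minimal sketch of such a prompt, not necessarily identical to the original listing, might look like this. The procedure name Prompt and its optional channel arguments are assumptions made for illustration.

```tcl
# Prompt on one channel and read a one-line answer from another.
# The proc name and its default channels are illustrative.
proc Prompt {message {in stdin} {out stdout}} {
    puts -nonewline $out $message
    # Without a newline the prompt may sit in the output buffer, so flush it.
    flush $out
    # Single-argument form: gets returns the line without its newline.
    return [gets $in]
}
```

At an interactive shell you would call it as set answer [Prompt "Enter value: "].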
The gets command reads a line of input, and it has two forms. In the previ-
ous example, with just a single argument, gets returns the line read from the
specified I/O channel. It discards the trailing newline from the return value. If
end of file is reached, an empty string is returned. You must use the eof
command to tell the difference between a blank line and end of file: eof returns
1 if the channel has reached end of file. Given a second varName argument, gets
stores the line into the named variable and returns the number of characters
read. It discards the trailing newline, which is not counted. A -1 is returned
if the channel has reached the end of file.
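The two forms of gets can be compared side by side. This sketch writes a small scratch file in the current directory; the file name and variable names are hypothetical.

```tcl
# Scratch file name is hypothetical; it is created in the current directory.
set path gets_demo.txt
set out [open $path w]
puts $out "first"
puts $out ""
puts $out "third"
close $out

# One-argument form: an empty result is ambiguous, so ask eof as well.
set in [open $path r]
set oneArg {}
while {1} {
    set line [gets $in]
    if {[eof $in] && [string length $line] == 0} break
    lappend oneArg $line
}
close $in

# Two-argument form: the return value is -1 at end of file.
set in [open $path r]
set twoArg {}
while {[gets $in line] >= 0} {
    lappend twoArg $line
}
close $in
file delete $path

puts [list $oneArg $twoArg]
```

Both loops keep the blank middle line. The two-argument form is usually shorter to write because a single test covers the end-of-file case.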
For moderate-sized files, it is about 10 percent faster to loop over the lines
in a file using the read loop in the second example. In this case, read returns the
whole file, and split chops the file into list elements, one for each line. For small
files (less than 1K) it doesn’t really matter. For large files (megabytes) you might
induce paging with this approach.
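The read-and-split approach described above can be sketched as follows. The scratch file name is hypothetical, and the -nonewline option is one way, assumed here, of handling the final newline.

```tcl
# Scratch file name is hypothetical; it is created in the current directory.
set path read_demo.txt
set out [open $path w]
puts $out "alpha\nbeta\ngamma"
close $out

set in [open $path r]
# -nonewline drops the final newline so split does not yield
# an empty last element.
set data [read -nonewline $in]
close $in

# split chops the data into one list element per line.
set count 0
foreach line [split $data \n] {
    incr count
    puts "$count: $line"
}
file delete $path
```

The whole file is held in memory as a single string, which is why this approach suits small and moderate-sized files.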
Closing I/O Channels
The close command is just as important as the others because it frees oper-
ating system resources associated with the I/O channel. If you forget to close a
channel, it will be closed when your process exits. However, if you have a long-
running program, like a Tk script, you might exhaust some operating system
resources if you forget to close your I/O channels.
The close command can raise an error.
If the channel was a process pipeline and any of the processes wrote to their
standard error channel, then Tcl treats this as an error. The error is raised
when the channel to the pipeline is finally closed. Similarly, if any of the pro-
cesses in the pipeline exit with a nonzero status, close raises an error.
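Because of this, it is worth wrapping close in catch when the channel is a pipeline. The sketch below assumes a UNIX-like system with sh available; the shell command is contrived so that it both writes to standard error and exits with a nonzero status.

```tcl
# Assumption: a UNIX-like system with sh available.
# The child writes to standard error and exits with status 3.
set pipe [open "|sh -c {echo oops 1>&2; exit 3}" r]
set output [read $pipe]

# The error surfaces only when the pipeline channel is closed.
if {[catch {close $pipe} err]} {
    puts "close reported: $err"
    set closeFailed 1
} else {
    set closeFailed 0
}
```

The exact wording of the message stored in err depends on what the child processes printed, so scripts should not rely on it.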
Environment Variables
Environment variables are a collection of string-valued variables associated with
each process. The process’s environment variables are available through the
global array env. The name of the environment variable is the index (e.g.,
env(PATH)), and the array element contains the current value of the environ-
ment variable. If assignments are made to env, they result in changes to the cor-
responding environment variable. Environment variables are inherited by child
processes, so programs run with the exec command inherit the environment of
the Tcl script. The following example prints the values of environment variables.
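A minimal sketch of such a listing is shown here; it is not necessarily the original example. The procedure name FormatEnv and the variable name GREETING are made up for illustration.

```tcl
# Return "NAME=value" lines for every environment variable, sorted by name.
# The proc name is illustrative.
proc FormatEnv {} {
    global env
    set lines {}
    foreach name [lsort [array names env]] {
        lappend lines "$name=$env($name)"
    }
    return $lines
}

# Assigning to env changes the process environment.
# GREETING is a made-up variable name.
set env(GREETING) hello
puts [join [FormatEnv] \n]
```

Any program started afterward with exec sees GREETING in its own environment.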
The keys are organized into a hierarchical naming system, so another way to
think of the value names is as an extra level in the hierarchy. The main point is
that you need to specify both a key name and a value name in order to get some-
thing out of the registry. The key names have one of the following formats:
\\hostname\rootname\keypath
rootname\keypath
rootname
The rootname is one of HKEY_LOCAL_MACHINE, HKEY_PERFORMANCE_DATA,
HKEY_USERS, HKEY_CLASSES_ROOT, HKEY_CURRENT_USER, HKEY_CURRENT_CONFIG, or
HKEY_DYN_DATA. Tables 9–8 and 9–9 summarize the registry command and
data types:
registry delete key ?valueName?
    Deletes the key and the named value, or it deletes all values under
    the key if valueName is not specified.
registry get key valueName
    Returns the value associated with valueName under key.
registry keys key ?pat?
    Returns the list of keys or value names under key that match pat,
    which is a string match pattern.
registry set key
    Creates key.
registry set key valueName data ?type?
    Creates valueName under key with value data of the given type.
    Types are listed in Table 9–9.
registry type key valueName
    Returns the type of valueName under key.
registry values key ?pat?
    Returns the names of the values stored under key that match pat,
    which is a string match pattern.
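As a hedged illustration of the need for both a key name and a value name, the sketch below lists the values under one key. It runs only on Windows; the key HKEY_CURRENT_USER\Environment is assumed to exist and was picked purely as an example, and the variable names are illustrative.

```tcl
# Runs only on Windows, where the registry package is available.
set valueNames {}
if {$tcl_platform(platform) == "windows"} {
    package require registry
    # A per-user key assumed to exist; used here only as an illustration.
    set key {HKEY_CURRENT_USER\Environment}
    foreach name [registry values $key] {
        lappend valueNames $name
        # Both the key name and the value name are needed to fetch data.
        puts "$name = [registry get $key $name] ([registry type $key $name])"
    }
}
```

On other platforms the block does nothing and valueNames stays empty.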