slang.1
John E. Davis <www.jedsoft.org> Sep 14, 2014
Preface
S-Lang is an interpreted language that was designed from the start to be easily embedded into a
program to provide it with a powerful extension language. Examples of programs that use S-Lang
as an extension language include the jed text editor and the slrn newsreader. Although S-Lang
does not exist as a separate application, it is distributed with a quite capable program called slsh
(slang-shell) that embeds the interpreter and allows one to execute S-Lang scripts, or simply
experiment with S-Lang at an interactive prompt. Many of the examples in this document are
S-Lang is also a programmer's library that permits a programmer to develop sophisticated
platform-independent software. In addition to providing the S-Lang interpreter, the library provides facilities
for screen management, keymaps, low-level terminal I/O, etc. However, this document is concerned
only with the extension language and does not address these other features of the S-Lang library.
For information about the other components of the library, the reader is referred to the S-Lang
a text editor (jed), which I wanted to endow with a macro language. It occurred to me that an
application-independent language that could be embedded into the editor would prove more useful
because I could envision embedding it into other programs. As a result, S-Lang was born.
S-Lang was originally a stack language that supported a postscript-like syntax. For that reason,
I named it S-Lang, where the S was supposed to emphasize its stack-based nature. About a year
later, I began to work on a preparser that would allow one unfamiliar with stack-based languages
to make use of a more traditional infix syntax. Currently, the syntax of the language resembles
C; nevertheless, some postscript-like features still remain, e.g., the `%' character is still used as a
comment delimiter.
Acknowledgements
Since I first released S-Lang, I have received a lot of feedback about the library and the language from
many people. This has given me the opportunity and pleasure to interact with a number of people
to make the library portable and easy to use. In particular, I would like to thank the following
individuals:
Luchesar Ionkov for his comments and criticisms of the syntax of the language. He was the person
who made me realize that the low-level byte-code engine should be totally type-independent. He also
improved the tokenizer and preparser and impressed upon me that the language needed a grammar.
Mark Olesen for his many patches to various aspects of the library and his support on AIX. He also
John Burnell for the OS/2 port of the video and keyboard routines. He also made valuable suggestions
Darrel Hankerson for cleaning up and unifying some of the code and the makefiles.
Dominik Wujastyk who was always willing to test new releases of the library.
Hunter Goatley, Andy Harper, Martin P.J. Zinser, and Jouk Jansen for their VMS support.
Dave Sims and Chin Huang for Windows 95 and Windows NT support, and Dino Sangoi for the
I am also grateful to the many other people who sent in bug reports, bug fixes, small enhancements,
suggestions, and so on. Without such community involvement, S-Lang would not be as well-tested
and stable as it is. Finally, I would like to thank my wife for her support and understanding while
Contents
1 Introduction
1.1 slsh: The S-Lang shell
1.7 Input/Output
2.2 Qualifiers
2.3 Strings
2.5 Arrays
2.6 Lists
2.8 Namespaces
3.1.1 Integers
3.1.4 Strings
3.1.5 Null_Type
3.1.6 Ref_Type
4 Identifiers
5 Variables
6 Operators
6.1 Unary Operators
7 Statements
7.1 Variable Declaration Statements
8 Functions
8.1 Declaring Functions
8.7 Qualifiers
8.8 Exit-Blocks
9 Namespaces
10 Arrays
10.1 Creating Arrays
11 Associative Arrays
13 Lists
14 Error Handling
14.1 Traditional Error Handling
16 Modules
16.1 Introduction
18 slsh
18.1 Running slsh
19 Debugging
19.1 Tracebacks
20 Profiling
20.1 Introduction
B Copyright
B.1 The GNU Public License
Introduction
S-Lang is a powerful interpreted language that may be embedded into an application to make
the application extensible. This enables the application to be used in ways not envisioned by the
programmer, thus providing the application with much more flexibility and power. Examples of
applications that take advantage of the interpreter in this way include the jed editor and the slrn
newsreader.
podcasts, digital pictures and video, CDs, and so forth. The use of slsh in such non-interactive
slsh also may be used interactively and has full access to all components of the S-Lang interpreter.
With features such as customizable command-line editing, history recall and completion, slsh is a
convenient environment for learning and using the language. In fact, as you are reading this manual,
it is recommended that you use slsh in its interactive mode as an aid to understanding the language.
While a standard S-Lang installation includes slsh, some binary distributions package slsh
separately from the S-Lang library, in which case it must be installed separately. For example, on
When called without arguments, slsh starts in interactive mode, issues a (customizable)
slsh> prompt, and waits for input. While most of the time one would enter S-Lang statements at
the prompt, slsh also accepts some other commands, most notably help:
slsh> help
Most commands must end in a semi-colon.
If a command begins with '!', then the command is passed to the shell.
Examples: !ls, !pwd, !cd foo, ...
Special commands:
help <help-topic>
apropos <something>
start_log( <optional-log-file> );
start logging input to a file (default is slsh.log)
stop_log();
stop logging input
save_input (<optional-file>);
save all previous input to a file (default: slsh.log)
quit;
Although the language normally requires variables to be declared before use, it is not necessary to
do so when using slsh interactively. For example, in this document you will see examples such as
variable x = [1:10];
variable y = sin (x^2);
At the slsh command line, the use of the variable keyword in such statements is optional:
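As a minimal sketch, the two statements shown above might be typed at the prompt without the variable keyword (the slsh> prompt itself is omitted here):

```slang
% Entered interactively in slsh; no prior declarations are assumed.
x = [1:10];
y = sin (x^2);
```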
As the above example suggests, one use of slsh is as a sophisticated calculator. For example,
In this example, the fits module was used to read data from a binary file called evt1a.fits, and
the histogram module was used to bin the data in the energy column into a histogram to create
a spectrum. The expression involving where filters the data by accepting only those energy values
whose status is set to 0. The fits and histogram modules are not distributed with S-Lang but
may be obtained separately; see http://www.jedsoft.org/slang/modules/ for links to them. For
more information about modules, see chapter 16 (Modules) in this document.
For more information about using slsh, see chapter 18 (slsh).
functions, structures, datatypes, and arrays. In addition, there is limited support for pointer types.
The concise array syntax rivals that of commercial array-based numerical computing environments.
1.3 Data Types and Operators
The language provides built-in support for string, integer (signed and unsigned long and short),
double precision floating point, and double precision complex numbers. In addition, it supports
user-defined structure types, multi-dimensional array types, lists, and associative arrays. To facilitate
the construction of sophisticated data structures such as linked lists and trees, the language also
includes a reference type. The reference type provides much of the same flexibility as pointers in
other languages. Finally, applications embedding the interpreter may also provide special
application-specific types, such as the Mark_Type that the jed editor provides.
The language provides standard arithmetic operations such as addition, subtraction, multiplication,
and division. It also provides support for modulo arithmetic as well as operations at the bit level,
e.g., exclusive-or. Any binary or unary operator may be extended to work with any data type,
including user-defined types. For example, the addition operator (+) has been extended to work
The binary and unary operators work transparently with array types. For example, if a and b are
arrays, then a + b produces an array whose elements are the result of element-by-element addition of
a and b. This permits one to do vector operations without explicitly looping over the array indices.
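The following short sketch illustrates the idea with two small integer arrays; the variable names are chosen only for illustration:

```slang
variable a = [1, 2, 3];
variable b = [10, 20, 30];
variable c = a + b;        % element-by-element sum: 11, 22, 33
variable d = 2 * a;        % a scalar is applied to every element: 2, 4, 6
```

No explicit loop over the indices is written in either case.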
The S-Lang language supports several types of looping constructs and conditional statements.
The looping constructs include while, do...while, for, forever, loop, foreach, and _for. The
values are similar to procedures in languages such as PASCAL. The local variables of a function
are always created on a stack allowing one to create recursive functions. Parameters to a function
are always passed by value and never by reference. However, the language supports a reference data
Unlike many interpreted languages, S-Lang allows functions to be dynamically loaded (function
autoloading). It also provides constructs specifically designed for error handling and recovery as
Functions and variables may be declared as private, belonging to a namespace associated with the
compilation unit that defines the function or variable. The ideas behind the namespace
implementation stem from the C language and should be quite familiar to anyone familiar with C.
The S-Lang language has a try/throw/catch/finally exception model whose semantics are similar
to those of other languages. Users may also extend the exception class hierarchy with user-defined
exceptions. The ERROR_BLOCK based exception model of S-Lang 1.x is still supported but deprecated.
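A small sketch of the try/throw/catch/finally form is given below. The function name safe_divide and the messages are hypothetical, and the example assumes the predefined MathError exception class:

```slang
define safe_divide (a, b)
{
   variable e, result = NULL;
   try (e)
     {
        if (b == 0)
          throw MathError, "division by zero requested";
        result = a / b;
     }
   catch MathError:
     {
        % e holds information about the exception that was caught
        vmessage ("caught: %s", e.message);
     }
   finally
     {
        % executed whether or not an exception was thrown
        message ("safe_divide finished");
     }
   return result;
}
```

With this sketch, a call such as safe_divide (1, 0) would be expected to return NULL after reporting the error, rather than aborting the caller.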
as strcat, strchop, and strcmp. The S-Lang library also provides mathematical functions such as
sin, cos, and tan; however, not all applications enable the use of these intrinsics. For example, to
conserve memory, the 16 bit version of the jed editor does not provide support for any mathematics
other than simple integer arithmetic, whereas other versions of the editor do support these functions.
Most applications embedding the language will also provide a set of application-specific intrinsic
functions. For example, the jed editor adds over 100 application-specific intrinsic functions to the
language. Consult your application-specific documentation to see what additional intrinsics are
supported.
Operating systems that support dynamic linking allow a slang interpreter to dynamically link
additional libraries of intrinsic functions and variables into the interpreter. Such loadable objects are
called modules. A separate chapter of this manual is devoted to this important feature.
1.7 Input/Output
The language supports C-like stdio input/output functions such as fopen, fgets, fputs, and fclose.
In addition, it provides two functions, message and error, for writing to the standard output device
and standard error. Specific applications may provide other I/O mechanisms, e.g., the jed editor
Users with generic questions about the interpreter are encouraged to post questions to the Usenet
newsgroup alt.lang.s-lang. More specific questions relating to the use of S-Lang within some
application may be better answered in an application-specific forum. For example, users with
questions about using S-Lang as embedded in the jed editor are more likely to be answered in the
comp.editors newsgroup or on the jed mailing list. Similarly, users with questions concerning slrn
will find news.software.readers to be a valuable source of information.
Developers who have embedded the interpreter are encouraged to join the S-Lang mailing list. To
.
Chapter 2
Overview of the Language
The purpose of this section is to give the reader a feel for the S-Lang language, its syntax, and its
capabilities. The information and examples presented in this section should be sufficient to provide
the reader with the necessary background to understand the rest of the document.
variable x, y, z;
declares three variables, x, y, and z. Note the semicolon at the end of the statement. All S-Lang
statements must end in a semicolon.
Unlike compiled languages such as C, it is not necessary to specify the data type of an S-Lang
variable. The data type of an S-Lang variable is determined upon assignment. For example, after
x = 3;
y = sin (5.6);
z = "I think, therefore I am.";
x will be an integer, y will be a double, and z will be a string. In fact, it is even possible to re-assign
x to a string:
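A one-line sketch of such a re-assignment, with an arbitrary string chosen for illustration:

```slang
x = "x was previously an integer";   % x now holds a string value
```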
Finally, one can combine variable declarations and assignments in the same statement:
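For instance, the declarations and assignments used above could be sketched as a single statement:

```slang
variable x = 3, y = sin (5.6), z = "I think, therefore I am.";
```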
Most functions are declared using the define keyword. A simple example is
which defines a function that simply computes the average of two numbers and returns the result.
This example shows that a function consists of three parts: the function name, a parameter list, and
The parameter list consists of a comma-separated list of variable names. It is not necessary to declare
variables within a parameter list; they are implicitly declared. However, all other local variables used
in the function must be declared. If the function takes no parameters, then the parameter list must
define go_left_5 ()
{
go_left (5);
}
The last example is a function that takes no arguments and returns no value. Some languages
such as PASCAL distinguish such objects from functions that return values by calling these objects
It is not necessary to declare a list of parameters when declaring a function in this way.
Perhaps the most famous example of a recursive function is the factorial function. Here is how to
This example also shows how to mix comments with code. S-Lang uses the `%' character to start a
comment and all characters from the comment character to the end of the line are ignored.
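A sketch of a recursive factorial function with comments is shown below. The forward declaration on the first line follows the style described above and carries no parameter list:

```slang
define factorial ();          % declare the name so it can be used recursively

define factorial (n)
{
   if (n < 2) return 1;       % base case ends the recursion
   return n * factorial (n - 1);
}
```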
2.2 Qualifiers
S-Lang 2.1 introduced support for function qualifiers as a mechanism for passing additional infor-
sys_set_color (color);
sys_set_linestyle (linestyle);
sys_plot (x,y);
}
Here the functions sys_set_linestyle, sys_set_color, and sys_plot are hypothetical low-level
functions that perform the actual work. This function may be called simply as
x = [0:10:0.1];
plot (x, sin(x));
to produce a solid black line connecting the points. Through the use of qualifiers, the color or
2.3 Strings
Perhaps the most appealing feature of any interpreted language is that it frees the user from the
responsibility of memory management. This is particularly evident when contrasting how S-Lang
handles string variables with a lower level language such as C. Consider a function that concatenates
This function uses the built-in strcat function for concatenating two or more strings. In C, the
exit (1);
strcpy (result, a);
strcat (result, b);
strcat (result, c);
return result;
}
Even this C example is misleading since none of the issues of memory management of the strings
has been dealt with. The S-Lang language hides all these issues from the user.
Binary operators have been defined to work with the string data type. In particular, the + operator
may be used to perform string concatenation. That is, one can use the + operator as an alternative
to strcat:
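As a sketch, a three-string concatenation function could be written with the + operator as follows; the function name concat_3_strings is used here only for illustration:

```slang
define concat_3_strings (a, b, c)
{
   return a + b + c;     % string concatenation via the + operator
}
```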
See the section on 3.1.4 (Strings) for more information about string variables.
pointer in other languages. References are commonly used as a mechanism to pass a function as an
s = 0;
for (i = 0; i < 10; i++)
{
s += (@funct)(i);
}
return s;
}
Here, the function compute_functional_sum applies the function specified by the parameter funct
to the first 10 integers and returns the sum. The two statements following the function definition
Another use of the reference operator is in the context of the fgets function. For example,
while (n > 0)
{
if (-1 == fgets (&line, fp))
return NULL;
n--;
}
return line;
}
uses the fgets function to read the nth line of a file. In particular, a reference to the local variable
line is passed to fgets, and upon return line will be set to the character string read by fgets.
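The same calling convention can be sketched in a shorter, stand-alone form that reads only the first line of a file. The file name notes.txt is hypothetical:

```slang
variable fp = fopen ("notes.txt", "r");   % hypothetical file name
variable line;
if (fp != NULL)
  {
     % fgets stores the text it reads in the variable referenced by &line
     if (-1 != fgets (&line, fp))
       message (line);
     () = fclose (fp);                    % discard the status returned by fclose
  }
```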
Finally, references may be used as an alternative to multiple return values by passing information
back via the parameter list. The example involving fgets presented above provided an illustration
which, after execution, results in X set to 1, Y set to 2, and Z set to 3. A C programmer will note
2.5 Arrays
The S-Lang language supports multi-dimensional arrays of all datatypes. For example, one can
define arrays of references to functions as well as arrays of arrays. Here are a few examples of
creating arrays:
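A sketch of three such statements, matching the description that follows (the particular values assigned to C are an assumption chosen for illustration):

```slang
variable A = Int_Type [10];        % 10 integers
variable B = Int_Type [10, 3];     % 2-d array: 10 rows, 3 columns
variable C = [1, 3, 5, 7, 9];      % 5 integers with explicit initial values
```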
The first example creates an array of 10 integers and assigns it to the variable A. The second example
creates a 2-d array of 30 integers arranged in 10 rows and 3 columns and assigns the result to B.
In the last example, an array of 5 integers is assigned to the variable C. However, in this case the
elements of the array are initialized to the values specified. This is known as an inline-array.
S-Lang also supports something called a range-array. An example of such an array is
variable C = [1:9:2];
This will produce an array of 5 integers running from 1 through 9 in increments of 2. Similarly,
[0:1:#1000] represents a 1000 element floating point array of numbers running from 0 to 1
(inclusive).
Arrays are passed by reference to functions and never by value. This permits one to write functions
There are more concise ways of accomplishing the result of the previous example. These include:
A = [7, 7, 7, 7, 7, 7, 7, 7, 7, 7];
A = Int_Type [10]; A[[0:9]] = 7;
A = Int_Type [10]; A[*] = 7;
The second and third methods use an array of indices to index the array A. In the second, the range
of indices has been explicitly specified, whereas the third example uses a wildcard form. See chapter
Although the examples have pertained to integer arrays, the fact is that S-Lang arrays can be of
A = Double_Type [10];
B = Complex_Type [10];
C = String_Type [10];
D = Ref_Type [10];
create 10 element arrays of double, complex, string, and reference types, respectively. The last
D[0] = &sin;
D[1] = &cos;
S-Lang arrays also can be of Any_Type. An array of such a type is capable of holding any object,
e.g.,
A = Any_Type [3];
A[0] = 1; A[1] = "string"; A[2] = (1 + 2i);
Dereferencing an Any_Type object returns the actual object. That is, @A[1] produces "string".
The language also defines unary, binary, and mathematical operations on arrays. For example, if A
and B are integer arrays, then A + B is an array whose elements are the sum of the elements of A
and B. A trivial example that illustrates the power of this capability is
variable X, Y;
X = [0:2*PI:0.01];
Y = 20 * sin (X);
n = (2 * PI) / 0.01 + 1;
X = (double *) malloc (n * sizeof (double));
Y = (double *) malloc (n * sizeof (double));
for (i = 0; i < n; i++)
{
X[i] = i * 0.01;
Y[i] = 20 * sin (X[i]);
}
2.6 Lists
An S-Lang list is like an array except that it may contain a heterogeneous collection of data, e.g.,
is a list of four objects, each with a different type. Like an array, the elements of a list may be
accessed via an index, e.g., x=my_list[2] will result in the assignment of "foo" to x. The most
important difference between an array and a list is that an array's size is fixed whereas a list may
grow or shrink. Algorithms that require such a data structure may execute many times faster when
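The following sketch builds a four-element list, indexes it, and then grows and shrinks it. The contents of my_list are an assumption chosen so that element 2 is "foo":

```slang
variable my_list = { 3, 2.9, "foo", &sin };
variable x = my_list[2];              % x is "foo"
list_append (my_list, [1, 2, 3]);     % the list grows to five elements
list_delete (my_list, 0);             % and shrinks back to four
vmessage ("%d elements remain", length (my_list));
```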
of an array must all be of the same type (or of Any_Type), whereas a structure is heterogeneous. As
an example, consider
In this example a structure consisting of the three fields has been created and assigned to the variable
person. Then an instance of this structure has been created using the dereference operator and
assigned to bill. Finally, the individual fields of bill were initialized. This is an example of an
anonymous structure.
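The steps just described can be sketched as follows. The field names are taken from the Person_Type definition in this section, and the field values are placeholders:

```slang
variable person = struct { first_name, last_name, age };
variable bill = @person;          % create an instance via the dereference operator
bill.first_name = "Bill";
bill.last_name = "Smith";         % placeholder value
bill.age = 51;                    % placeholder value
```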
Note: S-Lang versions 2.1 and higher permit assignment statements within the structure definition,
e.g.,
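A sketch of this form, again with placeholder field values:

```slang
variable bill = struct
{
   first_name = "Bill",
   last_name = "Smith",      % placeholder value
   age = 51                  % placeholder value
};
```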
A named structure is really a new data type and may be created using the typedef keyword:
typedef struct
{
first_name, last_name, age
}
Person_Type;
One advantage of creating a new type is that array elements of such types are automatically initialized
may be used to create an array of 100 such objects and initialize the first_name fields of the first
two elements. In contrast, the form using an anonymous structure would require a separate step to instantiate
Another big advantage of a user-defined type is that the binary and unary operators may be over-
Other common uses of structures include the creation of linked lists, binary trees, etc. For more information
about these and other features of structures, see the section on 12.3 (Linked Lists).
2.8 Namespaces
The language supports namespaces that may be used to control the scope and visibility of variables
and functions. In addition to the global or public namespace, each S-Lang source file or compilation
unit has a private or anonymous namespace associated with it. The private namespace may be used
to define symbols that are local to the compilation unit and inaccessible from the outside. The
language also allows the creation of named (non-anonymous or static) namespaces that permit
access via the namespace operator. See the chapter on 9 (Namespaces) for more information.
Chapter 3
Data Types and Literal Constants
The current implementation of the S-Lang language permits up to 65535 distinct data types,
including predefined data types such as integer and floating point, as well as specialized application-specific
data types. It is also possible to create new data types in the language using the typedef mechanism.
Literal constants are objects such as the integer 3 or the string "hello". The actual data type given
to a literal constant depends upon the syntax of the constant. The following sections describe the
defines special purpose data types such as Null_Type, DataType_Type, and Ref_Type. These types
3.1.1 Integers
The S-Lang language supports both signed and unsigned characters, short integer, long integer,
and long long integer types. On most 32 bit systems, there is no difference between an integer and
a long integer; however, they may differ on 16 and 64 bit systems. Generally speaking, on a 16 bit
system, plain integers are 16 bit quantities with a range of -32767 to 32767. On a 32 bit system,
• As a decimal (base 10) integer consisting of the characters 0 through 9, e.g., 127. An integer
specified this way cannot begin with a leading 0. That is, 0127 is not the same as 127.
• Using hexadecimal (base 16) notation consisting of the characters 0 to 9 and A through F. The
hexadecimal number must be preceded by the characters 0x. For example, 0x7F specifies an
integer using hexadecimal notation and has the same value as decimal 127.
• In octal notation using characters 0 through 7. The octal number must begin with a leading
• In binary notation using characters 0 and 1 with the 0b prefix. For example, 21 may be
Short, long, long long, and unsigned types may be specified by using the proper suffixes: L indicates
that the integer is a long integer, LL indicates a long long integer, h indicates that the integer is
a short integer, and U indicates that it is unsigned. For example, 1UL specifies an unsigned long
integer.
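The notations and suffixes described above can be summarized in a short sketch; the variable names are arbitrary:

```slang
variable a = 127;        % decimal
variable b = 0x7F;       % hexadecimal, same value as decimal 127
variable c = 0177;       % octal (leading 0), same value as decimal 127
variable d = 0b10101;    % binary, same value as decimal 21
variable e = 1UL;        % unsigned long
variable f = 10h;        % short
% typeof reports the data type selected by each literal's form
vmessage ("%S %S %S", typeof (a), typeof (e), typeof (f));
```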
Finally, a character literal may be specified using a notation containing a character enclosed in single
quotes as 'a'. The value of the character specified this way will lie in the range 0 to 255 and will
i = '0';
assigns to i the character 48 since the '0' character has an ASCII value of 48.
A wide character (unicode) may be specified using the form '\x{y...y}' where y...y are hexadecimal
digits. For example,
Any integer may be preceded by a minus sign to indicate that it is a negative integer.
(or both). Here are examples of specifying the same double precision floating point number:
Note that 12 is not a floating point number since it contains neither a decimal point nor an exponent.
In fact, 12 is an integer.
One may append the f character to the end of the number to indicate that the number is a single
The first number in the pair forms the real part, while the second number forms the imaginary part.
That is, a complex number may be regarded as the sum of a real number and an imaginary number.
Strictly speaking, the current implementation of S-Lang does not support generic complex
literals. However, it does support imaginary literals, permitting a more generic complex number
with a non-zero real part to be constructed from the imaginary literal via addition of a real number.
An imaginary literal is specified in the same way as a floating point literal except that i or j is
A more generic complex number may be constructed from an imaginary literal via addition, e.g.,
3.0 + 4.0i
produces a complex number whose real part is 3.0 and whose imaginary part is 4.0.
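As a brief sketch, the two parts of such a number can be recovered with the Real and Imag intrinsics:

```slang
variable z = 3.0 + 4.0i;
% Real (z) yields the real part and Imag (z) the imaginary part
vmessage ("real=%g imag=%g", Real (z), Imag (z));
```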
The intrinsic functions Real and Imag may be used to retrieve the real and imaginary parts of a
3.1.4 Strings
A string literal must be enclosed in double quotes as in:
"This is a string".
As described below, the string literal may contain a suffix that specifies how the string is to be
"$HOME/.jedrc"$
Although there is no imposed limit on the length of a string, single-line string literals must be
less than 256 characters in length. It is possible to construct strings longer than this by string
concatenation, e.g.,
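A sketch of building a longer string from two shorter literals with the + operator; the text of the strings is arbitrary:

```slang
variable str = "This is the first part of a long string, "
             + "and this is the second part.";
```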
S-Lang version 2.2 introduced support for multi-line string literals. Two basic variants are
supported. The first makes use of the backslash at the end of a line to indicate that the string is
"This is a \
multi-line string. \
Note the presence of the \
backslash character at the end \
of each of the lines."
The second form of multiline string is delimited by the backquote character (`) and does not require
backslashes:
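A sketch of a backquoted multi-line string, including a doubled backquote, might look like this:

```slang
variable s = `This is a
multi-line string. A literal backquote
must be doubled: `` and no backslashes
are needed at the line ends.`;
```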
Note that if a backquote is to appear in such a string, then it must be doubled, as illustrated in the
above example.
Any character except a newline (ASCII 10) or the null character (ASCII 0) may appear explicitly in
a string literal. However, these characters may be embedded implicitly using the mechanism described
below.
The backslash character is a special character and is used to include other special characters (such
In the above table, h represents one of the HEXADECIMAL characters from the set [0-9A-Fa-f] .
It is important to understand the distinction between the \x{h..h} and \u{h..h} forms. When
used in a string, the \u form always expands to the corresponding UTF-8 sequence regardless of
the UTF-8 mode. In contrast, when in non-UTF-8 mode, the \x form expands to a byte when given
two hex characters, or to the corresponding UTF-8 sequence when used with three or more hex
characters.
For example, to include the double quote character as part of the string, it must be preceded by a
"This is a \"quote\"."
Similarly, the next example illustrates how a newline character may be included:
`This is a "quote".`
`This is the first line
and this is the second.`
Suffixes
A string literal may contain a suffix that specifies how the string is to be interpreted. The suffix
R
Backslash substitution will not be performed on the string. This is the default when using
back-quoted strings.
Q
Backslash substitution will be performed on the string. This is the default when using strings
B
If this suffix is present, the string will be interpreted as a binary string (BString_Type).
$
Variable name substitution will be performed on the string.
Not all combinations of the above control characters are supported, nor do all of them make sense. For example,
a string with the suffix QR will cause a parse-error because Q and R have opposing meanings.
The Q and R suffixes
These suffixes turn on and off backslash expansion. Unless the R suffix
is present, all double-quoted string literals will have backslash substitution performed. By default,
Sometimes it is desirable to turn off backslash expansion for double-quoted strings. For example,
pathnames on an MSDOS or Windows system use the backslash character as a path separator. The
R suffix turns off backslash expansion, and as a result the following statements are equivalent:
file = "C:\\windows\\apps\\slrn.rc";
file = "C:\\windows\\apps\\slrn.rc"Q;
file = "C:\windows\apps\slrn.rc"R;
file = `C:\windows\apps\slrn.rc`; % slang-2.2 and above
The only exception is that a backslash character is not permitted as the last character of a string
The $ suffix
If the string contains the $ suffix, then variable name expansion will be performed
with variable name substitution to be performed on the names X and Y. Such strings may be used
Name expansion is carried out according to the following rules: If the string literal occurs in a
function, and the name corresponds to a variable local to the function, then the string representation
of the value of that variable will be substituted. Otherwise, if the name corresponds to a variable
that is local to the compilation unit (i.e., is declared as static or private), then its value's string
representation will be used. Otherwise, if the name corresponds to a variable that exists as a global
(public) then its value's string representation will be substituted. If the above searches fail and the
name exists in the environment, then the value of the corresponding environment variable will be
"${MYHOME}/foo: bar=${bar}"$
This is useful in cases when the name is followed immediately by other characters that may be
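The two forms of name expansion can be sketched together as follows. The variables X and Y and their values are chosen only for illustration:

```slang
variable X = 3, Y = "foo";
% $X uses the plain form; ${Y} uses braces so that the
% following characters are not taken as part of the name.
variable s = "X is $X and Y is ${Y}bar"$;
message (s);
```

Here the string representations of X and Y would be substituted into s before it is printed.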
3.1.5 Null_Type
Objects of type Null_Type can have only one value: NULL. About the only thing that you can do
with this data type is to assign it to variables and test for equality with other objects. Nevertheless,
Null_Type is an important and extremely useful data type. Its main use stems from the fact that
since it can be compared for equality with any other data type, it is ideal to represent the value of
an object which does not yet have a value, or has an illegal value.
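As a sketch of this use, consider a function that treats NULL arguments as zero, followed by four calls to it. The function name add_numbers is used here only for illustration:

```slang
define add_numbers (a, b)
{
   if (a == NULL) a = 0;     % a missing or NULL argument counts as 0
   if (b == NULL) b = 0;
   return a + b;
}
variable c = add_numbers (1, 2);
variable d = add_numbers (1, NULL);
variable e = add_numbers (1,);     % second parameter omitted
variable f = add_numbers (,);      % both parameters omitted
```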
It should be clear that after these statements have been executed, c will have a value of 3. It should
also be clear that d will have a value of 1 because NULL has been passed as the second parameter.
One feature of the language is that if a parameter has been omitted from a function call, the variable
associated with that parameter will be set to NULL. Hence, e and f will be set to 1 and 0, respectively.
The Null_Type data type also plays an important role in the context of structures.
3.1.6 Ref_Type
Objects of Ref_Type are created using the unary reference operator &. Such objects may be deref-
erenced using the dereference operator @. For example,
sin_ref = &sin;
y = (@sin_ref) (1.0);
creates a reference to the sin function and assigns it to sin_ref. The second statement uses the
The Ref_Type is useful for passing functions as arguments to other functions, or for returning
information from a function via its parameter list. The dereference operator may also be used to create
an instance of a structure. For these reasons, further discussion of this important type can be found
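The first two uses can be sketched as follows. The helper names apply_twice and increment are hypothetical; only the & and @ operators and the sqrt intrinsic come from the language itself.

```slang
% Hypothetical helper: applies the function referenced by f twice.
define apply_twice (f, x)
{
   return (@f) ((@f) (x));
}

% Hypothetical helper: returns information through its parameter
% by assigning to the variable that the reference points to.
define increment (xref)
{
   @xref = @xref + 1;
}

variable y = apply_twice (&sqrt, 16.0);   % sqrt (sqrt (16.0)) = 2.0
variable n = 5;
increment (&n);                           % n is now 6
```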
special syntax. For these reasons they are discussed in separate chapters.
For example, an integer is an object of type Integer_Type. The literals of DataType_Type include:
The built-in function typeof returns the data type of its argument, i.e., a DataType_Type. For
instance, typeof(7) returns Integer_Type and typeof(Integer_Type) returns DataType_Type.
One can use this function as in the following example:
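A minimal sketch of such a test; the variable x and the message text are illustrative assumptions.

```slang
variable x = 7;

if (typeof (x) == Integer_Type)
  message ("x is an integer");

% typeof applied to a DataType_Type literal yields DataType_Type
if (typeof (Integer_Type) == DataType_Type)
  message ("Integer_Type is a DataType_Type");
```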
The literals of DataType_Type have other uses as well. One of the most common uses of these literals
x = Complex_Type [100];
Char_Type objects. In particular, boolean FALSE is equivalent to Char_Type 0, and TRUE as any
non-zero Char_Type value. Since the exact value of TRUE is unspecified, it is unnecessary and even
variable x = 10, y;
y = typecast (x, Double_Type);
After execution of these statements, x will have the integer value 10 and y will have the double
precision floating point value 10.0. If the object to be converted is an array, the typecast function
will act upon all elements of the array. For example,
will create an array of 10 double precision values and assign it to y. One should also realize that it is
not always possible to perform a typecast. For example, any attempt to convert an Integer_Type
to a Null_Type will result in a run-time error. Typecasting works only when datatypes are similar.
Often the interpreter will perform implicit type conversions as necessary to complete calcula-
tions. For example, when multiplying an Integer_Type with a Double_Type, it will convert the
Integer_Type to a Double_Type for the purpose of the calculation. Thus, the example involving the
conversion of an array of integers to an array of doubles could have been performed by multiplication
by 1.0, i.e.,
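Both conversions can be sketched as follows, assuming x holds the ten integers 1 through 10. The variable names are illustrative.

```slang
variable x = [1:10];                     % array of 10 integers
variable y = typecast (x, Double_Type);  % explicit: acts on every element
variable z = x * 1.0;                    % implicit: integers promoted to doubles

% _typeof reports the element type of an array
if ((_typeof (y) == Double_Type) and (_typeof (z) == Double_Type))
  message ("both y and z are arrays of doubles");
```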
The string intrinsic function should be used whenever a string representation is needed. Using
the typecast function for this purpose will usually fail unless the object to be converted is similar
to a string, and most are not. Moreover, when typecasting an array to String_Type, the typecast
function acts on each element of the array to produce another array, whereas the string function
will produce a string.
One use of the string function is to print the value of an object. This use is illustrated in the following
simple example:
function was not used and the message function was passed an integer, a type-mismatch error would
have resulted.
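A minimal sketch of this use, assuming a function named print_integer (the name is an assumption made for illustration):

```slang
define print_integer (i)
{
   message (string (i));   % message requires a String_Type argument
}

print_integer (42);        % displays 42
```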
Chapter 4
Identifiers
The names given to variables, functions, and data types are called identifiers. There are some
restrictions upon the actual characters that make up an identifier. An identifier name must start
with an alphabetic character ([A-Za-z]), an underscore character, or a dollar sign. The rest of
the characters in the name can be any combination of letters, digits, dollar signs, or underscore
characters. However, all identifiers whose names begin with two underscore characters are reserved
for internal use by the interpreter and declarations of objects with such names should be avoided.
mary _3 _this_is_ok
a7e1 $44 _44$_Three
The following identifiers are reserved by the language for use as keywords:
Chapter 5
Variables
As many of the preceding examples have shown, a variable must be declared before it can be used,
otherwise an undefined name error will be generated. A variable is declared using the variable
keyword, e.g.,
variable x, y, z;
declares three variables, x, y, and z. This is an example of a variable declaration statement, and
Variables declared this way are untyped and inherit a type upon assignment. As such, type-checking
x = "This is a string";
x = 1.2;
x = 3;
x = 2i;
results in x being set successively to a string, a float, an integer, and to a complex number (0+2i).
Any attempt to use a variable before it has acquired a type will result in an uninitialized variable
error.
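The uninitialized variable error can be observed with a sketch such as the one below. Catching AnyError is a deliberately broad choice made for this illustration, and the message text is an assumption.

```slang
variable x;               % declared, but has no type yet

try
{
   message (string (x));  % x is used before any assignment
}
catch AnyError:
{
   message ("x was used before it acquired a type");
}

x = "now a string";       % x acquires String_Type
x = 3;                    % and may later become an Integer_Type
```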
are legal variable declarations. This also provides a convenient way of initializing a variable.
Variables are classified as either global or local. A variable declared inside a function is said to be
local and has no meaning outside the function. A variable is said to be global if it was declared
outside a function. Global variables are further classified as being public, static, or private,
according to the namespace where they were defined. See chapter 9 (Namespaces) for more
The following global variables are predefined by the language and live in the public namespace.
$0 $1 $2 $3 $4 $5 $6 $7 $8 $9
An intrinsic variable is another type of global variable. Such variables have a definite type which
cannot be altered. Variables of this type may also be defined to be read-only, or constant variables.
An example of an intrinsic variable is PI, which is a read-only double precision variable with a value
of approximately 3.14159265358979323846.
Chapter 6
Operators
S-Lang supports a variety of operators that are grouped into three classes: assignment operators,
An assignment operator is used to assign a value to a variable. They will be discussed more fully in
the context of the assignment statement in section 7.2 (Assignment Statements).
A unary operator acts only upon a single quantity while a binary operation is an operation between
two quantities. The boolean operator not is an example of a unary operator. Examples of binary
operators include the usual arithmetic operators +, -, *, and /. The operator given by - can be either
a unary operator (negation) or a binary operator (subtraction); the actual operation is determined
Binary operators are used in algebraic forms, e.g., a + b. Unary operators fall into one of two classes:
postfix-unary or prefix-unary. For example, in the expression -x, the minus sign is a prefix-unary
operator.
All binary and unary operators may be defined for any supported data type. For example, the
arithmetic plus operator has been extended to the String_Type data type to permit concatenation
between strings. But just because it is possible to define the action of an operator upon a data type,
it does not mean that all data types support all the binary and unary operators. For example, while
The boolean operator not acts only upon integers and produces 0 if its operand is non-zero, otherwise
it produces 1.
The bit-level not operator performs a similar function, except that it operates on the individual
The arithmetic negation operator - is perhaps the most well-known unary operator. It simply
The reference (&) and dereference (@) operators will be discussed in greater detail in the section
on 8.5 (Referencing Variables). Similarly, the increment (++) and decrement (--) operators will be
The data type of the result produced by the use of one of these operators depends upon the data
types of the binary participants. If they are both integers, the result will be an integer. However, if
the operands are not of the same type, they will be converted to a common type before the operation
is performed. For example, if one is a floating point type and the other is an integer, the integer
will be converted to a float. In general, the promotion from one type to another is such that no
information is lost, if possible. As an example, consider the expression 8/5 which indicates division
of the integer 8 by the integer 5. The result will be the integer 1 and not the floating point value
1.6. However, 8/5.0 will produce 1.6 because 5.0 is a floating point number.
greater than or equal, less than, less than or equal, equal, and not equal, respectively. For most data
types, the result of the comparison will be a boolean value; however, for arrays the result will be an
array of boolean values. The section on arrays will explain this in greater detail.
Note: For S-Lang versions 2.1 and higher, relational expressions such as a<b<=c are defined in the
mathematical sense, i.e.,
and so on. In previous versions of S-Lang, (a<b<c) meant (a<b)<c; however this interpretation
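A short sketch of the difference between the two interpretations, assuming S-Lang 2.1 or later and illustrative values for a, b, and c:

```slang
variable a = 1, b = 5, c = 3;

% Chained form: treated as (a < b) and (b < c), which is false here
variable chained = (a < b < c);

% Explicit grouping: (a < b) yields 1, and 1 < 3 is true
variable grouped = ((a < b) < c);
```

Writing the conjunction out explicitly, as in (a < b) and (b < c), gives the same result on every version and avoids any dependence on the interpreter release.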
TRUE) if either of their operands are non-zero, otherwise they produce zero (boolean FALSE). The
and and && operators produce a non-zero value if and only if both their operands are non-zero,
Unlike the operators && and ||, the and and or operators do not perform the so-called boolean
Here, if x were to have a value of zero, a division by zero error would occur because even though
x!=0 evaluates to zero, the and operator is not short-circuited and the 1/x expression would still be
evaluated. This problem can be avoided using the short-circuiting && operator:
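A minimal sketch of the short-circuiting form, assuming S-Lang 2.1 or later. The function name quotient_exceeds_10 is hypothetical, and 100/x is used in place of 1/x so that the integer test can succeed for some inputs.

```slang
% Hypothetical helper: true when 100/x exceeds 10 for a non-zero integer x.
define quotient_exceeds_10 (x)
{
   % With &&, the division is evaluated only when x != 0 holds
   if ((x != 0) && (100/x > 10))
     return 1;
   return 0;
}

variable r0 = quotient_exceeds_10 (0);   % no division takes place; r0 is 0
variable r5 = quotient_exceeds_10 (5);   % 100/5 = 20 exceeds 10; r5 is 1
```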
Another difference between the short-circuiting (&&,||) and the non-short-circuiting operators
(and,or) is that the short-circuiting forms work only with integer or boolean types. In contrast, if
either of the operands of the and or or operators is an array then a corresponding array of boolean
values will result. This is explained in more detail in the section on arrays.
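The array behavior of the and operator can be sketched as follows; the array contents are illustrative, and where is the standard intrinsic that returns the indices of the non-zero elements.

```slang
variable a = [1, 7, 3, 9];

% Each comparison yields an array; "and" combines them element by element
variable in_range = (a > 2) and (a < 8);   % elements: 0, 1, 1, 0
variable idx = where (in_range);           % indices 1 and 2
```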
Note: the short-circuiting operators && and || were first introduced in S-Lang 2.1; they are not
operations. Operators that fall in this class include &, |, shl, shr, and xor.
The & operator performs
a boolean AND operation between the corresponding bits of the operands. Similarly, the | operator
performs the boolean OR operation on the bits. The bit-shifting operators shl and shr shift the
bits of the first operand by the number given by the second operand to the left or right, respectively.
These operators are commonly used to manipulate variables whose individual bits have distinct
meanings. In particular, & is usually used to test bits, | can be used to set bits, and xor may be
As an example of using & to perform tests on bits, consider the following: The jed text editor
stores some of the information about a buffer in a bitmapped integer variable. The value of this
variable may be retrieved using the jed intrinsic function getbuf_info, which actually returns four
quantities: the buffer flags, the name of the buffer, directory name, and file name. For the purposes
of this section, only the buffer flags are of interest and can be retrieved via a function such as
define get_buffer_flags ()
{
variable flags;
(,,,flags) = getbuf_info ();
return flags;
}
The buffer flags object is a bitmapped quantity where the 0th bit indicates whether or not the buffer
has been modified, the first bit indicates whether or not autosave has been enabled for the buffer,
and so on. Consider for the moment the task of determining if the buffer has been modified. This
can be determined by looking at the zeroth bit: if it is 0 the buffer has not been modified, otherwise
define is_buffer_modified ()
{
variable flags = get_buffer_flags ();
return (flags & 1);
}
where the integer 1 has been used since it is represented as an object with all bits unset, except for
the zeroth one, which is set. (At this point, it should also be apparent that bits are numbered from
zero, thus an 8 bit integer consists of bits 0 to 7, where 0 is the least significant bit and 7 is the most
define is_autosave_on ()
{
variable flags = get_buffer_flags ();
return (flags & 2);
}
to determine whether or not autosave has been turned on for the buffer.
The shl operator may be used to form the integer with only the nth bit set. For example, 1 shl 6
produces an integer with all bits set to zero except the sixth bit, which is set to one. The following
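A minimal sketch of testing, setting, and toggling a single bit with these operators. The helper names test_nth_bit, set_nth_bit, and toggle_nth_bit are hypothetical and serve only as illustration.

```slang
% Hypothetical helpers built from the bit-level operators
define test_nth_bit (flags, n)   { return (flags & (1 shl n)) != 0; }
define set_nth_bit (flags, n)    { return flags | (1 shl n); }
define toggle_nth_bit (flags, n) { return flags xor (1 shl n); }

variable flags = 0;
flags = set_nth_bit (flags, 1);      % flags is now 2
flags = set_nth_bit (flags, 6);      % flags is now 66 (64 + 2)
flags = toggle_nth_bit (flags, 1);   % bit 1 is cleared again: flags is 64
```

The parentheses around each sub-expression are deliberate, so the sketch does not depend on the relative precedence of the bit-level and comparison operators.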
In fact, the current implementation of the S-Lang language will produce incorrect results if both
operands of a binary expression return multiple values. At most, only one of operands of a binary
expression can return multiple values, and that operand must be the rst one, not the second. For
example,
6.3. Mixing Integer and Floating Point Arithmetic 33
defines a function, read_line, that takes a single argument specifying a handle to an open file, and
returns one or two values, depending upon the return value of fgets. Now consider
Here the relational binary operator > forms a comparison between one of the return values (the one
at the top of the stack) and 0. In accordance with the above rule, since read_line returns multiple
values, it must occur as the left binary operand. Putting it on the right as in
violates the rule and will result in the wrong answer. For this reason, one should avoid using a
one of the operands is a floating point value, the other will be converted to a floating point value,
11 / 2 --> 5 (integer)
11 / 2.0 --> 5.5 (double)
11.0 / 2 --> 5.5 (double)
11.0 / 2.0 --> 5.5 (double)
Sometimes to achieve the desired result, it is necessary to explicitly convert from one data type to
another. For example, suppose that a and b are integers, and that one wants to compute a/b using
floating point arithmetic. In such a case, it is necessary to convert at least one of the operands to a
x = a/double(b);
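The effect of the explicit conversion can be seen in a short sketch; the values of a and b are illustrative.

```slang
variable a = 11, b = 2;

variable q_int = a / b;            % integer division: 5
variable q_dbl = a / double (b);   % floating point division: 5.5
variable q_alt = a / (b * 1.0);    % an equivalent implicit promotion: 5.5
```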
A similar syntax holds for the orelse operator. For example, consider the statement:
Here, if x were to have a value of zero, a division by zero error would occur because even though
x!=0 evaluates to zero, the and operator is not short circuited and the 1/x expression would be
evaluated causing division by zero. For this case, the andelse expression could be used to avoid the
problem:
if (andelse
{x != 0}
{1 / x > 10}) do_something ();
Chapter 7
Statements
Loosely speaking, a statement is composed of expressions that are grouped according to the syntax
or grammar of the language to express a complete computation. A semicolon is used to denote the
end of a statement.
A statement that occurs within a function is executed only during execution of the function. How-
ever, statements that occur outside the context of a function are evaluated immediately.
The language supports several different types of statements such as assignment statements, condi-
tional statements, and so forth. These are described in detail in the following sections.
variable variable-declaration-list ;
where the variable-declaration-list is a comma separated list of one or more variable names with
variable x, y = 2, z;
type consist of a left-hand side, an assignment operator, and a right-hand side. The left-hand side
must be something to which an assignment can be performed. Such an object is called an lvalue.
The most common assignment operator is the simple assignment operator =. Examples of its use
include
x = 3;
x = some_function (10);
x = 34 + 27/y + some_function (z);
x = x + 3;
In addition to the simple assignment operator, S-Lang also supports the binary assignment opera-
tors:
+= -= *= /= &= |=
a += b;
to
a = a + b;
It is extremely important to realize that, in general, a+b is not equal to b+a. For example if a and b
are strings, then a+b will be the string resulting from the concatenation of a and b, which generally
is not the same as the concatenation of b with a. This means that a+=b may not be the same as
a=b+a, as the following example illustrates:
a = "hello"; b = "world";
a += b; % a will become "helloworld"
c = b + a; % c will become "worldhelloworld"
Since adding or subtracting 1 from a variable is quite common, S-Lang also supports the unary
increment and decrement operators ++ and --, respectively. That is, for numeric data types,
x = x + 1;
x += 1;
x++;
x = x - 1;
x -= 1;
x--;
Strictly speaking, ++ and -- are unary operators. When used as x++, the ++ operator is said to be
a postfix-unary operator. However, when used as ++x it is said to be a prefix-unary operator. The
current implementation does not distinguish between the two forms, thus x++ and ++x are equivalent.
The reason for this equivalence is that assignment expressions do not return a value in the S-Lang
language as they do in C. Thus one should exercise care and not try to write C-like code such as
x = 10;
while (--x) do_something (x); % Ok in C, but not in S-Lang
7.3. Conditional and Looping Statements 37
x = 10;
while (x--, x) do_something (x); % Ok in S-Lang and in C
If integer-expression evaluates to a non-zero (boolean TRUE) result, then the statement or group
of statements implied by statement-or-block will get executed. Otherwise, control will proceed to
next-statement.
An example of the use of this type of conditional statement is
if (x != 0)
{
y = 1.0 / x;
if (x > 0) z = log (x);
}
This example illustrates two if statements where the second if statement is part of the block of
if-else
Here, if expression evaluates to a non-zero integer, statement-or-block-1 will get executed and control
will pass on to next-statement. However, if expression evaluates to zero, statement-or-block-2 will
get executed before continuing on to next-statement. A simple example of this form is
if (x > 0)
z = log (x);
else
throw DomainError, "x must be positive";
if (city == "Boston")
if (street == "Beacon") found = 1;
else if (city == "Madrid")
if (street == "Calle Mayor") found = 1;
else found = 0;
This example illustrates a problem that beginners have with if-else statements. Syntactically, this
example is equivalent to
if (city == "Boston")
{
if (street == "Beacon") found = 1;
else if (city == "Madrid")
{
if (street == "Calle Mayor") found = 1;
else found = 0;
}
}
although the indentation indicates otherwise. It is important to understand the grammar and not
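One way to obtain the behavior that the indentation suggests is to add braces so that each else pairs with the intended if. The sketch below wraps the fragment in a hypothetical function, check_address, so that it can be run as written; the initialization of found to 0 is an added assumption.

```slang
% Hypothetical wrapper around the city/street test
define check_address (city, street)
{
   variable found = 0;
   if (city == "Boston")
     {
        if (street == "Beacon") found = 1;
     }
   else if (city == "Madrid")
     {
        if (street == "Calle Mayor") found = 1;
     }
   else found = 0;
   return found;
}

variable hit = check_address ("Madrid", "Calle Mayor");   % hit is 1
```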
ifnot
if (integer-expression == 0) statement-or-block
or equivalently,
if (not(integer-expression )) statement-or-block
The ifnot statement was added to the language to simplify the handling of such statements. It
Note: The ifnot keyword was added in version 2.1 and is not supported by earlier versions. For
compatibility with older code, the !if keyword can be used, although its use is deprecated in favor
of ifnot.
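A minimal sketch of ifnot, assuming S-Lang 2.1 or later; the variable names are illustrative.

```slang
variable n_errors = 0;
variable ok = 0;

% Equivalent to: if (n_errors == 0) ok = 1;
ifnot (n_errors)
  ok = 1;
```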
orelse, andelse
As of S-Lang version 2.1, use of andelse and orelse has been deprecated in favor
of the && and || short-circuiting operators.
The syntax for the orelse statement is:
This causes each of the blocks to be executed in turn until one of them returns a non-zero integer
value. The result of this statement is the integer value returned by the last block executed. For
example,
orelse { 0 } { 6 } { 2 } { 3 }
returns 6 since the second block is the first to return a non-zero result. The last two blocks will not
get executed.
Each of the blocks will be executed in turn until one of them returns a zero value. The result of this
statement is the integer value returned by the last block executed. For example,
andelse { 6 } { 2 } { 0 } { 4 }
switch
The switch statement deviates from its C counterpart. The syntax is:
switch (x)
{ ... : ...}
.
.
{ ... : ...}
The `:' operator is a special symbol that in the context of the switch statement, causes the top item
on the stack to be tested, and if it is non-zero, the rest of the block will get executed and control
will pass out of the switch statement. Otherwise, the execution of the block will be terminated and
the process will be repeated for the next block. If a block contains no : operator, the entire block
is executed and control will pass onto the next statement following the switch statement. Such a
switch (x)
{ x == 1 : message("Number is one.");}
{ x == 2 : message("Number is two.");}
{ x == 3 : message("Number is three.");}
{ x == 4 : message("Number is four.");}
{ x == 5 : message("Number is five.");}
{ message ("Number is greater than five.");}
Suppose x has an integer value of 3. The first two blocks will terminate at the `:' character
because each of the comparisons with x will produce zero. However, the third block will execute to
A more familiar way to write the previous example is to make use of the case keyword:
switch (x)
{ case 1 : message("Number is one.");}
{ case 2 : message("Number is two.");}
{ case 3 : message("Number is three.");}
{ case 4 : message("Number is four.");}
{ case 5 : message("Number is five.");}
{ message ("Number is greater than five.");}
The case keyword is a more useful comparison operator because it can perform a comparison between
different data types while using == may result in a type-mismatch error. For example,
switch (x)
{ (x == 1) or (x == "one") : message("Number is one.");}
{ (x == 2) or (x == "two") : message("Number is two.");}
{ (x == 3) or (x == "three") : message("Number is three.");}
{ (x == 4) or (x == "four") : message("Number is four.");}
{ (x == 5) or (x == "five") : message("Number is five.");}
{ message ("Number is greater than five.");}
will fail because the == operation is not defined between strings and integers. The correct way to
switch (x)
{ case 1 or case "one" : message("Number is one.");}
{ case 2 or case "two" : message("Number is two.");}
{ case 3 or case "three" : message("Number is three.");}
{ case 4 or case "four" : message("Number is four.");}
{ case 5 or case "five" : message("Number is five.");}
{ message ("Number is greater than five.");}
while
i = 10;
while (i)
{
i--;
newline ();
}
i = -10;
while (i)
{
i--;
newline ();
}
would loop forever (or until i wraps from the most negative integer value to the most positive and
If you are a C programmer, do not let the syntax of the language seduce you into writing this
i = 10;
while (i--) newline ();
Keep in mind that expressions such as i-- do not return a value in S-Lang as they do in C. The
i = 10;
while (i, i--) newline ();
do...while
The main difference between this statement and the while statement is that the do...while form
performs the test involving integer-expression after each execution of statement-or-block rather than
before. This guarantees that statement-or-block will get executed at least once.
A simple example from the jed editor follows:
This will cause all lines in the buffer to get indented via the jed intrinsic function indent_line.
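Outside of jed, the same control flow can be sketched with a self-contained loop. The variable name is illustrative. Because the test comes after the body, the body runs once even though the condition is false from the start.

```slang
variable n = 0;

do
{
   n++;            % executed before the test is made
}
while (n < 0);     % false on the first test, so the loop ends with n = 1
```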
for
Perhaps the most complex looping statement is the for statement; nevertheless, it is a favorite of
In addition to statement-or-block, its specification requires three other expressions. When executed,
the for statement evaluates init-expression, then it tests integer-expression. If integer-expression
The reason that they are not fully equivalent involves what happens when statement-or-block con-
Despite the apparent complexity of the for statement, it is very easy to use. As an example, consider
s = 0;
for (i = 1; i <= 10; i++) s += i;
loop
The loop statement simply executes a block of code a fixed number of times. It follows the syntax
If the integer-expression evaluates to a positive integer, statement-or-block will get executed that
_for
Like loop, the _for statement simply executes a block of code a fixed number of times. Unlike the
loop statement, the _for loop is useful in situations where the loop index is needed. It obeys the
syntax
Each time through the loop, the loop-variable will take on the successive values dictated by the other
parameters. The first time through, the loop-variable will have the value of first-value. The second
time its value will be first-value + increment, and so on. The loop will terminate when the value
of the loop index exceeds last-value. The current implementation requires the control parameters
s = 0;
_for i (1, 10, 1)
s += i;
The execution speed of the _for loop is more than twice as fast as the more powerful for loop
forever
The forever statement is similar to the loop statement except that it loops forever, or until a break
or a return statement is executed. It obeys the syntax
n = 10;
forever
{
if (n == 0) break;
newline ();
n--;
}
foreach
The foreach statement is used to loop over one or more statements for every element of an object.
Most often the object will be a container object such as an array, structure, or associative array,
Here object can be an expression that evaluates to a value. Each time through the loop the variable
var will take on a value that depends upon the data type of the object being processed. For container
A simple example is
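The sketch below is a reconstruction in the spirit of the surrounding description. The array contents, the counter n, and the use of message are assumptions.

```slang
variable fruit, n = 0;

foreach fruit (["apple", "peach", "pear"])
{
   message (fruit);      % displays each element of the array in turn
   n++;
}
```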
This example shows that if the container object is an array, then successive elements of the array are
assigned to fruit prior to each execution cycle. If the container object is a string, then successive
What actually gets assigned to the variable may be controlled via the using form of the foreach
statement. This more complex type of foreach statement follows the syntax
The allowed values of control-list will depend upon the type of container object. For associative
arrays (Assoc_Type), control-list specifies whether keys, values, or both are used. For example,
results in the keys of the associative array a being successively assigned to k. Similarly,
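The three forms for associative arrays can be sketched as follows. The array a and its contents are illustrative assumptions.

```slang
variable a = Assoc_Type[Integer_Type];
a["apple"] = 3;
a["pear"] = 5;

variable k, v, total = 0;

foreach k (a) using ("keys")
  message (k);                       % each key; the order is unspecified

foreach v (a) using ("values")
  total += v;                        % total becomes 8

foreach k, v (a) using ("keys", "values")
  vmessage ("%s -> %d", k, v);       % key and value together
```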
Similarly, for linked-lists of structures, one may walk the list via code like
s = linked_list;
while (s != NULL)
{
.
.
s = s.next;
}
Consult the type-specific documentation for a discussion of the using control words, if any, appro-
statement causes control to return to the calling function while the break and continue statements
are used in the context of loop structures. Consider:
define fun ()
{
forever
{
s1;
s2;
..
if (condition_1) break;
if (condition_2) return;
if (condition_3) continue;
..
s3;
}
s4;
..
}
Here, a function fun has been defined that contains a forever loop consisting of statements s1,
s2,...,s3, and three if statements. As long as the expressions condition_1, condition_2, and
condition_3 evaluate to zero, the statements s1, s2,...,s3 will be repeatedly executed. However, if
condition_1 returns a non-zero value, the break statement will get executed, and control will pass
out of the forever loop to the statement immediately following the loop, which in this case is s4.
Similarly, if condition_2 returns a non-zero number, the return statement will cause control to
pass back to the caller of fun. Finally, the continue statement will cause control to pass back to
that comprise this clause get executed only when the loop has run to completion and was not
count = 0;
max_tries = 20;
while (count < max_tries)
{
if (try_something ())
break;
count++;
% Failed -- try again
}
if (count == 20)
throw RunTimeError, "try_something failed 20 times";
Here, the code makes 20 attempts to perform some task (via the try_something function) and if
not successful it will throw an exception. Compare the above to an equivalent form that makes use
max_tries = 20;
loop (max_tries)
{
if (try_something ())
break;
% Failed -- try again
}
then throw RunTimeError, "try_something failed 20 times";
Here, the then statement would get executed only if the loop statement has run to completion,
i.e., loops 20 times in this case. This only happens if the try_something function fails each time
through the loop. However, if the try_something function succeeds, then the break statement
will get executed causing the loop to abort prematurely, which would result in the then clause not
getting executed.
The use of such a construct can also simplify code such as:
if (some_condition)
{
foo_statements;
if (another_condition)
bar_statements;
else
fizzle_statements;
}
else fizzle_statements;
In this case the fizzle_statements are duplicated making the code ugly and less maintainable.
Ideally one would wrap the fizzle_statements in a separate function and call it twice. However,
this is not always possible or convenient. The duplication can be eliminated by using the then form
of the loop statement:
loop (some_condition != 0)
{
foo_statements;
if (another_condition)
{
bar_statements;
break;
}
}
then fizzle_statements;
Here, the expression some_condition != 0 is going to result in either 0 or 1, causing the code
to execute 0 or 1 loops. Since the fizzle_statements are contained in the then clause, they
will get executed only when the requested number of loops executes to completion. Executing 0
loops is regarded as successful completion of the loop statement. Hence, when some_condition
is 0, the fizzle_statements will get executed. The fizzle_statements will not get executed
only when the loop is prematurely terminated, and that will occur when both some_condition and
another_condition are non-zero.
Chapter 8
Functions
There are essentially two classes of functions that may be called from the interpreter: intrinsic
An intrinsic function is one that is implemented in C or some other compiled language and is callable
from the interpreter. Nearly all of the built-in functions are of this variety. At the moment the basic
interpreter provides nearly 300 intrinsic functions. Examples include the trigonometric functions
sin and cos, string functions such as strcat, etc. Dynamically loaded modules such as the png and
pcre modules add additional intrinsic functions.
The other type of function is written in S-Lang and is known simply as an S-Lang function. Such
a function may be thought of as a group of statements that work together to perform a computation.
is sufficient to declare a function named factorial. Unlike the variable keyword used for declaring
variables, the define keyword does not accept a list of names.
Usually, the above form is used only for recursive functions. In most cases, the function name is
followed by a parameter list and the body of the function:
The function-name is an identifier and must conform to the naming scheme for identifiers discussed
in chapter 4 (Identifiers). The parameter-list is a comma-separated list of variable names that
represent parameters passed to the function, and may be empty if no parameters are to be passed.
The variables in the parameter-list are implicitly declared, thus, there is no need to declare them
via a variable declaration statement. In fact any attempt to do so will result in a syntax error.
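A minimal sketch that combines a declaration with a full definition, using the factorial name mentioned in this chapter. The function body is an assumption written for illustration.

```slang
define factorial ();          % declaration only: permits the recursive call

define factorial (n)          % n is implicitly declared as a parameter
{
   if (n < 2) return 1;
   return n * factorial (n - 1);
}

variable f5 = factorial (5);  % 120
```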
The body of the function is enclosed in braces and consists of zero or more statements (statement-list).
While there are no imposed limits upon the number of statements that may occur within an S-Lang
function, it is considered poor programming practice if a function contains many statements.
This notion stems from the belief that a function should have a simple, well-defined purpose.
consider
Here a function add_10 has been defined, which when executed, adds 10 to its parameter. A variable
b has also been declared and initialized to zero before being passed to add_10. What will be the
value of b after the call to add_10? If S-Lang were a language that passed parameters by reference,
the value of b would be changed to 10. However, S-Lang always passes by value, which means that
b will retain its value during and after the function call.
S-Lang does provide a mechanism for simulating pass by reference via the reference operator. This
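The contrast between the two behaviors can be sketched as follows. The definition of add_10 is a reconstruction consistent with the description above; add_10_ref and the variable c are hypothetical additions.

```slang
define add_10 (a)
{
   a = a + 10;           % modifies only the local copy
}

define add_10_ref (aref)
{
   @aref = @aref + 10;   % modifies the variable that aref refers to
}

variable b = 0, c = 0;
add_10 (b);              % b is still 0
add_10_ref (&c);         % c is now 10
```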
If a function is called with a parameter in the parameter list omitted, the corresponding variable in
the function will be set to NULL. To make this clear, consider the function
This function must be called with two parameters. However, either of them may be omitted by calling
The first example calls the function using both parameters, but at least one of the parameters was
omitted in the other examples. If the parser recognizes that a parameter has been omitted by finding
a comma or right-parenthesis where a value is expected, it will substitute NULL for the missing value.
This means that the parser will convert the latter three statements in the above example to:
8.3. Returning Values 51
It is important to note that this mechanism is available only for function calls that specify more
is not equivalent to add_10(NULL). The reason for this is simple: the parser can only tell whether or
not NULL should be substituted by looking at the position of the comma character in the parameter
list, and only function calls that indicate more than one parameter will use a comma. A mechanism
for handling single parameter function calls is described later in this chapter.
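As a sketch of the NULL substitution rule, consider a two-parameter function that treats an omitted parameter as zero. The function name add_two_numbers and its body are assumptions chosen for illustration.

```slang
% Sketch: an omitted parameter arrives in the function as NULL.
define add_two_numbers (a, b)
{
   if (a == NULL) a = 0;
   if (b == NULL) b = 0;
   return a + b;
}

variable r1 = add_two_numbers (1, 2);   % both parameters given
variable r2 = add_two_numbers (1,);     % same as add_two_numbers (1, NULL)
variable r3 = add_two_numbers (,2);     % same as add_two_numbers (NULL, 2)
variable r4 = add_two_numbers (,);      % same as add_two_numbers (NULL, NULL)
```

With these definitions, r1 through r4 would be 3, 1, 2, and 0, respectively.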
return expression-list ;
where expression-list is a comma separated list of expressions. If a function does not return any
values, the expression list will be empty. A simple example of a function that can return multiple
define sum_and_diff (x, y)
{
   variable sum, diff;
   sum = x + y; diff = x - y;
   return sum, diff;
}
After the above line is executed, s will have a value of 17 and the value of d will be 7.
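A complete sketch of this example follows. The call with the arguments 12 and 5 is inferred from the stated results (17 and 7); the variable s_only is introduced here only for illustration.

```slang
define sum_and_diff (x, y)
{
   variable sum, diff;
   sum = x + y;  diff = x - y;
   return sum, diff;
}

variable s, d;
(s, d) = sum_and_diff (12, 5);      % s = 17, d = 7

% An omitted l-value discards the corresponding return value.
variable s_only;
(s_only, ) = sum_and_diff (9, 4);   % s_only = 13; the difference is dropped
```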
Here expression is an arbitrary expression that leaves n items on the stack, and var_k represents
an l-value object (permits assignment). The assignment statement removes those values and assigns
them to the specified variables. Usually, expression is a call to a function that returns multiple
produces results that are equivalent to the call to the sum_and_diff function. Another common use
(x,y) = (y,x);
(a[i], a[j], a[k]) = (a[j], a[k], a[i]);
If an l-value is omitted from the list, then the corresponding value will be removed from the stack.
For example,
assigns the sum of 9 and 4 to s and the difference (9-4) is removed from the stack. Similarly,
It is possible to create functions that return a variable number of values instead of a fixed number.
Although such functions are discouraged, it is easy to cope with them. Usually, the value at the
top of the stack will indicate the actual number of return values. For such functions, the multiple
assignment statement cannot directly be used. To see how such functions can be dealt with, consider
This function returns either one or two values, depending upon the return value of fgets. Such a
s = ();
.
.
}
In this example, the last value returned by read_line is assigned to status and then tested. If it
is non-zero, the second return value is assigned to s. In particular note the empty set of parentheses
in the assignment to s. This simply indicates that whatever is on the top of the stack when the
operators. Consider again the add_10 function presented in the previous section. This time it is
written as:
The expression &b creates a reference to the variable b and it is the reference that gets passed to
add_10. When the function add_10 is called, the value of the local variable a will be a reference
to the variable b. It is only by dereferencing this value that b can be accessed and changed. So,
the statement @a=@a+10 should be read as add 10 to the value of the object that a references and
The reader familiar with C will note the similarity between references in S-Lang and pointers in C.
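The reference-based version of add_10 described above can be sketched like this. The statement @a = @a + 10 and the expression &b are taken from the text; the surrounding lines are a minimal assumed context.

```slang
% Sketch: simulating pass by reference with & and @.
define add_10 (a)
{
   @a = @a + 10;    % add 10 to the object that a references
}

variable b = 0;
add_10 (&b);        % pass a reference to b, not its value
% b now has the value 10
```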
References are not limited to variables. A reference to a function may also be created and passed
to other functions. As a simple example from elementary calculus, consider the following function
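A sketch of such a function is given here. The names derivative, f, and x_squared come from the surrounding text; the forward-difference formula and the step size h are assumptions made for illustration.

```slang
% Sketch: passing a function by reference.
define derivative (f, x)
{
   variable h = 1e-6;                   % assumed step size
   return ((@f)(x+h) - (@f)(x)) / h;    % forward difference
}

define x_squared (x)
{
   return x^2;
}

variable dydx = derivative (&x_squared, 3);   % approximately 6
```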
When the derivative function is called, the local variable f will be a reference to the x_squared
function. The x_squared function is called with the specified parameters by dereferencing f with
such functions is the strcat function, which takes one or more string arguments and returns the
concatenated result. An example of a different sort is the strtrim function, which removes both leading
and trailing whitespace from a string. In this case, when called with one argument (the string to be
trimmed), the characters that are considered to be whitespace are those in the character-set that
have the whitespace property (space, tab, newline, ...). However, when called with two arguments,
the second argument may be used to specify the characters that are to be considered as whitespace.
The strtrim function exemplifies a class of variadic functions where the additional arguments are
used to pass optional information to the function. Another more flexible and powerful way of passing
optional information is through the use of qualifiers, which is the subject of the next section.
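The two intrinsic functions mentioned above may be exercised as in the following short sketch. The string values are arbitrary examples.

```slang
% strcat accepts one or more strings.
variable joined = strcat ("S", "-", "Lang");   % "S-Lang"

% strtrim with one argument trims whitespace.
variable t1 = strtrim ("  hello \n");          % "hello"

% strtrim with two arguments: the second one lists the characters to trim.
variable t2 = strtrim ("--hello--", "-");      % "hello"
```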
When a S-Lang function is called with parameters, those parameters are placed on the run-time
stack. The function accesses those parameters by removing them from the stack and assigning them
to the variables in its parameter list. The details of this operation are for the most part hidden
from the programmer. But what happens when the number of parameters in the parameter list is
not equal to the number of parameters passed to the function? If the number passed to the function
is less than what the function expects, a StackUnderflow error could result as the function tries
to remove items from the stack. If the number passed is greater than the number in the parameter
list, then the extras will remain on the stack. The latter feature makes it possible to write functions
define add_10 ()
{
   variable x;
   x = ();
   return x + 10;
}
variable s = add_10 (12); % ==> s = 22;
For the uninitiated, this example looks as if it is destined for disaster. The add_10 function appears
to accept zero arguments, yet it was called with a single argument. On top of that, the assignment
to x might look a bit strange. The truth is, the code presented in this example makes perfect sense,
First, consider what happens when add_10 is called with the parameter 12. Internally, 12 is pushed
onto the stack and then the function called. Now, consider the function add_10 itself. In it, x is a
local variable. The strange looking assignment `x=()' causes whatever is on the top of the stack to
be assigned to x. In other words, after this statement, the value of x will be 12, since 12 is at the
define function_name ()
{
   variable x, y, ..., z;
   z = ();
   .
   .
   y = ();
   x = ();
   .
   .
}
before further parsing. (The add_10 function, as defined above, is already in this form.) With this
knowledge in hand, one can write a function that accepts a variable number of arguments. Consider
the function:
define average_n (n)
{
   variable x, y;
   variable s;
   if (n == 1)
     {
        x = ();
        s = x;
     }
   else if (n == 2)
     {
        y = ();
        x = ();
        s = x + y;
     }
   else throw NotImplementedError;
   return s / n;
}
variable ave1 = average_n (3.0, 1); % ==> 3.0
variable ave2 = average_n (3.0, 5.0, 2); % ==> 4.0
Here, the last argument passed to average_n is an integer reflecting the number of quantities to be
averaged. Although this example works fine, its principal limitation is obvious: it only supports one
or two values. Extending it to three or more values by adding more else if constructs is rather
straightforward but hardly worth the effort. There must be a better way, and there is:
define average_n (n)
{
   variable x, s = 0;
   loop (n)
     {
        x = ();    % get next value from stack
        s += x;
     }
   return s / n;
}
The principal limitation of this approach is that one must still pass an integer that specifies how
many values are to be averaged. Fortunately, a special variable exists that is local to every function
and contains the number of values that were passed to the function. That variable has the name
define average_n ()
{
   variable x, s = 0;
   if (_NARGS == 0)
     usage ("ave = average_n (x, ...);");
   loop (_NARGS)
     {
        x = ();
        s += x;
     }
   return s / _NARGS;
}
Here, if no arguments are passed to the function, the usage function will generate a UsageError
exception along with a simple message indicating how to use the function.
8.7 Qualifiers
One way to pass optional information to a function is to do so using the variable arguments mechanism
described in the previous section. However, a much more powerful mechanism is through the
plot(x,y);
Suppose that when called in the above manner, the application will plot the data as black points.
But instead of black points, one might want to plot the data using a red diamond as the plot symbol.
It would be silly to have a separate function such as plot_red_diamond for this purpose. A much
Here, a single semicolon is used to separate the argument-list proper (x,y) from the list of qualifiers.
In this case, the qualifiers are color and symbol. The order of the qualifiers is unimportant; the
function could just as well have been called with the symbol qualifier listed first.
Note that the qualifiers are not handled in the parameter list; rather they are handled in the function
body using the qualifier function, which is used to obtain the value of the qualifier. The second
argument to the qualifier function specifies the default value to be used if the function was not
called with the specified qualifier. Also note that the variable associated with the qualifier need not
A qualifier need not have a value; its mere presence may be used to enable or disable a feature or
specifies a qualifier called connect_points that indicates that a line should be drawn between the
data points. The presence of such a qualifier can be detected using the qualifier_exists function:
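The following sketch ties the qualifier and qualifier_exists functions together. The plot function here is a hypothetical stand-in that returns a descriptive string instead of drawing anything; the qualifier names color, symbol, and connect_points come from the text, while the default values are assumptions.

```slang
% Hypothetical stand-in for a plot routine: it only reports
% how the data would be drawn.
define plot (x, y)
{
   variable color = qualifier ("color", "black");    % default: black
   variable symbol = qualifier ("symbol", "point");  % default: point
   variable style = color + " " + symbol;
   if (qualifier_exists ("connect_points"))
     style += " with lines";
   return style;
}

variable xs = [1:3];
variable ys = xs^2;
variable p1 = plot (xs, ys);                                 % defaults
variable p2 = plot (xs, ys; color="red", symbol="diamond");
variable p3 = plot (xs, ys; symbol="diamond", color="red");  % order is irrelevant
variable p4 = plot (xs, ys; connect_points);                 % value-less qualifier
```

Here p1 would be "black point", p2 and p3 would both be "red diamond", and p4 would be "black point with lines".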
Sometimes it is useful for a function to pass the qualifiers that it has received to other functions.
Suppose that the plot function calls draw_symbol to plot the specified symbol at a particular
location and that it requires the symbol attributes to be specified using qualifiers. Then the plot
draw_symbol (x[i],y[i]
;color=color, size=symbol_size, symbol=symbol);
.
.
}
The problem with this approach is that it does not scale well: the plot function has to be aware
of all the qualifiers that the draw_symbol function takes and explicitly pass them. In many cases
this can be quite cumbersome and error prone. Rather it is better to simply pass the qualifiers that
were passed to the plot function on to the draw_symbol function. This may be achieved using the
__qualifiers function. The __qualifiers function returns the list of qualifiers in the form of a
structure whose field names are the same as the qualifier names. In fact, the use of this function
can simplify the implementation of the plot function, which may be coded more simply as
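A minimal sketch of this pass-through technique is shown here. Both functions are hypothetical stand-ins: draw_symbol merely returns a string, and plot forwards only the first data point. The fill qualifier and its default of 0 are assumptions for illustration.

```slang
% Stub standing in for the real drawing routine.
define draw_symbol (x, y)
{
   variable color = qualifier ("color", "black");
   variable fill = qualifier ("fill", 0);
   return sprintf ("%s fill=%d", color, fill);
}

define plot (x, y)
{
   % Two semicolons: the received qualifiers are passed on as a structure.
   return draw_symbol (x[0], y[0];; __qualifiers ());
}

variable r = plot ([1], [2]; color="red", fill=1);   % "red fill=1"
```

The plot function never mentions color or fill, yet both reach draw_symbol.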
Note the syntax is slightly different. The two semicolons indicate that the qualifiers are specified
not as name-value pairs, but as a structure. Using a single semicolon would have created a qualifier
As alluded to above, an added benefit of this approach is that the plot function does not need to
know nor care about the qualifiers supported by draw_symbol. When called as
the fill qualifier would get passed to the draw_symbol function to specify the fill value to be used
8.8 Exit-Blocks
An exit-block is a set of statements that get executed when a function returns. They are very useful
for cleaning up when a function returns via an explicit call to return from deep within a function.
EXIT_BLOCK { statement-list }
where statement-list represents the list of statements that comprise the exit-block. The following
define simple_demo ()
{
   variable n = 0;
   forever
     {
        if (n == 10) return;
        n++;
     }
}
Here, the function contains an exit-block and a forever loop. The loop will terminate via the
return statement when n is 10. Before it returns, the exit-block will get executed.
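A runnable sketch of a function with an exit-block and a forever loop follows. The global counter Exit_Count is a hypothetical device used here only to make the execution of the exit-block observable.

```slang
variable Exit_Count = 0;    % hypothetical counter, for illustration only

define simple_demo ()
{
   variable n = 0;

   EXIT_BLOCK { Exit_Count++; }   % runs when the function returns

   forever
     {
        if (n == 10) return;
        n++;
     }
}

simple_demo ();
% Exit_Count is now 1
```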
A function can contain multiple exit-blocks, but only the last one encountered during execution will
   if (n != 1)
     {
        EXIT_BLOCK { return 2; }
     }
   return;
}
If 1 is passed to this function, the first exit-block will get executed because the second one would
not have been encountered during the execution. However, if some other value is passed, the second
exit-block would get executed. This example also illustrates that it is possible to explicitly return
To elaborate on this point further, consider the fputs function, which writes a string to a file descriptor.
This function can fail when, e.g., a disk is full, or the file is located on a network share and
S-Lang supports two mechanisms that a function may use to report a failure: raising an exception, or
returning a status code. The latter mechanism is used by the S-Lang fputs function, i.e., it returns
a value to indicate whether or not it was successful. Many users familiar with this function either
seem to forget this fact, or assume that the function will succeed and not bother handling the return
value. While some languages silently remove such values from the stack, S-Lang regards the stack
as a dynamic data structure that programs can utilize. As a result, the value will be left on the
There are a number of correct ways of doing something with the return value from a function. Of
course the recommended procedure is to use the return value as it was meant to be used. In the
case of fputs, the proper thing to do is to check the return value, e.g.,
Other acceptable ways to do something with the return value include assigning it to a dummy
variable,
The last form is a special case of the multiple assignment statement , which was discussed earlier.
Since this form is simpler than assigning the value to a dummy variable or explicitly calling the pop
function, it is recommended over the other two mechanisms. Finally, this form has the redeeming
feature that it presents a visual reminder that the function is returning a value that is not being
used.
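The alternatives discussed in this section can be sketched side by side. The strings and the use of stdout are arbitrary choices; the sketch assumes that fputs returns -1 upon failure.

```slang
% Sketch: ways of dealing with the value returned by fputs.
variable fp = stdout;

if (-1 == fputs ("checked\n", fp))        % recommended: test the status
  throw WriteError, "fputs failed";

variable dummy = fputs ("dummy\n", fp);   % assign to a dummy variable

fputs ("popped\n", fp); pop ();           % explicitly pop the value

() = fputs ("discarded\n", fp);           % empty multiple-assignment form
```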
Chapter 9
Namespaces
By default, all global variables and functions are defined in the global or public namespace. In
addition to the global namespace, every compilation unit (e.g., a file containing S-Lang code) has
a private, or anonymous namespace. The private namespace is used when one wants to restrict
the usage of one or more functions or variables to the compilation unit that defines them without
Objects are declared as belonging to the private namespace using the private declaration keyword.
Similarly if a variable is declared using the public qualifier, it will be placed in the public namespace.
For example,
private variable i;
public variable j;
defines a variable called i in the private namespace and one called j in the public namespace.
The implements function may be used to create a new namespace of a specified name and have it
associated with the compilation unit. Objects may be placed into this namespace using the
static variable X;
static define foo () {...}
For this reason, such a namespace will be called the static namespace associated with the compilation
unit. Such objects may be accessed from outside the local compilation unit using the namespace
Since it is possible for three namespaces (private, static, public) to be associated with a compilation
unit, it is important to understand how names are resolved by the parser. During the compilation
stage, symbols are looked up according to the current scope. If in a function, the local variables of
the function are searched first. Then the search proceeds with symbols in the private namespace,
followed by those in the static namespace associated with the compilation unit (if any), and finally
with the public namespace. If after searching the public namespace the symbol has not been resolved,
In addition to using the implements function, there are other ways to associate a namespace with
a compilation unit. One is via the optional namespace argument of the evalfile function. For
example,
will cause foo.sl to be loaded and associated with a namespace called bar. Then any static symbols
then any symbols in that unit declared without a namespace qualifier will be placed in the static
namespace. Otherwise such symbols will be placed in the public namespace, and any symbols
% foo.sl
variable X = 1;
static variable Y;
private variable Z;
public define set_Y (y) { Y = y; }
static define set_z (z) { Z = z; }
() = evalfile ("foo.sl");
then no static namespace will be associated with it. As a result, X will be placed in the public
namespace since it was declared with no namespace qualifier. Also Y and set_z will be placed in
the private namespace since no static namespace has been associated with the file. In this scenario
there will be no way to get at the Z variable from outside of foo.sl since both it and the function
On the other hand, suppose that the file is loaded using a namespace argument:
In this case X, Y, and set_z will be placed in the foo namespace. These objects may be accessed
foo->set_z (3.0);
if (foo->X == 2) foo->Y = 1;
Because a file may be loaded with or without a namespace attached to it, it is a good idea to avoid
using the static qualifier. To see this, consider again the above example but this time without the
% foo.sl
variable X = 1;
variable Y;
private variable Z;
public define set_Y (y) { Y = y; }
define set_z (z) { Z = z; }
When loaded without a namespace argument, the variable Z will remain in the private namespace,
Arrays
An array is a container object that can contain many values of one data type. Arrays are very useful
objects and are indispensable for certain types of programming. The purpose of this chapter is to
describe how arrays are defined and used in the S-Lang language.
Here dim0 , dim1 , ... dimN specify the size of the individual dimensions of the array. The current
implementation permits arrays to contain as many as 7 dimensions. When a numeric array is created,
all its elements are initialized to zero. The initialization of other array types depends upon the data
type, e.g., the elements in String_Type and Struct_Type arrays are initialized to NULL.
As a concrete example, consider
a = Integer_Type [10];
creates a 30 element array of double precision numbers arranged in 10 rows and 3 columns, and
assigns it to b.
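The creation forms discussed above can be tried with a short sketch. The variable names are arbitrary; the second statement corresponds to the 10 row by 3 column array of doubles described in the text.

```slang
variable a = Integer_Type [10];      % 1-d array of 10 integers, all zero
variable b = Double_Type [10, 3];    % 10 rows by 3 columns of doubles
variable s = String_Type [2];        % elements are initialized to NULL

variable na = length (a);            % 10
variable nb = length (b);            % 30
```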
an array of ten integers whose elements run from 1 through 10, one may simply use:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
Similarly,
b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0];
An even more compact way of specifying a numeric array is to use a range-array . For example,
a = [0:9];
specifies an array of 10 integers whose elements range from 0 through 9. The syntax for the most
where the increment is optional and defaults to 1. This creates an array whose first element is
first-value and whose successive values differ by increment. last-value sets an upper limit upon the
If the range array [a:b:c] is integer valued, then the interval specified by a and b is closed. That
is, the kth element of the array x_k is given by x_k=a+kc and satisfies a<=x_k<=b. Hence, the
The situation is somewhat more complicated for floating point range arrays. The interval specified
by a floating point range array [a:b:c] is semi-open such that b is not contained in the interval. In
particular, the kth element of [a:b:c] is given by x_k=a+kc such that a<=x_k<b when c>=0,
and b<x_k<=a otherwise. The number of elements in the array is one greater than the largest k
In contrast, a range-array expressed in the form [a:b:#n] represents an array of exactly n elements
running from a to b inclusive. It is equivalent to a+[0:n-1]*(b-a)/(n-1).
Here are a few examples that illustrate the above comments:
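The rules stated above for closed integer ranges, semi-open floating point ranges, and the [a:b:#n] form lead to results such as those in the following sketch. The particular ranges are chosen for illustration.

```slang
variable r1 = [1:5:1];         % integers 1,2,3,4,5 (closed interval)
variable r2 = [1.0:5.0:1.0];   % 1.0,2.0,3.0,4.0 (5.0 is excluded)
variable r3 = [5:1:-1];        % 5,4,3,2,1
variable r4 = [1:1];           % a single element: 1
variable r5 = [1.0:1.0];       % empty, since the interval is semi-open
variable r6 = [0:1:#5];        % exactly 5 elements running from 0 to 1
```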
Currently Int_Type is the only integer type supported by range arrays; arbitrary integer types
will be supported in a future version. This means that [1h:5h] will not produce an array of
Short_Type, rather it will produce an Int_Type array. However, [1h,2h,3h,4h,5h] will produce
Array_Type. The actual syntax for this operation resembles a function call
where data-type is of type DataType_Type and integer-array is a 1-d array of integers that specify
will create a 10 by 20 array of doubles and assign it to a. This method of creating arrays derives
its power from the fact that it is more flexible than the methods discussed in this section. It is
particularly useful for creating arrays during run-time in situations where the data-type can vary.
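A sketch of this form of array creation follows. The first statement matches the 10 by 20 array of doubles mentioned above; the variable elem_type is a hypothetical stand-in for a type that is only known at run-time.

```slang
variable a = @Array_Type (Double_Type, [10, 20]);   % 10 by 20 array of doubles

% The element type may be held in a variable and chosen at run-time.
variable elem_type = Int_Type;
variable b = @Array_Type (elem_type, [5]);           % 1-d array of 5 integers
```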
a 1-d 10 element array may be reshaped into a 2-d array consisting of 5 rows and 2 columns. The
only restriction on the operation is that the arrays must be commensurate. The reshape function
where array-name specifies the array to be reshaped to the dimensions given by integer-array,
a 1-dimensional array of integers. It is important to note that this does not create a new array, it
turns a into a 10 by 10 array, as well as any other variables attached to the array.
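The in-place nature of reshape can be seen in a short sketch. The second variable b is introduced here only to show that other variables attached to the same array see the new shape.

```slang
variable a = Double_Type [100];
variable b = a;                      % b refers to the same array as a
reshape (a, [10, 10]);               % changes the array in place
variable dims = array_shape (b);     % [10, 10]: b sees the new shape as well
```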
The _reshape function works like reshape except that it creates a new array instead of changing
zeroth element of the one dimensional array a, and b[3,2] specifies the element in the third row
and second column of the two dimensional array b. As in C, array indices are numbered from 0.
Thus if a is a one-dimensional array of ten integers, the last element of the array is given by a[9].
Using a[10] would result in an IndexError exception.
A negative index may be used to index from the end of the array, with a[-1] referring to the last
element of a. Similarly, a[-2] refers to the next to the last element, and so on.
One may use the indexed value like any other variable. For example, to set the third element of an
a[2] = 6;
y = a[2] + 7;
Unlike other S-Lang variables which inherit a type upon assignment, array elements already have a
type and any attempt to assign a value with an incompatible type will result in a TypeMismatchError
exception. For example, it is illegal to assign a string value to an integer array.
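The indexing rules and the two exceptions named above can be observed with try and catch, as in this sketch. The variable msg is a hypothetical device for recording which exceptions were caught.

```slang
variable a = Integer_Type [10];
a[9] = 3;
variable last = a[-1];          % 3: a[-1] refers to the last element

variable msg = "";
try
{
   a[10] = 1;                   % index out of range
}
catch IndexError: { msg = "index"; }

try
{
   a[0] = "string";             % incompatible type for an integer array
}
catch TypeMismatchError: { msg += "+type"; }
% msg is now "index+type"
```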
One may use any integer expression to index an array. A simple example that computes the sum of
variable i, s;
s = 0;
for (i = 0; i < 10; i++) s += a[i];
(In practice, do not carry out sums this way; use the sum function instead, which is much simpler
i = [6:8];
b = a[i];
Here, i is a 1-dimensional range array of three integers with i[0] equal to 6, i[1] equal to 7, and
i[2] equal to 8. The statement b = a[i]; will create a 1-d array of three doubles and assign it to
b. The zeroth element of b, b[0] will be set to the sixth element of a, or a[6], and so on. In fact,
these two simple statements are equivalent to
b = Double_Type [3];
b[0] = a[6];
b[1] = a[7];
b[2] = a[8];
except that using an array of indices is not only much more convenient, but executes much faster.
More generally, one may use an index array to specify which elements are to participate in a calcu-
a = Double_Type [1000];
i = [0:499];
j = [500:999];
a[i] = -1.0;
a[j] = 1.0;
This creates an array of 1000 doubles and sets the first 500 elements to -1.0 and the last 500 to
1.0. Actually, one may do away with the i and j variables altogether and use
a = Double_Type [1000];
a[[0:499]] = -1.0;
a[[500:999]] = 1.0;
It is important to note that the syntax requires the use of the double square brackets, and in
particular that a[[0:499]] is not the same as a[0:499]. In fact, the latter will generate a syntax
error.
Index-arrays are not constrained to be one-dimensional arrays. Suppose that I represents a multidimensional
index array, and that A is the array to be indexed. Then what does A[I] represent? Its
value will be an array of the same type as A, but with the dimensionality of I. For example,
a = 1.0*[1:10];
i = _reshape ([4,5,6,7,8,9], [2,3]);
defines a to be a 10 element array of doubles, and i to be a 2x3 array of integers. Then a[i] will be
Often, it is convenient to use a rubber range to specify indices. For example, a[[500:]] specifies
all elements of a whose index is greater than or equal to 500. Similarly, a[[:499]] specifies the
first 500 elements of a. Finally, a[[:]] specifies all the elements of a. The latter form may also be
written as a[*].
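Multidimensional index arrays and rubber ranges can be combined in a small sketch. The array a and the index array i are the ones defined above; the remaining variable names are arbitrary.

```slang
variable a = 1.0*[1:10];                        % 1.0, 2.0, ..., 10.0
variable i = _reshape ([4,5,6,7,8,9], [2,3]);   % 2x3 array of indices
variable b = a[i];                              % 2x3 array of doubles

variable head = a[[:4]];     % the first five elements
variable tail = a[[5:]];     % elements whose index is >= 5
variable whole = a[*];       % all elements
```

Here b[0,0] is a[4] and b[1,2] is a[9].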
One should be careful when using index arrays with negative elements. As pointed out above, a
negative index is used to index from the end of the array. That is, a[-1] refers to the last element
In version 1 of the interpreter, when used in an array indexing context, a construct such as [0:-1]
was taken to mean from the first element through the last. While this might seem like a convenient
shorthand, in retrospect it was a bad idea. For this reason, the meaning of ranges over negative
valued indices was changed in version 2 of the interpreter as follows: First the index-range gets
expanded to an array of indices according to the rules for range arrays described above. Then if any
of the resulting indices are negative, they are interpreted as indices from the end of the array. For
So, what does a[[0:-1]] represent in the new interpretation? Since [0:-1] expands to an empty
The trace of a matrix is an important concept that occurs frequently in linear algebra. The trace
of a 2d matrix is given by the sum of its diagonal elements. Consider the creation of a function that
Better yet is to recognize that the diagonal elements of an n by n array are given by an index array
[0:n*n-1:n+1]
The following example creates a 10 by 10 integer array, sets its diagonal elements to 5, and then
In the previous examples, the size of the array was passed as an additional argument. This is
unnecessary because the size may be obtained from the array itself by using the array_shape function.
For example, the following function may be used to obtain the indices of the diagonal elements of an
array:
   if (dims[0] != dims[1])
     throw InvalidParmError, "Expecting a square array";
   variable n = dims[0];
   return [0:n*n-1:n+1];
}
Using this function, the trace function may be written more simply as
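A self-contained sketch of the diagonal-index approach follows. The function name array_trace is hypothetical, and the dimension checks in diag_indices are an assumed reconstruction. The index range uses n*n-1 as its upper limit so that the last diagonal element is included.

```slang
define diag_indices (a)
{
   variable dims = array_shape (a);
   if (length (dims) != 2)
     throw InvalidParmError, "Expecting a 2d array";
   if (dims[0] != dims[1])
     throw InvalidParmError, "Expecting a square array";
   variable n = dims[0];
   return [0:n*n-1:n+1];        % indices 0, n+1, 2(n+1), ..., n*n-1
}

define array_trace (a)
{
   return sum (a[diag_indices (a)]);
}

variable a = Int_Type [10, 10];
a[diag_indices (a)] = 5;          % set the diagonal elements to 5
variable t = array_trace (a);     % 10 diagonal elements of 5 each
```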
of space for the array, initializes it, and then assigns to the variable a reference to the array. So, a
variable that represents an array has a value that is really a reference to the array. This has several
consequences, most good and some bad. It is believed that the advantages of this representation
When a variable is passed to a function, it is always the value of the variable that gets passed. Since
the value of a variable representing an array is a reference, a reference to the array gets passed. One
major advantage of this is rather obvious: it is a fast and efficient way to pass the array. This also
where some_function is a function that generates a scalar value to initialize the ith element. This
Since the array is passed to the function by reference, there is no need to make a separate copy of the
100000 element array. As pointed out above, this saves both execution time and memory. The other
salient feature to note is that any changes made to the elements of the array within the function will
be manifested in the array outside the function. Of course, in this case this is a desirable side-effect.
a = Double_Type [10];
b = a;
a[0] = 7;
What will be the value of b[0]? Since the value of a is really a reference to the array of ten doubles,
and that reference was assigned to b, b also refers to the same array. Thus any changes made to the
elements of a, will also be made implicitly to b.
This begs the question: If the assignment of a variable attached to an array to another variable
results in the assignment of the same array, then how does one make separate copies of the array?
There are several answers including using an index array, e.g., b = a[*]; however, the most natural
method is to use the dereference operator:
a = Double_Type [10];
b = @a;
a[0] = 7;
In this example, a separate copy of a will be created and assigned to b. It is very important to note
that S-Lang never implicitly dereferences an object. So, one must explicitly use the dereference
operator. This means that the elements of a dereferenced array are not themselves dereferenced.
a = Array_Type [2];
a[0] = Double_Type [10];
a[1] = Double_Type [10];
b = @a;
In this example, b[0] will be a reference to the array that a[0] references because a[0] was not
explicitly dereferenced.
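The difference between sharing an array, copying it, and copying an array of arrays can be checked with the following sketch. The variable names are arbitrary.

```slang
variable a = Double_Type [10];
variable b = a;        % b refers to the same array as a
variable c = @a;       % c is a separate copy
a[0] = 7;
% b[0] is 7, while c[0] is still 0

variable aa = Array_Type [2];
aa[0] = Double_Type [10];
aa[1] = Double_Type [10];
variable bb = @aa;     % copies the outer array only
variable inner = aa[0];
inner[3] = 1.0;        % change the inner array through aa
variable seen = bb[0];
% seen[3] is also 1.0, since bb[0] and aa[0] refer to the same array
```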
will be two complex numbers. For simplicity, suppose that all we really want is to know what subset
of the coefficients, a, b, c, correspond to real-valued solutions. In terms of for loops, we can write:
In this example, the array index_array will contain a non-zero value if the corresponding set of
coefficients has a real-valued solution. This code may be written much more compactly and with
Moreover, it executes about 20 times faster than the version using an explicit loop.
S-Lang has a powerful built-in function called where. This function takes an array of boolean
values and returns an array of indices that correspond to where the elements of the input array are
non-zero. The utility of this simple operation cannot be overstated. For example, suppose a is a 1-d
array of n doubles, and it is desired to set all elements of the array whose value is less than zero to
If n is a large number, this statement can take some time to execute. The optimal way to achieve
Here, the expression (a < 0.0) returns a boolean array whose dimensions are the same size as a
but whose elements are either 1 or 0, according to whether or not the corresponding element of a is
less than zero. This array of zeros and ones is then passed to the where function, which returns a
1-d integer array of indices that indicate where the elements of a are less than zero. Finally, those
Consider once more the example involving the set of n quadratic equations presented above. Suppose
that we wish to get rid of the coefficients of the previous example that generated non-real solutions.
nn = 0;
_for i (0, n-1, 1)
if (index_array [i]) nn++;
j = 0;
_for i (0, n-1, 1)
{
if (index_array [i])
{
Not only is this a lot of code, making it hard to digest, but it is also clumsy and error-prone. Using
the where function, this task is trivial and executes in a fraction of the time:
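A sketch of the where-based approach, using three sets of made-up coefficients, is given here. The discriminant test b^2 - 4ac >= 0 is the standard condition for real roots; the sample values are assumptions for illustration.

```slang
variable a = [1.0, 1.0, 2.0];
variable b = [2.0, 0.0, -3.0];
variable c = [1.0, 4.0, 1.0];

% Non-zero where the discriminant is non-negative, i.e., real roots.
variable index_array = (b^2 - 4*a*c >= 0.0);

variable i = where (index_array);     % indices of the sets to keep
a = a[i];  b = b[i];  c = c[i];

% The same idea sets all negative elements of an array to zero.
variable v = [-1.0, 2.0, -3.0];
v[where (v < 0.0)] = 0.0;
```

With these values the first and third coefficient sets are kept, and v becomes [0.0, 2.0, 0.0].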
Most of the examples up till now assumed that the dimensions of the array were known. Although
the intrinsic function length may be used to get the total number of elements of an array, it cannot
be used to get the individual dimensions of a multi-dimensional array. The array_shape function
may be used to determine the dimensionality of an array. It may be used to determine the number
The array_shape function may also be used to create an array that has the same number of
Finally, the array_info function may be used to get additional information about an array, such as
its data type and size.
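The following sketch shows array_shape and array_info in use, together with one way of creating a second array with the same shape and element type. The use of @Array_Type and _typeof for that purpose is a choice made here for illustration.

```slang
variable a = Double_Type [10, 20];

variable dims = array_shape (a);         % [10, 20]
variable num_rows = dims[0];
variable num_cols = dims[1];

% A new array with the same shape and element type as a.
variable b = @Array_Type (_typeof (a), dims);

% array_info returns the dimensions, their number, and the data type.
variable d, nd, type;
(d, nd, type) = array_info (a);
```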
a = Array_Type[3];
a[0] = [1:10];
a[1] = [1:100];
a[2] = [1:1000];
will produce an array of the 3 arrays [1:10], [1:100], and [1:1000]. Index arrays may be used
to access elements of an array of arrays: a[[1,2]] will produce an array of arrays that consists of the
elements a[1] and a[2]. However, it is important to note that setting the elements of an array of
arrays via an index array does not work as one might naively expect. Consider the following:
b = Array_Type[3];
b[*] = a[[2,1,0]];
where a is the array of arrays given in the previous example. The reader might expect
b to have elements b[0]=a[2], b[1]=a[1], and b[2]=a[0], and be surprised to learn that
b[0]=b[1]=b[2]=a[[2,1,0]]. The reason for this is that, by definition, b is an array of arrays, and
even though a[[2,1,0]] is an array of arrays, it is first and foremost an array, and it is that array
Associative Arrays
An associative array differs from an ordinary array in the sense that its size is not fixed and that it
A = Assoc_Type [Int_Type];
A["alpha"] = 1;
A["beta"] = 2;
A["gamma"] = 3;
Here, A has been assigned to an associative array of integers (Int_Type) and then three keys were
As the example suggests, an associative array may be created using one of the following forms:
The last form returns an un-typed associative array capable of storing values of any type.
The form involving a default-value is useful for associating a default value with non-existent array
There are several functions that are specially designed to work with associative arrays. These include:
• assoc_get_keys, which returns an ordinary array of strings containing the keys of the array.
• assoc_get_values, which returns an ordinary array of the values of the associative array. If
the associative array is un-typed, then an array of Any_Type objects will be returned.
• assoc_key_exists, which can be used to determine whether or not a key exists in the array.
• assoc_delete_key, which may be used to remove a key (and its value) from the array.
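The creation forms and the functions listed above can be tried together in a short sketch. The keys, the values, and the default value of -1 are arbitrary choices.

```slang
variable A = Assoc_Type [Int_Type];        % values must be integers
A["alpha"] = 1;
A["beta"] = 2;

variable B = Assoc_Type [Int_Type, -1];    % -1 is used for non-existent keys
variable C = Assoc_Type [];                % un-typed: values of any type
C["name"] = "S-Lang";
C["version"] = 2;

variable has_alpha = assoc_key_exists (A, "alpha");   % non-zero
assoc_delete_key (A, "alpha");
variable keys = assoc_get_keys (A);        % only "beta" remains
variable missing = B["no such key"];       % the default value, -1
```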
To illustrate the use of an associative array, consider the problem of counting the number of repeated
occurrences of words in a list. Let the word list be represented as an array of strings given by
word_list. The number of occurrences of each word may be stored in an associative array as
follows:
a = Assoc_Type [Int_Type];
foreach word (word_list)
  {
     if (0 == assoc_key_exists (a, word))
       a[word] = 0;
     a[word]++; % same as a[word] = a[word] + 1;
  }
Note that assoc_key_exists was necessary to determine whether or not a word was already added
to the array in order to properly initialize it. However, by creating the associative array with a
variable a, word;
a = Assoc_Type [Int_Type, 0];
foreach word (word_list)
a[word]++;
Associative arrays are extremely useful and have many other applications. Whenever there is a
one-to-one mapping between a string and some object, one should always consider using an associative
array to represent the mapping. To illustrate this point, consider the following code fragment:
This represents a mapping between names and functions. Such a mapping may be written in terms
The most redeeming feature of the version involving the series of if statements is that it is easy
to understand. However, the version involving the associative array has two significant advantages
over the former. Namely, the function lookup will be much faster with a time that is independent
of the item being searched, and it is extensible in the sense that additional functions may be added
at run-time, e.g.,
A structure is a heterogeneous container object, i.e., it is an object with elements whose values do
not have to be of the same data type. The elements or fields of a structure are named, and one
accesses a particular field of the structure via the field name. This should be contrasted with an
array whose values are of the same type, and whose elements are accessed via array indices.
A user-defined data type is a structure with a fixed set of fields defined by the user.
This creates and returns a structure with N fields whose names are specified by field-name-1,
field-name-2, ..., field-name-N. When a structure is created, the values of its fields are initialized to
NULL.
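As a small sketch of this behavior, the fragment below creates a structure with two fields, checks that a field starts out as NULL, and then assigns values. The field names and values are chosen only for illustration.

```slang
% Create a structure with two named fields; both are initially NULL.
variable t = struct { city_name, population };
variable starts_null = (t.city_name == NULL);    % 1

% Fields are accessed and assigned via their names.
t.city_name = "Boston";
t.population = 2500000;

% The field names can be recovered as an array of strings.
variable names = get_struct_field_names (t);
```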
For example,
This approach is useful when creating structures dynamically where one does not know the name of
Like arrays, structures are passed around by reference. Thus, in the above example, the value of t
is a reference to the structure. This means that after execution of
u = t;
both t and u refer to the same underlying structure, since only the reference was copied by the
assignment. To actually create a new copy of the structure, use the dereference operator, e.g.,
variable u = @t;
This creates a new structure whose field names are identical to the old and copies the field values to the
new structure. If any of the values are objects that are passed by reference, then only the references
t = struct{a};
t.a = [1:10];
u = @t;
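The difference between assigning a structure and copying it with @ can be seen in the following sketch. The names alias and copy, and the extra field b, are introduced here only for illustration.

```slang
variable t = struct { a, b };
t.a = [1:10];
t.b = 5;

variable alias = t;     % copies the reference: alias and t are one structure
variable copy = @t;     % creates a new structure with the same field values

alias.b = 6;            % also changes t.b, since alias and t are the same object
                        % copy.b is unaffected and remains 5

copy.a[0] = 100;        % t.a[0] becomes 100 as well: @ copied only the
                        % reference to the array, not the array itself
```

In other words, @t produces a shallow copy: scalar field values become independent, while fields holding arrays or other reference objects continue to share them.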
field_name is a field of the structure, then s.field_name specifies that field of s. This specification
can be used in expressions just like ordinary variables. Again, consider
linked-lists. A linked-list is simply a chain of structures that are linked together such that one
structure in the chain is the value of a field of the previous structure in the chain. To be concrete,
and suppose that it is desired to create a linked-list of such objects to store population data. The
purpose of the next field is to provide the link to the next structure in the chain. Suppose that
there exists a function, read_next_city, that reads city names and populations from a file. Then
define create_population_list ()
{
   variable city_name, population, list_root, list_tail;
   variable next;

   list_root = NULL;
   while (read_next_city (&city_name, &population))
     {
        next = struct {city_name, population, next };
        next.city_name = city_name;
        next.population = population;
        next.next = NULL;

        if (list_root == NULL)
          list_root = next;
        else
          list_tail.next = next;

        list_tail = next;
     }
   return list_root;
}
In this function, the variables list_root and list_tail represent the beginning and end of the
list, respectively. As long as read_next_city returns a non-zero value, a new structure is created,
initialized, and then appended to the list via the next field of the list_tail structure. On the first
time through the loop, the list is created via the assignment to the list_root variable.
Other functions may be created that manipulate the list. Here is one that finds the city with the
largest population:
define get_largest_city (list)
{
   variable largest;

   largest = list;
   while (list != NULL)
     {
        if (list.population > largest.population)
          largest = list;
        list = list.next;
     }
   return largest.city_name;
}
The get_largest_city function is a typical example of how one traverses a linear linked-list by
starting at the head of the list and successively moving to the next element of the list via the next field.
In the previous example, a while loop was used to traverse the linked list. It is also possible to use
define get_largest_city (list)
{
   variable largest, elem;

   largest = list;
   foreach elem (list)
     {
        if (elem.population > largest.population)
          largest = elem;
     }
   return largest.city_name;
}
Here a foreach loop has been used to walk the list via its next field. If the field name linking
the elements was not called next, then it would have been necessary to use the using form of the
foreach statement. For example, if the field name implementing the linked list was next_item,
then
would have been used. In other words, unless otherwise indicated via the using clause, foreach
walks the list using a field named next by default.
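A minimal sketch of the using form follows. The helper make_node and the field name name are hypothetical and exist only to build a short chain whose link field is called next_item.

```slang
% Hypothetical helper: build one node whose link field is next_item.
define make_node (name, next_item)
{
   variable node = struct { name, next_item };
   node.name = name;
   node.next_item = next_item;
   return node;
}

% A chain of three nodes: a -> b -> c
variable list = make_node ("a", make_node ("b", make_node ("c", NULL)));

% Walk the chain via next_item instead of the default field next.
variable elem, joined = "";
foreach elem (list) using ("next_item")
  joined += elem.name;
```

After the loop, joined holds the names in chain order, showing that every node was visited once from head to tail.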
Now consider a function that sorts the list according to population. To illustrate the technique, a
bubble-sort will be used, not because it is efficient (it is not), but because it is simple, intuitive, and
return list;
}
Note the test for equality between list and node, i.e.,
It is important to appreciate the fact that the values of these variables are references to structures,
and that the comparison only compares the references and not the actual structures they reference.
a user-defined data type is essentially a structure with a user-defined set of fields. For example, in
the previous section a structure was used to represent a city/population pair. We can define a data
typedef struct
{
city_name,
population
} Population_Type;
This data type can be used like all other data types. For example, an array of Population_Type
variable a = Population_Type[10];
a[0].city_name = "Boston";
a[0].population = 2500000;
The new type Population_Type may also be used with the typeof function:
a = @Population_Type;
a.city_name = "Calcutta";
a.population = 13000000;
Another feature that user-defined types possess is that the action of the binary and unary operations
may be defined for them. This idea is discussed in more detail below.
This function may be used to dene a function that adds two vectors together:
Using these functions, three vectors representing the points (2,3,4), (6,2,1), and (-3,1,-6) may
be created using
V1 = vector_new (2,3,4);
V2 = vector_new (6,2,1);
V3 = vector_new (-3,1,-6);
The problem with the last statement is that it is not a very natural way to express the addition of
three vectors. It would be far better to extend the action of the binary + operator to the Vector_Type
objects and then write the above sum more simply as
V4 = V1 + V2 + V3;
The __add_binary function defines the result of a binary operation between two data types:
Here, op is a string representing any one of the binary operators ("+", "-", "*", "/", "==",...), and
funct is a reference to a function that carries out the binary operation between objects of types typeA
and typeB to produce an object of type result-type .
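The following sketch shows one way the pieces could fit together for the + operator. The definitions of Vector_Type, vector_new, and vector_add given here are assumptions made for the example (a three-field structure with fields x, y, z); only the names and the argument order of __add_binary (op, result-type, funct, typeA, typeB) are taken from the text.

```slang
% Assumed definition: a vector is a structure with three components.
typedef struct { x, y, z } Vector_Type;

define vector_new (x, y, z)
{
   variable v = @Vector_Type;
   v.x = x;
   v.y = y;
   v.z = z;
   return v;
}

% Component-wise sum of two vectors, returned as a new Vector_Type.
define vector_add (v1, v2)
{
   return vector_new (v1.x + v2.x, v1.y + v2.y, v1.z + v2.z);
}

% Extend the binary + operator to Vector_Type + Vector_Type.
__add_binary ("+", Vector_Type, &vector_add, Vector_Type, Vector_Type);

variable V1 = vector_new (2, 3, 4);
variable V2 = vector_new (6, 2, 1);
variable V3 = vector_new (-3, 1, -6);
variable V4 = V1 + V2 + V3;      % components (5, 6, -1)
```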
Similarly the subtraction and equality operators may be extended to Vector_Type via
The - operator is also a unary operator that is customarily used to change the sign of an object.
Unary operations may be extended to Vector_Type objects using the __add_unary function:
may be multiplied by a scalar to produce another vector. This can happen in two ways as reflected
Here a represents the scalar, which can be any object that may be multiplied by a Double_Type,
e.g., Int_Type, Float_Type, etc. Instead of using multiple statements involving __add_binary
There are a couple of natural possibilities for Vector_Type*Vector_Type: The cross-product defined
by
The binary * operator between two vector types may be defined to be just one of these functions;
it cannot be extended to both. If the dot-product is chosen then one would use
Just because it is possible to define the action of a binary or unary operator on a user-defined
type, it is not always wise to do so. A useful rule of thumb is to ask whether defining a particular
operation leads to more readable and maintainable code. For example, simply looking at
c = a + b;
in isolation one can easily overlook the fact that a function such as vector_add may be getting
executed. Moreover, in cases where the action is ambiguous such as Vector_Type*Vector_Type it
may not be clear what
c = a*b;
means unless one knows exactly what choice was made when extending the * operator to the types.
For this reason it may be wise to leave Vector_Type*Vector_Type undefined and use old-fashioned
function calls such as
Finally, the __add_string function may be used to define the string representation of an object.
For the Vector_Type one might want to use the string representation generated by
Lists
Sometimes it is desirable to utilize an object that has many of the properties of an array, but can
also easily grow or shrink upon demand. The List_Type object has such properties.
An empty list may be created either by the list_new function or more simply using curly braces,
e.g.,
list = {};
More generally a list of objects may be created by simply enclosing them in braces. For example,
specifies a list of 4 elements, where the last element is also a list. The number of items in a list may
be obtained using the length function. For the above list, length(list) will return 4.
One may examine the contents of the list using an array index notation. For the above example,
list[0] refers to the zeroth element of the list ("hello" in this case). Similarly,
list[1] = [1,2,3];
changes the first element of the list (7) to the array [1,2,3]. Also, as is the case for arrays, one may
index from the end of the list using negative indices, e.g., list[-1] refers to the last element of the
list.
list_insert(list,obj,nth) will insert the object obj into the list at the nth position. Similarly,
list_append(list,obj,nth) will insert the object obj into the list right after nth position. If
then
to insert "hi" at the head of the list. However, this simply creates a new list of two items: hi and
Items may be removed from a list via the list_delete function, which deletes the item from the
specified position and shrinks the list. In the context of the above example,
Another way of removing items from the list is to use the list_pop function. The main difference
between it and list_delete is that list_pop returns the deleted item. For example,
and assign {&sin,&cos} to item. If the position parameter to list_pop is left unspecified, then the
position will default to the zeroth, i.e., list_pop(list) is equivalent to list_pop(list,0).
A list may be copied using the dereference operator:
new_list = @list;
Keep in mind that this does not perform a so-called deep copy. If any of the elements of the list are
objects that are assigned by reference, only the references will be copied.
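The list operations described in this chapter can be tried out with a short fragment such as the one below. The list contents and variable names are invented for the example.

```slang
variable list = {"hello", 7};

list_insert (list, "hi", 0);        % {"hi", "hello", 7}
list_append (list, 3.5);            % {"hi", "hello", 7, 3.5}

variable item = list_pop (list);    % removes and returns "hi"
list_delete (list, 0);              % removes "hello"; nothing is returned
                                    % list is now {7, 3.5}

variable copy = @list;              % shallow copy of the list
list_reverse (copy);                % reverses copy in place: {3.5, 7}
                                    % list itself is unchanged
```

Copying first and then reversing the copy is one way to obtain a reversed list while leaving the original intact.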
The list_reverse function may be used to reverse the elements of a list. Note that this does not
create a new list. To create a new list that is the reverse of another, copy the original using the
Error Handling
All non-trivial programs or scripts must deal with the possibility of run-time errors. In fact, one
sign of a seasoned programmer is that such a person pays particular attention to error handling. This
chapter presents some techniques for handling errors using S-Lang. First the traditional method of
using return values to indicate errors will be discussed. Then attention will turn to S-Lang's more
powerful exception handling mechanisms.
Here, the write_to_file function returns 0 if successful, or -1 upon failure. It is up to the calling
routine to check the return value of write_to_file and act accordingly. For instance:
The main advantage of this technique is its simplicity. The weakness in this approach is that the
return value must be checked for every function that returns information in this way. A more subtle
problem is that even minor changes to large programs can become unwieldy. To illustrate the latter
aspect, consider the following function which is supposed to be so simple that it cannot fail:
define simple_function ()
{
do_something_simple ();
more_simple_stuff ();
}
Since the functions called by simple_function are not supposed to fail, simple_function itself
cannot fail and there is no return value for its callers to check:
define simple ()
{
simple_function ();
another_simple_function ();
}
Now suppose that the function do_something_simple is changed in some way that could cause it to
fail from time to time. Such a change could be the result of a bug-fix or some feature enhancement.
In the traditional error handling approach, the function would need to be modified to return an
error code. That error code would have to be checked by the calling routine simple_function and
as a result, it can now fail and must return an error code. The obvious effect is that a tiny change in
one function can be felt up the entire call chain. While making the appropriate changes for a small
program can be a trivial task, for a large program this could be a major undertaking opening the
possibility of introducing additional errors along the way. In a nutshell, this is a code maintenance
issue. For this reason, a veteran programmer using this approach to error handling will consider
such possibilities from the outset and allow for error codes the first time regardless of whether the
define simple_function ()
{
if (-1 == do_something_simple ())
return -1;
if (-1 == more_simple_stuff ())
return -1;
return 0;
}
define simple ()
{
if (-1 == simple_function ())
return -1;
if (-1 == another_simple_function ())
return -1;
return 0;
}
Although the latter code containing explicit checks for failure is more robust and more easily
maintainable than the former, it is also less readable. Moreover, since return values are now checked, the
code will execute somewhat slower than the code that lacks such checks. There is also no clean
separation of the error handling code from the other code. This can make it difficult to maintain if
the error handling semantics of a function change. The next section discusses another approach to
error, instead of returning an error code, it simply gives up and throws an exception. This idea will
tion:
Here the throw statement has been used to generate the appropriate exception, which in this case is
either an OpenError exception or a WriteError exception. Since the function now returns nothing
(no error code), it may be called as
As long as the write_to_file function encounters no errors, control passes from write_to_file
to next_statement.
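For concreteness, a version of write_to_file written in this style might look like the sketch below. This is an assumption about its shape rather than the definitive listing: the error messages are invented, and only the function name and the two exception types come from the surrounding text.

```slang
% Sketch: write a string to a file, throwing an exception upon failure
% instead of returning an error code.
define write_to_file (file, str)
{
   variable fp = fopen (file, "w");
   if (fp == NULL)
     throw OpenError, sprintf ("Unable to open %s", file);

   if (-1 == fputs (str, fp))
     throw WriteError, sprintf ("Unable to write to %s", file);

   if (-1 == fclose (fp))
     throw WriteError, sprintf ("Unable to close %s", file);
}
```

Since the function returns nothing, a caller simply writes write_to_file (file, str); and relies on an exception to signal failure.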
Now consider what happens when the function encounters an error. For concreteness assume
caller. Since no provision has been made to handle the exception, next_statement will not execute
and control will pass to the previous caller on the call stack. This process will continue until the
exception is either handled or until control reaches the top-level at which point the interpreter will
A simple exception handler may be created through the use of a try-catch statement, such as
try
{
The above code works as follows: First the statement (or statements) inside the try-block are
executed. As long as no exception occurs, once they have executed, control will pass on to
If an exception occurs while executing the statements in the try-block, any remaining statements
in the block will be skipped and control will pass to the catch portion of the exception handler.
This may consist of one or more catch statements and an optional finally statement. Each catch
statement specifies a list of exceptions it will handle as well as the code that is to be executed
when a matching exception is caught. If a matching catch statement is found for the exception,
the exception will be cleared and the code associated with the catch statement will get executed.
Control will then pass to next_statement (or first to the code in an optional finally block).
Catch-statements are tested against the exception in the order that they appear. Once a matching
catch statement is found, the search will terminate. If no matching catch-statement is found, an
optional finally block will be processed, and the call-stack will continue to unwind until either a
In the above example, an exception handler was established for the OpenError exception. The error
handling code for this exception will cause a warning message to be displayed. Execution will resume
at next_statement.
Now suppose that write_to_file successfully opened the file, but that for some reason, e.g., a full
disk, the actual write operation failed. In such a case, write_to_file will throw a WriteError
exception passing control to the caller. The file will remain on the disk but not fully written. An
exception handler can be added for WriteError that removes the file:
try
{
write_to_file ("/tmp/foo", "bar");
}
catch OpenError:
{
message ("*** Warning: failed to open /tmp/foo.");
}
catch WriteError:
{
() = remove ("/tmp/foo");
message ("*** Warning: failed to write to /tmp/foo");
}
next_statement;
Here the exception handler for WriteError uses the remove intrinsic function to delete the file and
then issues a warning message. Note that the remove intrinsic uses the traditional error handling
mechanism; in the above example its return status has been discarded.
Above it was assumed that failure to write to the file was not critical allowing a warning message
to suffice upon failure. Now suppose that it is important for the file to be written but that it is still
desirable for the file to be removed upon failure. In this scenario, next_statement should not get
try
{
write_to_file ("/tmp/foo", "bar");
}
catch WriteError:
{
() = remove ("/tmp/foo");
throw WriteError;
}
next_statement;
Here the exception handler for WriteError removes the file and then re-throws the exception.
error
    The exception error code (Int_Type).
descr
    A brief description of the error (String_Type).
file
    The filename containing the code that generated the exception (String_Type).
line
    The line number where the exception was thrown (Int_Type).
function
    The name of the currently executing function, or NULL if at top-level (String_Type).
message
    A text message that may provide more information about the exception (String_Type).
object
    A user-defined object.
If it is desired to have information about the exception, then an alternative form of the try statement
must be used:
try (e)
{
% try-block code
}
catch SomeException: { code ... }
If an exception occurs while executing the code in the try-block, then the variable e will be assigned
the value of the exception object. As a simple example, suppose that the file foo.sl consists of:
try (e)
{
y = invert_x (0);
}
catch DivideByZeroError:
{
vmessage ("Caught %s, generated by %s:%d\n",
e.descr, e.file, e.line);
vmessage ("message: %s\nobject: %S\n",
e.message, e.object);
y = 0;
}
In this case, the value of the message field was assigned a default value. The reason that the object
field is NULL is that no object was specified when the exception was generated. In order to throw an
object, a more complex form of throw statement must be used:
To illustrate this form, suppose that invert_x is modified to accept an array object:
throw DivideByZeroError,
"Array contains elements that are zero", i;
return 1/x;
}
In this case, the message field of the exception will contain the string "Array contains elements
that are zero" and the object field will be set to the indices of the zero elements.
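Putting the throw form and the try (e) form together gives the following self-contained sketch. The use of where to locate the zero elements, and the variable names zeros and y, are assumptions made for this example.

```slang
% Invert the elements of an array, throwing an exception that carries
% the indices of any zero elements as its object.
define invert_x (x)
{
   variable i = where (x == 0);
   if (length (i))
     throw DivideByZeroError,
       "Array contains elements that are zero", i;
   return 1.0/x;
}

variable e, y, zeros = NULL;
try (e)
{
   y = invert_x ([1, 0, 2, 0]);
}
catch DivideByZeroError:
{
   zeros = e.object;       % indices of the zero elements: 1 and 3
   y = NULL;
}
```

The handler can inspect e.message and e.object to decide how to recover; here it merely records the offending indices.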
The last clause of a try-statement is the finally-block, which is optional and is introduced using the
finally keyword. If the try-statement contains no catch-clauses, then it must specify a finally-
If the finally-clause is present, then its corresponding statements will be executed regardless of
whether an exception occurs. If an exception occurs while executing the statements in the try-
block, then the finally-block will execute after the code in any of the catch-blocks. The finally-clause
is useful for freeing any resources (file handles, etc.) allocated by the try-block regardless of whether
AnyError
   OSError
      MallocError
      ImportError
   ParseError
      SyntaxError
      DuplicateDefinitionError
      UndefinedNameError
   RunTimeError
      InvalidParmError
      TypeMismatchError
      UserBreakError
      StackError
         StackOverflowError
         StackUnderflowError
      ReadOnlyError
      VariableUninitializedError
      NumArgsError
      IndexError
      UsageError
      ApplicationError
      InternalError
      NotImplementedError
      LimitExceededError
      MathError
         DivideByZeroError
         ArithOverflowError
         ArithUnderflowError
         DomainError
      IOError
         WriteError
         ReadError
         OpenError
      DataError
      UnicodeError
         InvalidUTF8Error
      UnknownError
for AnyError will catch any exception. The OSError, ParseError, and RunTimeError exceptions are
subclasses of the AnyError class. Subclasses of OSError include MallocError, and ImportError.
Hence a handler for the OSError exception will catch MallocError but not ParseError since the
The user may extend this tree with new exceptions using the new_exception function. This function
The exception-name is the name of the exception, baseclass represents the node in the exception
hierarchy where it is to be placed, and description is a string that provides a brief description of the
exception.
For example, suppose that you are writing some code that processes numbers stored in a binary
format. In particular, assume that the format specifies that data be stored in a specific
byte-order, e.g., in big-endian form. Then it might be useful to extend the DataError exception with
This will create a new exception object called EndianError subclassed on DataError, and code that
catches the DataError exception will additionally catch the EndianError exception.
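A sketch of this idea, using the new_exception arguments in the order described above (exception-name, baseclass, description), is given below. The description string and the bookkeeping variables caught and cleaned_up are assumptions made for the example; a finally block is included to show that it runs after the catch block.

```slang
% Extend the exception hierarchy: EndianError becomes a subclass of DataError.
new_exception ("EndianError", DataError, "Invalid byte-ordering");

variable caught = 0, cleaned_up = 0;
try
{
   throw EndianError, "expected big-endian data";
}
catch DataError:        % also matches subclasses such as EndianError
{
   caught = 1;
}
finally
{
   cleaned_up = 1;      % executed whether or not an exception occurred
}
```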
Chapter 15
Loading Files: evalfile, autoload, and require
Chapter 16
Modules
16.1 Introduction
A module is a shared object that may be dynamically linked into the interpreter at run-time to
provide the interpreter with additional intrinsic functions and variables. Several modules are
distributed with the stock version of the S-Lang library, including a pcre module that allows the
interpreter to make use of the Perl Compatible Regular Expression library, a png module that
allows the interpreter to easily read and write PNG files, and a rand module for producing random
numbers. There are also a number of modules for the interpreter that are not distributed with the
ways to go about this. One is to use the import function to dynamically link-in the specified module,
e.g.,
import ("pcre");
will dynamically link to the pcre module and make its symbols available to the interpreter using
the active namespace. However, this is not the preferred method for loading a module.
Module writers are encouraged to distribute a module with a file of S-Lang code that performs the
actual import of the module. Rather than a user making direct use of the import function, the
preferred method of loading the modules is to load that file instead. For example the pcre module is
distributed with a file called pcre.sl that contains little more than the import("pcre") statement.
require ("pcre");
The main advantage of this approach to loading a module is that the functionality provided by the
module may be split between intrinsic functions in the module proper, and interpreted functions
contained in the .sl file. In such a case, loading the module via import would only provide partial
functionality. The png module provides a simple example of this concept. The current version of
the png module consists of a couple intrinsic functions (png_read and png_write) contained in the
shared object (png-module.so), and a number of other interpreted S-Lang functions defined in
png.sl. Using the import statement to load the module would miss the latter set of functions.
In some cases, the symbols in a module may conflict with symbols that are currently defined by
the interpreter. In order to avoid the conflict, it may be necessary to load the module into its own
namespace and access its symbols via the namespace prefix. For example, the GNU Scientific Library
Special Function module, gslsf, defines a couple hundred functions, some with common names such
as zeta. In order to avoid any conflict, it is recommended that the symbols from such a module be
imported into a separate namespace. This may be accomplished by specifying the namespace as a
This form requires that the module's symbols be accessed via the namespace qualifier "gsl->".
Chapter 17
File Input/Output
S-Lang provides built-in support for two different I/O facilities. The simplest interface is modeled
upon the C language stdio interface and consists of functions such as fopen, fgets, etc. The other
interface is modeled on a lower level POSIX interface consisting of functions such as open, read,
etc. In addition to permitting more control, the lower level interface permits one to access network
When reading data formatted in text files, e.g., columns of numbers, do not overlook the high-level
routines in the slsh library. In particular, the readascii function is quite flexible and can read
data from text files that are formatted in a variety of ways. For data stored in a standard binary
format such as HDF or FITS, the corresponding modules should be used.
• fread_bytes: reads a specified number of bytes from a file and returns them as a string.
• ferror: tests whether or not the stream associated with a file has an error.
• fflush: forces all buffered data associated with a stream to be written out.
• fgetslines: reads all the lines from a text file and returns them as an array of strings.
In addition, the interface supports the popen and pclose functions on systems where the corre-
Before reading or writing to a file, it must first be opened using the fopen function. The only
exceptions to this rule involve use of the pre-opened streams: stdin, stdout, and stderr. fopen
accepts two arguments: a file name and a string argument that indicates how the file is to be
opened, e.g., for reading, writing, update, etc. It returns a File_Type stream object that is used
as an argument to all other functions of the stdio interface. Upon failure, it returns NULL. See the
to realize that all the functions of the interface return something, and that return value must be
The first example involves writing a function to count the number of lines in a text file. To do this,
count = 0;
while (-1 != fgets (&line, fp))
count++;
() = fclose (fp);
return count;
}
Note that &line was passed to the fgets function. When fgets returns, line will contain the line
of text read in from the file. Also note how the return value from fclose was handled (discarded in
this case).
Although the preceding example closed the file via fclose, there is no need to explicitly close a
file because the interpreter will automatically close a file when it is no longer referenced. Since the
only variable to reference the file is fp, it would have automatically been closed when the function
returned.
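For reference, a complete line-counting function along these lines could be written as in the sketch below. The function name count_lines and the error message are assumptions made for the example; the loop itself follows the fgets pattern shown above.

```slang
% Sketch: return the number of lines in a text file.
define count_lines (file)
{
   variable fp, line, count;

   fp = fopen (file, "r");          % open the file for reading
   if (fp == NULL)
     throw OpenError, sprintf ("Unable to open %s", file);

   count = 0;
   while (-1 != fgets (&line, fp))  % fgets returns -1 at end of file
     count++;

   () = fclose (fp);                % discard the return value of fclose
   return count;
}
```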
Suppose that it is desired to count the number of characters in the file instead of the number of
lines. To do this, the while loop could be modified to count the characters as follows:
The main difficulty with this approach is that it will not work for binary files, i.e., files that contain
null characters. For such files, the file should be opened in binary mode via
The fread function requires two additional arguments: the type of object to read (Char_Type in
this case), and the number of such objects to be read. The function returns the number of objects
actually read, or -1 upon failure; the objects themselves are returned in the form of an array of the specified type.
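A byte-counting function based on fread might be sketched as follows. The function name count_bytes, the chunk size of 4096, and the variable names are assumptions made for the example; the data are placed in buf via the reference passed as the first argument.

```slang
% Sketch: count the bytes in a file by reading it in chunks with fread.
define count_bytes (file)
{
   variable fp, buf, count;

   fp = fopen (file, "rb");         % binary mode
   if (fp == NULL)
     throw OpenError, sprintf ("Unable to open %s", file);

   count = 0;
   % Each successful call stores up to 4096 Char_Type objects in buf.
   while (-1 != fread (&buf, Char_Type, 4096, fp))
     count += length (buf);

   () = fclose (fp);
   return count;
}
```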
Sometimes it is more convenient to obtain the data from a le in the form of a character string
instead of an array of characters. The fread_bytes function may be used in such situations. Using
The foreach construct also works with File_Type objects. For example, the number of characters
Often one is not interested in trailing whitespace in the lines of a file. To have trailing whitespace
automatically stripped from the lines as they are read in, use the "wsline" form, e.g.,
Finally, it should be mentioned that none of these examples should be used to count the number of
bytes in a file when that information is more readily accessible by another means. For example, it
st = stat_file (file);
if (st == NULL)
throw IOError, "stat_file failed";
return st.st_size;
}
would result in a Double_Type[num] array being assigned to a if successful. However, suppose that
the binary data file consists of numbers in a specified byte-order. How can one read such objects
with the proper byte swapping? The answer is to use the fread_bytes function to read the objects
as a (binary) character string and then unpack the resulting string into the specified data type, or
types. This process is facilitated using the pack and unpack functions.
and combines the objects in the item-list according to format-string into a binary string and returns
the result. Likewise, the unpack function may be used to convert a binary string into separate data
objects:
The format string consists of one or more data-type specification characters, and each may be
followed by an optional decimal length specifier. Specifically, the data-types are specified according
c     char
C     unsigned char
h     short
H     unsigned short
i     int
I     unsigned int
l     long
L     unsigned long
j     16 bit int
J     16 bit unsigned int
k     32 bit int
K     32 bit unsigned int
f     float
d     double
F     32 bit float
D     64 bit float
s     character string, null padded
S     character string, space padded
z     character string, null padded
x     a null pad character
A decimal length specifier may follow the data-type specifier. With the exception of the s and S
specifiers, the length specifier indicates how many objects of that data type are to be packed or
unpacked from the string. When used with the s or S specifiers, it indicates the field width to be
used. If the length specifier is not present, the length defaults to one.
With the exception of c, C, s, S, z, and x, each of these may be prefixed by a character that indicates
the byte-order of the object:
Here are a few examples that should make this more clear:
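As one further sketch, the fragment below packs a 32 bit integer in both byte orders and unpacks it again. It assumes the byte-order prefixes > for big-endian and < for little-endian; the variable names and the value 258 are chosen only for illustration.

```slang
% 258 is 0x00000102, so its four bytes are 0, 0, 1, 2 in big-endian order.
variable big    = pack (">k", 258);     % 32 bit int, big-endian
variable little = pack ("<k", 258);     % 32 bit int, little-endian

variable nbytes = bstrlen (big);        % 4
variable size = sizeof_pack (">k");     % 4

variable same = unpack (">k", big);     % 258: unpacked with the matching order
variable swapped = unpack ("<k", big);  % 33619968 (0x02010000): wrong order

% A length specifier greater than one yields an array when unpacking.
variable a = unpack (">j3", pack (">j3", 1, 2, 3));
```

Using an explicit byte-order prefix on both sides makes the packed string independent of the machine on which the code runs.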
When unpacking, if the length specifier is greater than one, then an array of that length will be
returned. In addition, trailing whitespace and null characters are stripped when unpacking an object
pages, and consists of a sequence of entries formatted according to the C structure utmp defined in
the utmp.h C header file. The actual details of the structure may vary from one version of Unix
to the other. For the purposes of this example, consider its definition under the Linux operating
struct utmp {
short ut_type; /* type of login */
pid_t ut_pid; /* pid of process */
char ut_line[12]; /* device name of tty - "/dev/" */
char ut_id[2]; /* init id or abbrev. ttyname */
time_t ut_time; /* login time */
char ut_user[8]; /* user name */
char ut_host[16]; /* host name for remote login */
long ut_addr; /* IP addr of remote host */
};
On this system, pid_t is defined to be an int and time_t is a long. Hence, a format specifier for
However, this particular definition is naive because it does not allow for structure padding performed
by the C compiler in order to align the data types on suitable word boundaries. Fortunately, the
intrinsic function pad_pack_format may be used to modify a format by adding the correct amount of
padding in the right places. In fact, pad_pack_format applied to the above format on an Intel-based
Linux system produces the result:
The other missing piece of information is the size of the structure. This is useful because we would
like to read in one structure at a time using the fread function. Knowing the size of the various data
types makes this easy; however it is even easier to use the sizeof_pack intrinsic function, which
returns the size (in bytes) of the structure described by the pack format.
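The two intrinsics can be combined as in the following sketch. The unpadded format string is an
assumption derived from the structure shown earlier (short, int, char[12], char[2], long, char[8],
char[16], long); the padding that is inserted and the resulting size depend on the platform.

```slang
% Assumed unpadded format for the utmp structure shown earlier.
variable format = "h i S12 S2 l S8 S16 l";

% Let the interpreter insert the padding that the C compiler would add.
variable padded = pad_pack_format (format);

% Size in bytes of one record, suitable for passing to a read function.
variable size = sizeof_pack (padded);

() = fprintf (stdout, "format: %s\nsize: %d bytes\n", padded, size);
```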
So, with all the pieces in place, it is rather straightforward to write the code:
typedef struct
{
   ut_type, ut_pid, ut_line, ut_id,
   ut_time, ut_user, ut_host, ut_addr
} UTMP_Type;
variable U = @UTMP_Type;
() = fclose (fp);
A few comments about this example are in order. First of all, note that a new data type called
UTMP_Type was created, although this was not really necessary. The file was opened in binary mode,
but this too was optional because, for example, on a Unix system there is no distinction between
binary and text modes. The print_utmp function does not print all of the structure fields. Finally,
the return values from fprintf and fclose were handled by discarding them.
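The same read-one-record-at-a-time pattern can be sketched in a self-contained form. Everything named
here is hypothetical: the three-field record layout, Record_Type, the helper functions, and the scratch
file under /tmp. The sketch uses fread_bytes so that each record arrives as a binary string that can be
handed directly to unpack.

```slang
% Hypothetical record layout: a short, an int, and an 8-character name.
variable format = pad_pack_format ("h i S8");
variable size = sizeof_pack (format);

typedef struct { rec_type, rec_id, rec_name } Record_Type;

% Write two sample records so that the reader has something to read.
define write_sample_records (file)
{
   variable fp = fopen (file, "wb");
   if (fp == NULL)
     throw RunTimeError, "Unable to create " + file;
   () = fwrite (pack (format, 1, 100, "alice"), fp);
   () = fwrite (pack (format, 2, 200, "bob"), fp);
   () = fclose (fp);
}

% Read the file back one fixed-size record at a time.
define read_records (file)
{
   variable fp = fopen (file, "rb");
   if (fp == NULL)
     throw RunTimeError, "Unable to open " + file;

   variable records = {};
   variable buf;
   while (size == fread_bytes (&buf, size, fp))
     {
        variable r = @Record_Type;
        set_struct_fields (r, unpack (format, buf));
        list_append (records, r);
     }
   () = fclose (fp);
   return records;
}

variable file = sprintf ("/tmp/records_%d.bin", getpid ());
write_sample_records (file);
variable records = read_records (file);
() = remove (file);
```

Because the S specifier strips trailing whitespace when unpacking, the name fields come back without
the padding that pack added.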
112 Chapter 17. File Input/Output
Chapter 18
slsh
slsh, also known as the S-Lang shell, is an application that is included in the stock S-Lang distribution.
As some binary distributions include slsh as a separate package, it must be installed separately,
e.g.,
on Debian Linux systems. The use of slsh in its interactive mode was discussed briefly in chapter 1
(Introduction). This chapter concentrates on the use of slsh for writing executable S-Lang scripts.
# slsh --help
Usage: slsh [OPTIONS] [-|file [args...]]
--help Print this help
--version Show slsh version information
-e string Execute 'string' as S-Lang code
-g Compile with debugging code, tracebacks, etc
-n Don't load personal init file
--init file Use this file instead of ~/.slshrc
--no-readline Do not use readline
-i Force interactive input
-t Test mode. If slsh_main exists, do not call it
-v Show verbose loading messages
-Dname Define "name" as a preprocessor symbol
When started with no arguments, slsh will start in interactive mode and take input from the
terminal. As the usage message indicates, slsh loads a personal initialization file called .slshrc (on
non-Unix systems, this file is called slsh.rc). The contents of this file must be valid S-Lang code,
but are otherwise arbitrary. One use of this file is to define commonly used functions and to set up
slsh will run in non-interactive mode when started with a file (also known as a script) as its first
(non-option) command-line argument. The rest of the arguments on the command line serve as
arguments for the script. The next section deals with the use of the cmdopt routines for parsing
those arguments.
a public function called slsh_main, then slsh will call it after the script has been loaded. In this
#!/usr/bin/env slsh
.
.
define slsh_main ()
{
.
.
}
The first line of the script is Unix-specific and should be familiar to Unix users. Typically, the code
before slsh_main will load any required modules or packages, and define other functions to be used
by the script.
Although the use of slsh_main is not required, its use is strongly urged for several reasons. In
addition to lending uniformity to S-Lang scripts, slsh_main is well supported by the S-Lang
debugger (sldb) and the S-Lang profiler (slprof), which look for slsh_main as a starting point for
script execution. Also, as scripts necessarily do something (otherwise they have no use), slsh's -t
command-line option may be used to turn off the automatic execution of slsh_main. This allows
the syntax of the entire script to be checked for errors instead of running it.
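A minimal sketch of a script organized this way is shown below. The helper make_greeting and the script
name hello.sl are hypothetical; __argc and __argv are the interpreter's argument count and argument
array.

```slang
#!/usr/bin/env slsh
% Helper functions are defined before slsh_main.
private define make_greeting (name)
{
   return sprintf ("Hello, %s!", name);
}

% slsh calls this function after the whole script has been loaded.
define slsh_main ()
{
   variable name = "world";
   if (__argc > 1)
     name = __argv[1];
   () = fprintf (stdout, "%s\n", make_greeting (name));
}
```

Saved as hello.sl, the script can be checked without being run by invoking slsh -t hello.sl, which
loads the file but does not call slsh_main.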
encoded files. The name of the script is cd2ogg.sl. Running the script without arguments causes
As the message shows, some of the options require an argument while others do not. The cd2ogg.sl
#!/usr/bin/env slsh
require ("cmdopt");
.
.
private define exit_usage ()
{
() = fprintf (stderr, "Usage: %s [options] device\n",
path_basename (__argv[0]));
() = fprintf (stderr, "Options:\n");
.
.
exit (1);
}
define slsh_main ()
{
variable genre = NULL;
variable no_rip = 0;
variable no_normalize = 0;
variable no_encode = 0;
There are several points that one should take from the above example. First, to use the cmdopt
interface it is necessary to load it. This is accomplished using the require statement. Second, the
above example uses cmdopt's object-oriented style interface through the use of the add and process
methods of the cmdopt object created by the call to cmdopt_new. Third, two of the command
line options make use of callback functions: the exit_usage function will get called when help
appears on the command line, and the parse_album_info function will get called to handle the
albuminfo option. Options such as no-encode do not take a value and the presence of such an
option on the command line causes the variable associated with the option to be set to 1. Other
options such as genre will cause the variable associated with them to be set to the value specified
on the command line. Finally, the process method returns the index of __argv that corresponds to
the first non-option argument. In this case, for proper usage of the script, that argument would correspond
For more information about the cmdopt interface, see the documentation for cmdopt_add:
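The following sketch illustrates the points above for two of the options. The parse_args helper and
the structure it returns are hypothetical; the calls to cmdopt_new, add, and process, and the type
qualifier, follow the cmdopt interface as it is understood here and should be checked against the
cmdopt documentation.

```slang
require ("cmdopt");

% Hypothetical helper: parse an argv-style array and return the results.
define parse_args (argv)
{
   variable genre = NULL;
   variable no_encode = 0;

   variable c = cmdopt_new ();
   c.add ("genre", &genre; type="str");   % option that takes a value
   c.add ("no-encode", &no_encode);       % flag: the variable is set to 1

   % Start at index 1 to skip the script name; the return value is
   % the index of the first non-option argument.
   variable i = c.process (argv, 1);

   variable result = struct { genre, no_encode, first_arg };
   result.genre = genre;
   result.no_encode = no_encode;
   result.first_arg = i;
   return result;
}
```

Inside slsh_main one would typically call parse_args (__argv) and then treat __argv[result.first_arg]
as the device argument.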
Chapter 19
Debugging
There are several ways to debug an S-Lang script. When the interpreter encounters an uncaught
exception, it can generate a traceback report showing where the error occurred and the values of
local variables in the function call stack frames at the time of the error. Often just knowing where
the error occurs is all that is required to correct the problem. More subtle bugs may require a deeper
analysis to diagnose the problem. While one can insert the appropriate print statements in the code
to get some idea about what is going on, it may be simpler to use the interactive debugger.
19.1 Tracebacks
When the value of the _traceback variable is non-zero, the interpreter will generate a traceback
report when it encounters an error. This variable may be set by putting the line
_traceback = 1;
at the top of the suspect file. If the script is running in slsh, then invoking slsh using the -g option
will enable tracebacks:
slsh -g myscript.sl
If _traceback is set to a positive value, the values of local variables will be printed in the traceback
report. If set to a negative integer, the values of the local variables will be absent.
Traceback: error
***string***:1:verror:Run-Time Error
/grandpa/d1/src/jed/lib/search.sl:78:search_generic_search:Run-Time Error
Local Variables:
String_Type prompt = "Search forward:"
Integer_Type dir = 1
Ref_Type line_ok_fun = &_function_return_1
String_Type str = "ascascascasc"
Char_Type not_found = 1
Integer_Type cs = 0
/grandpa/d1/src/jed/lib/search.sl:85:search_forward:Run-Time Error
There are several ways to read this report; perhaps the simplest is to read it from the bottom.
This report says that on line 85 in search.sl the search_forward function called the
search_generic_search function. On line 78 it called the verror function, which in turn called
error. The search_generic_search function contains 6 local variables whose values at the time of
the error are given by the traceback output. The above example shows that a local variable called
functions that use these hooks to implement a simple debugger. Although written for slsh, the
debugger may be used by other S-Lang interpreters that permit the loading of slsh library files.
This can be done in several ways, depending upon the application embedding the interpreter.
For applications that support a command line, the simplest way to access the debugger is to use the
require ("sldb");
sldb ("foo.sl");
When called without an argument, sldb will prompt for input. This can be useful for setting or
removing breakpoints.
require ("sldb");
sldb_enable ();
at the top of the suspect file. Any files loaded by the file will also be compiled with debugging
If the file contains any top-level executable statements, the debugger will display the line to be
executed and prompt for input. If the file does not contain any executable statements, the debugger
will not be activated until one of the functions in the file is executed.
As a concrete example, consider the following contrived slsh script called buggy.sl:
define divide (a, b, i)
{
   return a[i] / b;
}
define slsh_main ()
{
   variable x = [1:5];
   variable y = x*x;
   variable i;
   _for i (0, length(x), 1)
     {
        variable z = divide (x, y, i);
        () = fprintf (stdout, "%g/%g = %g", x[i], y[i], z);
     }
}
slsh buggy.sl
yields
More information may be obtained by using slsh's -g option to cause a traceback report to be
printed:
slsh -g buggy.sl
Expecting Double_Type, found Array_Type
Traceback: fprintf
./buggy.sl:13:slsh_main:Type Mismatch
Local variables for slsh_main:
Array_Type x = Integer_Type[5]
Array_Type y = Integer_Type[5]
Integer_Type i = 0
Array_Type z = Integer_Type[5]
Error encountered while executing slsh_main
From this one can see that the problem is that z is an array and not a scalar as expected.
To run the program under debugger control, start up slsh and load the file using the sldb function:
Note the use of "./" in the filename. This may be necessary if the file is not in the slsh search path.
The above command causes execution to stop with the following displayed:
slsh_main at ./buggy.sl:9
9 variable x = [1:5];
(sldb)
This shows that the debugger has stopped the script at line 9 of buggy.sl and is waiting for input.
The print function may be used to print the value of an expression or variable. Using it to display
(sldb) print x
Caught exception:Variable Uninitialized Error
(sldb)
This is because x has not yet been assigned a value and will not be until line 9 has been executed.
The next command may be used to execute the current line and stop at the next one:
(sldb) next
10 variable y = x*x;
(sldb)
The step command functions almost the same as next, except when a function call is involved. In
such a case, the next command will step over the function call but step will cause the debugger to
(sldb) print x
Integer_Type[5]
(sldb) print x[0]
1
(sldb) print x[-1]
5
(sldb)
The list command may be used to get a list of the source code around the current line:
(sldb) list
5 return a[i] / b;
6 }
7 define slsh_main ()
8 {
9 variable x = [1:5];
10 variable y = x*x;
11 variable i;
12 _for i (0, length(x), 1)
13 {
14 variable z = divide (x, y, i);
15 () = fprintf (stdout, "%g/%g = %g", x[i], y[i], z);
(sldb) break 15
breakpoint #1 set at ./buggy.sl:15
The cont command may be used to continue execution until the next breakpoint:
(sldb) cont
Breakpoint 1, slsh_main
at ./buggy.sl:15
15 () = fprintf (stdout, "%g/%g = %g", x[i], y[i], z);
(sldb)
This shows that during the execution of line 15, a TypeMismatchError was generated. Let's see
This shows that the problem was caused by z being an array and not a scalar, something that was
already known from the traceback report. Now let's see why it is not a scalar. Start the program
slsh_main at ./buggy.sl:9
9 variable x = [1:5];
(sldb) break divide
breakpoint #1 set at divide
(sldb) cont
Breakpoint 1, divide
at ./buggy.sl:5
5 return a[i] / b;
(sldb)
From this it is easy to see that z is an array because b is an array. The fix for this is to change line
5 to
return a[i] / b[i];
The debugger supports several other commands. For example, the up and down commands may
be used to move up and down the stack-frames, and the where command may be used to display the
stack-frames. These commands are useful for examining the variables in the other frames:
(sldb) where
#0 ./buggy.sl:5:divide
#1 ./buggy.sl:14:slsh_main
(sldb) up
#1 ./buggy.sl:14:slsh_main
14 variable z = divide (x, y, i);
(sldb) print x
Integer_Type[5]
(sldb) down
#0 ./buggy.sl:5:divide
5 return a[i] / b;
(sldb) print z
Integer_Type[5]
On some operating systems, the debugger's watchfpu command may be used to help isolate
floating-point exceptions. Consider the following example:
<top-level> at ./example.sl:12
11 print_root (1,2,3);
(sldb) watchfpu FE_INVALID
(sldb) cont
*** FPU exception bits set: FE_INVALID
Entering the debugger.
solve_quadratic at ./t.sl:4
4 variable x = -b + sqrt (d);
The watchfpu command may be used to watch for the occurrence of any combination of the following
exceptions
FE_DIVBYZERO
FE_INEXACT
FE_INVALID
FE_OVERFLOW
FE_UNDERFLOW
by the bitwise-or operation of the desired combination. For instance, to track both FE_INVALID and
FE_OVERFLOW, use:
Chapter 20
Profiling
20.1 Introduction
This chapter deals with the subject of writing efficient S-Lang code, and using the S-Lang profiler
to isolate places in the code that could benefit from optimization.
The most important consideration in writing efficient code is the choice of algorithm. A poorly
optimized good algorithm will almost always execute faster than a highly optimized poor algorithm.
In choosing an algorithm, it is also important to choose the right data structures for its implementation.
As a simple example, consider the task of counting words. Any algorithm would involve
some sort of table with word/number pairs. Such a table could be implemented using a variety of
data structures, e.g., as a pair of arrays or lists representing the words and corresponding numbers,
as an array of structures, etc. But in this case, the associative array is ideally suited to the task:
a = Assoc_Type[Int_Type, 0];
while (get_word (&word))
a[word]++;
Note the conciseness of the above code. It is important to appreciate the fact that S-Lang is a
byte-compiled interpreter that executes statements much more slowly than a language that compiles
to machine code. The overhead of the processing of byte-codes by the interpreter may be used to
roughly justify the rule of thumb that the smaller the code is, the faster it will run.
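A self-contained version of the word-counting idea might look as follows. The get_word function used
in the fragment above is not shown in this section, so this sketch substitutes strtok to split a string
on whitespace; count_words is a hypothetical name.

```slang
% Count how often each whitespace-separated word occurs in a string.
define count_words (text)
{
   % Keys that have never been assigned read as the default value 0.
   variable counts = Assoc_Type[Int_Type, 0];
   variable word;
   foreach word (strtok (text))
     counts[word]++;
   return counts;
}

variable counts = count_words ("the cat saw the other cat");
() = fprintf (stdout, "the: %d, cat: %d, dog: %d\n",
              counts["the"], counts["cat"], counts["dog"]);
```

The default value given when the associative array is created is what makes the bare increment safe
for words that have not been seen before.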
When possible, always take advantage of S-Lang's powerful array facilities. For example, consider
the act of clipping an array by setting all values greater than 10 to 10. Rather than coding this as
n = length(a);
for (i = 0; i < n; i++)
if (a[i] > 10) a[i] = 10;
it should be written as
a[where(a>10)] = 10;
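The two forms can be compared side by side in a short sketch; the sample values are arbitrary, and @ is
used to copy the array so that both versions start from the same data.

```slang
variable a = [5, 12, 7, 30, 10];

% Loop version, applied to a copy of a.
variable b = @a;
variable i, n = length (b);
for (i = 0; i < n; i++)
  if (b[i] > 10) b[i] = 10;

% Array version, applied to another copy.
variable c = @a;
c[where (c > 10)] = 10;

% Both copies now hold the same clipped values.
variable num_different = length (where (b != c));
```

The array version performs the comparison and the assignment inside the interpreter's array code, so
only a handful of byte-codes are executed regardless of the array length.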
Finally, do not overlook the specialized modules that are available for S-Lang.
is essentially a front-end for a set of interpreter hooks defined in a file called profile.sl, which may
be used by any application embedding S-Lang. The use of the profiler will first be demonstrated
in the context of slprof, and after that follows a discussion of how to use profile.sl for other
S-Lang applications.
(To be completed...)
Chapter 21
Regular Expressions
The S-Lang library includes a regular expression (RE) package that may be used by an application
embedding the library. The RE syntax should be familiar to anyone acquainted with regular
expressions. In this section the syntax of the S-Lang regular expressions is discussed.
NOTE: At the moment, the S-Lang regular expressions do not support UTF-8 encoded strings.
The S-Lang library will most likely migrate to the use of the PCRE regular expression library,
deprecating the use of the S-Lang REs in the process. For these reasons, the user is encouraged to
"\(\<[a-zA-Z]+\>\)[ ]+\1\>"
which matches any word repeated consecutively. Note how the grouping operators \( and \) are
used to define the text matched by the enclosed regular expression, and then subsequently referred
to by \1.
Finally, remember that when used in string literals either in the S-Lang language or in the C
language, care must be taken to "double-up" the '\' character since both languages treat it as an
escape character.
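As a short sketch of the doubling rule, the repeated-word expression above is written below as an
S-Lang string literal and applied with string_matches. The sample strings are arbitrary, and the
expected behaviour noted in the comments follows from the description of the pattern given above.

```slang
% Each backslash of the regular expression is doubled in the literal.
variable re = "\\(\\<[a-zA-Z]+\\>\\)[ ]+\\1\\>";

% string_matches returns an array of matched substrings, or NULL when
% there is no match.  Element 0 is the whole match and element 1 is
% the text matched by the first \( \) group.
variable m = string_matches ("say hello hello again", re, 1);
variable none = string_matches ("no repeats here", re, 1);

if (m != NULL)
  () = fprintf (stdout, "repeated word: %s\n", m[1]);
```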
sions.
The most notable difference is that the S-Lang regular expressions do not support the OR operator
| in expressions. This means that "a|b" or "a\|b" do not have the meaning that they have in regular
expression packages that support egrep-style expressions.
The other main difference is that while S-Lang regular expressions support the grouping operators
\( and \), they are only used as a means of specifying the text that is matched. That is, the
expression
"@\([a-z]*\)@.*@\1@"
matches "xxx@abc@silly@abc@yyy", where the pattern \1 matches the text enclosed by the \(
and \) expressions. However, in the current implementation, the grouping operators are not used to
group regular expressions to form a single regular expression. Thus an expression such as "\(hello\)*"
One question that comes up from time to time is why S-Lang doesn't simply employ some
POSIX-compatible regular expression library. The simple answer is that, at the time of this writing, none
exists that is available across all the platforms that the S-Lang library supports (Unix, VMS, OS/2,
win32, win16, BEOS, MSDOS, and QNX) and can be distributed under both the GNU licenses. It
is particularly important that the library and the interpreter support a common set of regular
This chapter describes features that were added to various 2.0 releases. For a much more complete
and detailed list of changes, see the changes.txt file that is distributed with the library.
• The break and continue statements support an optional integer that indicates how many loop
while (1)
{
loop (10)
{
break 2;
}
}
"This is a \
multiline \
string"
`This is
another multiline
string that
does not require
a \ for continuation`
128 Appendix A. S-Lang 2 Interpreter NEWS
• List_Type objects may be indexed using an array of indices instead of just a single scalar
index.
sumsq
Equivalent to sum(x*x).
expm1
More accurate version of exp(x)-1 for x near 0.
log1p
More accurate version of log(1+x) for x near 0.
list_to_array
Creates an array from a list.
string_matches
A convenient alternative to the string_match and string_match_nth functions.
_close
Close an integer descriptor.
_fileno
Returns the descriptor as an integer.
dup2_fd
Duplicates a file descriptor via the dup2 POSIX function.
ldexp, frexp
If x == a*2^b, where 0.5<=a<1.0 then (a,b)=frexp(x), and x=ldexp(a,b).
hypot
If given a single array argument X, it returns the equivalent of sqrt(sum(X*X)).
polynom
The calling interface to this function was changed and support added for arrays.
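A brief sketch of several of these functions follows. The input values are arbitrary and chosen so
that the results are easy to check by hand.

```slang
variable x = [3.0, 4.0];

variable ss = sumsq (x);        % same as sum (x*x)
variable h = hypot (x);         % same as sqrt (sum (x*x))

variable a, b;
(a, b) = frexp (8.0);           % 8 == a * 2^b with 0.5 <= a < 1
variable y = ldexp (a, b);      % reconstructs the original value

variable small = 1e-12;
variable e1 = expm1 (small);    % avoids the cancellation in exp(small)-1
variable l1 = log1p (small);    % avoids the rounding in log(1+small)
```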
zlib
A module that wraps the popular z compression library.
A.2. What's new for S-Lang 2.1 129
fork
A module that wraps the fork, exec*, and waitpid functions.
sysconf
A module that implements interfaces to the POSIX sysconf, pathconf, and confstr functions.
process.sl
The code in this file utilizes the fork module to implement the new_process function, which
allows the caller to easily create and communicate with subprocesses and pipelines.
• Qualifiers have been added to the language as a convenient and powerful mechanism to pass
• The ifnot keyword was added as an alternative to !if. The use of !if has been deprecated.
• Looping constructs now support a "then" clause that will get executed if the loop runs to
completion, e.g.,
loop (20)
{
if (this ())
break; % The then clause will NOT get executed
}
then do_that ();
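The then clause is convenient for search loops, where it can handle the "not found" case without a
separate flag. The following sketch uses a while loop; index_of is a hypothetical helper.

```slang
% Return the index of the first element equal to value, or -1.
define index_of (a, value)
{
   variable i = 0, n = length (a);
   while (i < n)
     {
        if (a[i] == value)
          break;        % found: the then clause is skipped
        i++;
     }
   then i = -1;         % the loop ran to completion: not found
   return i;
}
```

For example, index_of ([3, 5, 7], 5) leaves the loop through break and returns 1, whereas
index_of ([3, 5, 7], 9) runs the loop to completion and returns -1.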
• A floating point array of exactly N elements may be created using the form [a:b:#N], where
• References to array elements and structure fields are now supported, e.g., &A[3], &s.foo.
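Both features can be tried in a few lines. In this sketch the [a:b:#N] form is assumed to include both
endpoints, and the array and structure names are arbitrary.

```slang
% Five evenly spaced values from 0 to 1 (endpoints assumed included).
variable x = [0:1:#5];

% A reference to a single array element.
variable A = [10, 20, 30, 40, 50];
variable r = &A[3];
@r = -1;                 % assigns through the reference to A[3]

% A reference to a structure field.
variable s = struct { foo };
s.foo = 1;
variable rs = &s.foo;
@rs = 2;                 % assigns through the reference to s.foo
```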
wherenot(x)
Equivalent to where (not(x))
_$(str)
Evaluates strings with embedded "dollar" variables, e.g., _$("$TERM").
__push_list/__pop_list
Push list items onto the stack
prod(x)
Computes the product of an array a[0]*a[1]*...
minabs(x), maxabs(x)
Equivalent to min(abs(x)) and max(abs(x)), resp.
setsid
Create a new session (Unix).
iconv
Performs character-set conversion using the iconv library.
onig
A regular expression module using oniguruma RE library.
readascii
A flexible and powerful ASCII (as opposed to binary) data file reader.
cmdopt
A set of functions that vastly simplify the parsing of command line options.
Also a history and completion mechanism was added to the S-Lang readline interface, and as a
• slsh, the generic S-Lang interpreter, now supports an interactive command-line mode with
readline support.
A.3. What's new for S-Lang 2.0 131
file = "$HOME/src/slang-$VERSION/"$;
• Operator overloading for user-defined types. For example it is possible to define a meaning to
• Syntactic sugar for object-oriented style method calls. S-Lang 1 code such as
(@s.method)(s, args);
s.method(args);
This should make "object-oriented" code somewhat more readable. See also the next section
@s.method(args);
• More intrinsic functions including math functions such as hypot, atan2, floor, ceil, round,
isnan, isinf, and many more.
X = 18446744073709551615ULL;
• Performance improvements. The S-Lang 2 interpreter is about 20 percent faster for many
• Better debugging support including an interactive debugger. See the section on 19.2 (Using
important only if UTF-8 mode is in effect. If you use array indexing with functions that use
character semantics, then your code may not work properly in UTF-8 mode. For example, one
to extract that portion of a that precedes the occurrence of b in a. This may no longer work
in UTF-8 mode where bytes and characters are not generally the same. The correct way to
write the above is to use the substr function since it uses character semantics:
[0:-1] was used to index from the first through the last element of an array, but outside this
context, [0:-1] was an empty array. For S-Lang 2, the meaning of such arrays is always
the same regardless of the context. Since by itself [0:-1] represents an empty array, indexing
with such an array will also produce an empty array. The behavior of scalar indices has not
Range arrays with an implied endpoint make sense only in indexing situations. Hence the
value of the endpoint can be inferred from the context. Such arrays include [*], [:-1], etc.
must be changed to
A.4. Upgrading to S-Lang 2 133
B = A[[-3:]];
Code such as
@s.foo(args);
(@s.foo)(args);
(@s.foo)(s, moreargs);
s.foo (moreargs);
ERROR_BLOCKS it should be changed to use the new exception handling model. For example,
variable e;
try (e)
{
do_something ();
.
.
}
catch RunTimeError:
{
cleanup_after_error ();
throw e.error, e.message;
}
variable e;
try (e)
{
do_something ();
.
.
}
finally
{
cleanup_after_error ();
}
It is not possible to emulate the complete semantics of the _clear_error function. However,
those semantics are flawed, and fixing the problems associated with the use of _clear_error
was one of the primary reasons for the new exception handling model. The main problem with
the _clear_error method is that it causes execution to resume at the byte-code following the
code that triggered the error. As such, _clear_error defines no absolute resumption point. In
contrast, the try-catch exception model has well-defined points of execution. With the above
variable e;
try (e)
{
do_something ();
.
.
}
catch RunTimeError:
{
cleanup_after_error ();
}
variable e;
try (e)
{
do_something ();
.
.
}
catch RunTimeError:
{
cleanup_after_error ();
}
finally:
{
cleanup_after_error ();
}
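The order in which the try, catch, and finally blocks run can be observed with a small sketch. The
function run_with_cleanup and the messages it records are hypothetical; the exception is raised
explicitly with throw so that the example does not depend on any particular failing operation.

```slang
% Record which blocks run, with or without an error.
define run_with_cleanup (fail)
{
   variable events = {};
   variable e;
   try (e)
     {
        if (fail)
          throw RunTimeError, "something went wrong";
        list_append (events, "ok");
     }
   catch RunTimeError:
     {
        % e.message holds the text passed to throw.
        list_append (events, "caught: " + e.message);
     }
   finally
     {
        % Runs whether or not an exception was thrown.
        list_append (events, "cleanup");
     }
   return events;
}
```

Calling run_with_cleanup (0) records "ok" followed by "cleanup", while run_with_cleanup (1) records
the caught message followed by "cleanup". In both cases execution continues at a well-defined point
after the try statement.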
fread
When reading Char_Type and UChar_Type objects the S-Lang 1 version of fread returned
a binary string (BString_Type) if the number of characters read was greater than one, or a
U/Char_Type if the number read was one. In other words, the resulting type depended upon
how many bytes were read with no way to predict the resulting type in advance. In contrast,
when reading, e.g, Int_Type objects, fread returned an Int_Type when it read one integer,
or an array of Int_Type if more than one was read. For S-Lang 2, the behavior of fread with
respect to UChar_Type and Char_Type types was changed to have the same semantics as the
will no longer result in str being a BString_Type if nread > 1. Instead, str will now become
a Char_Type[nread] object. In order to read a specified number of bytes from a file in the
strtrans
The strtrans function has been changed to support Unicode. One ramification of this is that
when mapping from one range of characters to another, the length of the ranges must now be
equal.
str_delete_chars
This function was changed to support Unicode character classes. Code such as
now implies the deletion of all alphabetic characters from x. Previously it meant to delete
in UTF-8 mode. If you use array indexing in conjunction with these functions, then read on.
Appendix B
Copyright
The S-Lang library is distributed under the terms of the GNU General Public License.
Preamble
The licenses for most software are designed to take away your freedom to share and change it. By
contrast, the GNU General Public License is intended to guarantee your freedom to share and change
free software--to make sure the software is free for all its users. This General Public License applies
to most of the Free Software Foundation's software and to any other program whose authors commit
to using it. (Some other Free Software Foundation software is covered by the GNU Library General
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses
are designed to make sure that you have the freedom to distribute copies of free software (and charge
for this service if you wish), that you receive source code or can get it if you want it, that you can
change the software or use pieces of it in new free programs; and that you know you can do these
things.
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or
to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if
For example, if you distribute copies of such a program, whether gratis or for a fee, you must give
the recipients all the rights that you have. You must make sure that they, too, receive or can get
the source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license
which gives you legal permission to copy, distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain that everyone understands that
there is no warranty for this free software. If the software is modified by someone else and passed
on, we want its recipients to know that what they have is not the original, so that any problems
Finally, any free program is threatened constantly by software patents. We wish to avoid the danger
that redistributors of a free program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any patent must be licensed for
The precise terms and conditions for copying, distribution and modification follow.
0. This License applies to any program or other work which contains a notice placed by the copyright
holder saying it may be distributed under the terms of this General Public License. The "Program",
below, refers to any such program or work, and a "work based on the Program" means either the
Program or any derivative work under copyright law: that is to say, a work containing the Program
or a portion of it, either verbatim or with modifications and/or translated into another language.
(Hereinafter, translation is included without limitation in the term "modification".) Each licensee
is addressed as "you".
Activities other than copying, distribution and modification are not covered by this License; they
are outside its scope. The act of running the Program is not restricted, and the output from the
Program is covered only if its contents constitute a work based on the Program (independent of
having been made by running the Program). Whether that is true depends on what the Program
does.
1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any
medium, provided that you conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and
to the absence of any warranty; and give any other recipients of the Program a copy of this License
You may charge a fee for the physical act of transferring a copy, and you may at your option offer
2. You may modify your copy or copies of the Program or any portion of it, thus forming a work
based on the Program, and copy and distribute such modifications or work under the terms of
Section 1 above, provided that you also meet all of these conditions:
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
B.1. The GNU Public License 139
These requirements apply to the modified work as a whole. If identifiable sections of that work are
not derived from the Program, and can be reasonably considered independent and separate works
in themselves, then this License, and its terms, do not apply to those sections when you distribute
them as separate works. But when you distribute the same sections as part of a whole which is a
work based on the Program, the distribution of the whole must be on the terms of this License,
whose permissions for other licensees extend to the entire whole, and thus to each and every part
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely
by you; rather, the intent is to exercise the right to control the distribution of derivative or collective
In addition, mere aggregation of another work not based on the Program with the Program (or with
a work based on the Program) on a volume of a storage or distribution medium does not bring the
3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code
or executable form under the terms of Sections 1 and 2 above provided that you also do one of the
following:
The source code for a work means the preferred form of the work for making modifications to it. For
an executable work, complete source code means all the source code for all modules it contains, plus
any associated interface definition files, plus the scripts used to control compilation and installation
of the executable. However, as a special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary form) with the major components
(compiler, kernel, and so on) of the operating system on which the executable runs, unless that
If distribution of executable or object code is made by offering access to copy from a designated place,
then offering equivalent access to copy the source code from the same place counts as distribution
of the source code, even though third parties are not compelled to copy the source along with the
object code.
4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided
under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License. However, parties who have
received copies, or rights, from you under this License will not have their licenses terminated so long
5. You are not required to accept this License, since you have not signed it. However, nothing else
grants you permission to modify or distribute the Program or its derivative works. These actions
are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the
Program (or any work based on the Program), you indicate your acceptance of this License to do so,
and all its terms and conditions for copying, distributing or modifying the Program or works based
on it.
6. Each time you redistribute the Program (or any work based on the Program), the recipient
automatically receives a license from the original licensor to copy, distribute or modify the Program
subject to these terms and conditions. You may not impose any further restrictions on the recipients'
exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties
to this License.
7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason
(not limited to patent issues), conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not excuse you from the conditions
of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may not distribute the
Program at all. For example, if a patent license would not permit royalty-free redistribution of the
Program by all those who receive copies directly or indirectly through you, then the only way you
could satisfy both it and this License would be to refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under any particular circumstance, the
balance of the section is intended to apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any patents or other property right
claims or to contest validity of any such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is implemented by public license practices.
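Many people have made generous contributions to the wide range of software distributed through
that system in reliance on consistent application of that system; it is up to the author/donor to
decide if he or she is willing to distribute software through any other system and a licensee cannot
impose that choice.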
This section is intended to make thoroughly clear what is believed to be a consequence of the rest
of this License.
8. If the distribution and/or use of the Program is restricted in certain countries either by patents
or by copyrighted interfaces, the original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding those countries, so that
distribution is permitted only in or among countries not thus excluded. In such case, this License
incorporates the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions of the General Public
License from time to time. Such new versions will be similar in spirit to the present version, but
may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Program specifies a version number
of this License which applies to it and "any later version", you have the option of following the
terms and conditions either of that version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of this License, you may choose any
version ever published by the Free Software Foundation.
10. If you wish to incorporate parts of the Program into other free programs whose distribution
conditions are different, write to the author to ask for permission. For software which is copyrighted
by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions
for this. Our decision will be guided by the two goals of preserving the free status of all
derivatives of our free software and of promoting the sharing and reuse of software generally.
NO WARRANTY
If you develop a new program, and you want it to be of the greatest possible use to the public,
the best way to achieve this is to make it free software which everyone can redistribute and change
under these terms.
To do so, attach the following notices to the program. It is safest to attach them to the start of each
source file to most effectively convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) 19yy <name of author>
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this when it starts in an interactive
mode:
The hypothetical commands `show w' and `show c' should show the appropriate parts of the General
Public License. Of course, the commands you use may be called something other than `show w' and
`show c'; they could even be mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your school, if any, to sign a
"copyright disclaimer" for the program, if necessary. Here is a sample; alter the names:
This General Public License does not permit incorporating your program into proprietary programs.
If your program is a subroutine library, you may consider it more useful to permit linking proprietary
applications with the library. If this is what you want to do, use the GNU Library General Public
License instead of this License.
THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY
OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE
COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR
ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY
DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE
OF THE DATA FILES OR SOFTWARE.