SML chapter2
SML chapter2
Most functional languages are interactive. If you enter an expression, the com-
puter immediately evaluates it and displays the result. Interaction is fun; it gives
immediate feedback; it lets you develop programs in easily managed pieces.
We can enter an expression followed by a semicolon . . .
2+2;
. . . and ML responds
> 4 : int
Here we see some conventions that will be followed throughout the book. Most
ML systems print a prompt character when waiting for input; here, the input
is shown in typewriter characters. The response is shown, in slanted
characters,
> on a line like this.
At its simplest, ML is just a calculator. It has integers, as shown above, and real
numbers. ML can do simple arithmetic . . .
3.2 - 2.3;
> 0.9 : real
Again, anything typed to ML must end with a semicolon (;). ML has printed the
value and type. Note that real is the type of real numbers, while int is the type
of integers.
Interactive program development is more difficult with procedural languages
because they are too verbose. A self-contained program is too long to type as a
single input.
17
18 2 Names, Functions and Types
Chapter outline
This chapter introduces Standard ML and functional programming. The
basic concepts include declarations, simple data types, record types, recursive
functions and polymorphism. Although this material is presented using Standard
ML , it illustrates general principles.
The chapter contains the following sections:
Value declarations. Value and function declarations are presented using ele-
mentary examples.
Numbers, character strings and truth values. The built-in types int, real ,
char , string and bool support arithmetic, textual and logical operations.
Pairs, tuples and records. Ordered pairs and tuples allow functions to have
multiple arguments and results.
The evaluation of expressions. The difference between strict evaluation and
lazy evaluation is not just a matter of efficiency, but concerns the very meaning
of expressions.
Writing recursive functions. Several worked examples illustrate the use of
recursion.
Local declarations. Using let or local, names can be declared with a
restricted scope.
Introduction to modules. Signatures and structures are introduced by devel-
oping a generic treatment of arithmetic operations.
Polymorphic type checking. The principles of polymorphism are introduced,
including type inference and polymorphic functions.
Value declarations
A declaration gives something a name. ML has many kinds of things
that can be named: values, types, signatures, structures and functors. Most
names in a program stand for values, like numbers, strings — and functions.
Although functions are values in ML, they have a special declaration syntax.
The value declaration begins with ML keyword val and ends with a semicolon.
2.2 Declaring functions 19
Names in this book usually appear in italics. ML repeats the name, with its value
and type:
> val seconds = 60 : int
Let us declare constants for minutes per hour and hours per day:
val minutes = 60;
> val minutes = 60 : int
val hours = 24;
> val hours = 24 : int
If you enter an expression at top level like this, ML stores the value under the
name it. By referring to it you can use the value in a further calculation:
it div 24;
> 3600 : int
The name it always has the value of the last expression typed at top level. Any
previous value of it is lost. To save the value of it, declare a permanent name:
val secsinhour = it;
> val secsinhour = 3600 : int
fun area (r ) = pi *r *r ;
The keyword fun starts the function declaration, while area is the function
name, r is the formal parameter, and pi *r *r is the body. The body refers to r
and to the constant pi declared above.
Because functions are values in ML, a function declaration is a form of value
declaration, and so ML prints the value and type:
> val area = fn : real -> real
The type, which in standard mathematical notation is real → real , says that
area takes a real number as argument and returns another real number. The
value of a function is printed as fn. In ML, as in most functional languages,
functions are abstract values: their internal structure is hidden.
Let us call the function, repeating the area calculation performed above:
area(2.0);
> 12.56636 : real
Let us try it with a different argument. Observe that the parentheses around the
argument are optional:
area 1.0;
> 3.14159 : real
Comments. Programmers often imagine that their creations are too transparent
to require further description. This logical clarity will not be evident to others
unless the program is properly commented. A comment can describe the pur-
pose of a declaration, give a literature reference, or explain an obscure matter.
Needless to say, comments must be correct and up-to-date.
A comment in Standard ML begins with (* and ends with *), and may extend
over several lines. Comments can even be nested. They can be inserted almost
anywhere:
fun area r = (*area of circle with radius r*)
pi *r *r ;
Functional programmers should not feel absolved from writing comments. Peo-
ple once claimed that Pascal was self-documenting.
2.3 Identifiers in Standard ML 21
Redeclaring a name. Value names are called variables. Unlike variables in im-
perative languages, they cannot be updated. But a name can be reused for an-
other purpose. If a name is declared again then the new meaning is adopted
afterwards, but does not affect existing uses of the name. Let us redeclare the
constant pi :
val pi = 0.0;
> val pi = 0.0 : real
At this point in the session, several variables have values. These include seconds,
minutes, area and pi , as well as the built-in operations provided by the library.
The set of bindings visible at any point is called the environment. The function
area refers to an earlier environment in which pi denotes 3.14159. Thanks to
the permanence of names (called static binding), redeclaring a function cannot
damage the system, the library or your program.
Correcting your program. Because of static binding, redeclaring a function
called by your program may have no visible effect. When modifying a pro-
gram, be sure to recompile the entire file. Large programs should be divided into
modules; Chapter 7 will explain this in detail. After the modified module has been
recompiled, the program merely has to be relinked.
The case of letters matters, so q differs from Q. Prime characters are allowed
because ML was designed by mathematicians, who like variables called x , x 0 ,
x 00 . When choosing names, be certain to avoid ML’s keywords:
abstype and andalso as case datatype do
else end eqtype exception fn fun functor
handle if in include infix infixr let local
nonfix of op open orelse raise rec
sharing sig signature struct structure
then type val where while with withtype
Watch especially for the short ones: as, fn, if, in, of, op.
22 2 Names, Functions and Types
Certain strings of special characters are reserved for ML’s syntax and should not
be used as symbolic names:
: | = => -> # :>
Exercise 2.1 On your computer, learn how to start an ML session and how to
terminate it. Then learn how to make the ML compiler read declarations from a
file — a typical command is use "myfile".
2.4 Arithmetic
ML distinguishes between integers (type int) and real numbers (type
real ). Integer arithmetic is exact (with unlimited precision in some ML systems)
while real arithmetic is only as accurate as the computer’s floating-point hard-
ware.
Integer operations include addition (+), subtraction (-), multiplication (*), di-
vision (div) and remainder (mod). These are infix operators with conventional
precedences: thus in
2.4 Arithmetic 23
The ending En means ‘times the nth power of 10.’ A negative exponent begins
with the unary minus sign (˜). Thus 123.4E˜2 denotes 1.234.
Negative real numbers begin with unary minus (˜). Infix operators for reals
include addition (+), subtraction (-), multiplication (*) and division (/). Func-
tion application binds more tightly than infix operators. For instance, area a + b
is equivalent to (area a) + b, not area (a + b).
Unary plus and minus. The unary minus sign is a tilde (˜). Do not confuse it
with the subtraction sign (-)! ML has no unary plus sign. Neither + nor - may
appear in the exponent of a real number.
Type constraints. ML can deduce the types in most expressions from the types of
the functions and constants in it. But certain built-in functions are overloaded,
having more than one meaning. For example, + and * are defined for both
integers and reals. The type of an overloaded function must be determined from
the context; occasionally types must be stated explicitly.
For instance, ML cannot tell whether this squaring function is intended for
integers or reals, and therefore rejects it.
fun square x = x *x ;
> Error- Unable to resolve overloading for *
Suppose the function is intended for real numbers. We can insert the type real
in a number of places.
We can specify the type of the argument:
fun square(x : real ) = x *x ;
> val square = fn : real -> real
Type constraints can also appear within the body, indeed almost anywhere.
Default overloading. The standard library introduces the notion of a default
overloading; the compiler may resolve the ambiguity in square by choosing
type int. Using a type constraint in such cases is still advisable, for clarity. The motiva-
tion for default overloadings is to allow different precisions of numbers to coexist. For
example, unless the precision of 1.23 is determined by its context, it will be assumed
to have the default precision for real numbers. As of this writing there is no experience
of using different precisions, but care is plainly necessary.
Arithmetic and the standard library. The standard library includes numerous
functions for integers and reals, of various precisions. Structure Int contains
such functions as abs (absolute value), min, max and sign. Here are some examples:
Int.abs ˜4;
> 4 : int
Int.min(7, Int.sign 12);
> 1 : int
Structure Real contains analogous functions such as abs and sign, as well as functions
to convert between integers and reals. Calling real (i ) converts i to the equivalent real
number. Calling round (r ) converts r to the nearest integer. Other real-to-integer con-
versions include floor , ceil and trunc. Conversion functions are necessary whenever
integers and reals appear in the same expression.
Structure Math contains higher mathematical functions on real numbers, such as sqrt,
sin, cos, atan (inverse tangent), exp and ln (natural logarithm). Each takes one real
argument and returns a real result.
Exercise 2.2 A Lisp hacker says: ‘Since the integers are a subset of the real
numbers, the distinction between them is wholly artificial — foisted on us by
hardware designers. ML should simply provide numbers, as Lisp does, and au-
tomatically use integers or reals as appropriate.’ Do you agree? What consider-
ations are there?
The built-in function size returns the number of characters in a string. Here it
refers to "Fair Ophelia":
size (it);
> 12 : int
The space character counts, of course. The empty string contains no characters;
size("") is 0.
Here is a function that makes noble titles:
fun title(name) = "The Duke of " ˆ name;
> val title = fn : string -> string
title "York";
> "The Duke of York" : string
Special characters. Escape sequences, which begin with a backslash (\), insert
certain special characters into a string. Here are some of them:
• \n inserts a newline character (line break).
• \t inserts a tabulation character.
• \" inserts a double quote.
• \\ inserts a backslash.
• \ followed by a newline and other white-space characters, followed by
another \ inserts nothing, but continues a string across the line break.
Here is a string containing newline characters:
"This above all:\nto thine own self be true\n";
The type char. Just as the number 3 differs from the set {3}, a character differs
from a one-character string. Characters have type char . The constants have the
form #s, where s is a string constant consisting of a single character. Here is a
letter, a space and a special character:
#"a" #" " #"\n"
The functions ord and chr convert between characters and character codes.
Most implementations use the ASCII character set; if k is in the range 0 ≤
26 2 Names, Functions and Types
k ≤ 255 then chr (k ) returns the character with code k . Conversely, ord (c)
is the integer code of the character c. We can use these to convert a number
between 0 and 9 to a character between #"0" and #"9":
fun digit i = chr (i + ord #"0");
> val digit = fn : int -> char
The functions str and String.sub convert between characters and strings. If c
is a character then str (c) is the corresponding string. Conversely, if s is a string
then String.sub(s, n) returns the nth character in s, counting from zero. Let
us try these, first expressing the function digit differently:
fun digit i = String.sub("0123456789", i);
> val digit = fn : int -> char
str (digit 5);
> "5" : string
The second definition of digit is preferable to the first, as it does not rely on
character codes.
Strings, characters and the standard library. Structure String contains numer-
ous operations on strings. Structure Char provides functions such as isDigit,
isAlpha, etc., to recognize certain classes of character. A substring is a contiguous
subsequence of characters from a string; structure Substring provides operations for
extracting and manipulating them.
The ML Definition only has type string (Milner et al., 1990). The standard library
introduces the type char . It also modifies the types of built-in functions such as ord
and chr , which previously operated on single-character strings.
Exercise 2.4 For each version of digit, what do you expect in response to the
calls digit ˜1 and digit 10? Try to predict the response before experimenting
on the computer.
is that of E1 if E equals true, and that of E2 if E equals false. The else part
is mandatory.
The simplest tests are the relations:
These are defined on integers and reals; they also test alphabetical ordering on
strings and characters. Thus the relations are overloaded and may require type
constraints. Equality (=) and its negation (<>) can be tested for most types.
For example, the function sign computes the sign (1, 0, or −1) of an integer.
It has two conditional expressions and a comment.
fun sign(n) =
if n>0 then 1
else if n=0 then 0
else (*n<0*) ˜1;
> val sign = fn : int ->int
Functions that return a boolean value are known as predicates. Here is a predi-
cate to test whether its argument, a character, is a lower-case letter:
When a conditional expression is evaluated, either the then or the else ex-
pression is evaluated, never both. The boolean operators andalso and orelse
behave differently from ordinary functions: the second operand is evaluated only
if necessary. Their names reflect this sequential behaviour.
The vector’s type, which in mathematical notation is real × real , is the type of a
pair of real numbers. Vectors are ML values and can be given names. We declare
the zero vector and two others, called a and b.
val zerovec = (0.0, 0.0);
> val zerovec = (0.0, 0.0) : real * real
val a = (1.5, 6.8);
> val a = (1.5, 6.8) : real * real
val b = (3.6, 0.9);
2.7 Vectors: an example of pairing 29
The function lengthvec takes the pair of values of x and y. It has type real ×
real → real : its argument is a pair of real numbers and its result is another real
number.2 Here, a is a pair of real numbers.
lengthvec a;
> 6.963476143 : real
lengthvec (1.0, 1.0);
> 1.414213562 : real
Function negvec negates a vector with respect to the point (0, 0).
fun negvec (x ,y) : real *real = (˜x , ˜y);
> val negvec = fn : real * real -> real * real
This function has type real × real → real × real : given a pair of real numbers it
returns another pair. The type constraint real × real is necessary because minus
(˜) is overloaded.
We negate some vectors, giving a name to the negation of b:
negvec (1.0, 1.0);
> (˜1.0, ˜1.0) : real * real
val bn = negvec(b);
> val bn = (˜3.6, ˜0.9) : real * real
Vectors can be arguments and results of functions and can be given names. In
short, they have all the rights of ML’s built-in values, like the integers. We can
even declare a type of vectors:
type vec = real *real ;
> type vec
Now vec abbreviates real × real . It is only an abbreviation though: every pair
of real numbers has type vec, regardless of whether it is intended to represent a
vector. We shall employ vec in type constraints.
2 function Math.sqrt, which is defined only for real numbers, constrains the
overloaded operators to type real .
30 2 Names, Functions and Types
This would be an odd thing to do to a vector, but average works for any two
numbers:
average(3.1,3.3);
> 3.2 : real
The sum of vectors (x1 , y1 ) and (x2 , y2 ) is (x1 + x2 , y1 + y2 ). In ML, this function
takes a pair of vectors. Its argument pattern is a pair of pairs:
fun addvec ((x 1,y1), (x 2,y2)) : vec = (x 1+x 2, y1+y2);
> val addvec = fn : (real * real) * (real * real) -> vec
Type vec appears for the first time, constraining addition to operate on real num-
bers. ML gives addvec the type
which is equivalent to the more concise (vec × vec) → vec. The ML system
may not abbreviate every real × real as vec.
Look again at the argument pattern of addvec. We may equivalently view this
function as taking
Here we add the vectors (8.9,4.4) and b, then add the result to another vector.
Note that vec is the result type of the function.
addvec((8.9, 4.4), b);
> (12.5, 5.3) : vec
addvec(it, (0.1, 0.2));
> (12.6, 5.5) : vec
The variable pairv ranges over pairs of vectors. This version may look odd, but
is equivalent to its predecessor. How far is it from a to b?
distance(a,b);
> 6.262587325 : real
A final example will show that the components of a pair can have different types:
here, a real number and a vector. Scaling a vector means multiplying both com-
ponents by a constant.
fun scalevec (r , (x ,y)) : vec = (r *x , r *y);
> val scalevec = fn : real * (real * real) -> vec
The type constraint vec ensures that the multiplications apply to reals. The
function scalevec takes a real number and a vector, and returns a vector.
scalevec(2.0, a);
> (3.0, 13.6) : vec
scalevec(2.0, it);
> (6.0, 27.2) : vec
32 2 Names, Functions and Types
The 0-tuple and the type unit. Previously we have considered n-tuples for n ≥
2. There is also a 0-tuple, written () and pronounced ‘unity,’ which has no
components. It serves as a placeholder in situations where no data needs to be
conveyed. The 0-tuple is the sole value of type unit.
Type unit is often used with procedural programming in ML. A procedure is
typically a ‘function’ whose result type is unit. The procedure is called for its
effect — not for its value, which is always (). For instance, some ML systems
provide a function use of type string → unit. Calling use "myfile" has the
effect of reading the definitions on the file "myfile" into ML.
A function whose argument type is unit passes no information to its body
when called. Calling the function simply causes its body to be evaluated. In
Chapter 5, such functions are used to delay evaluation for programming with
infinite lists.
Exercise 2.6 Write a function to determine whether one time of day, in the
form (hours, minutes, AM or PM), comes before another. As an example,
(11, 59, "AM") comes before (1, 15, "PM").
Exercise 2.7 Old English money had 12 pence in a shilling and 20 shillings in
a pound. Write functions to add and subtract two amounts, working with triples
(pounds, shillings, pence).
2.9 Records 33
2.9 Records
A record is a tuple whose components — called fields — have labels.
While each component of an n-tuple is identified by its position from 1 to n,
the fields of a record may appear in any order. Transposing the components of
a tuple is a common error. If employees are taken as triples (name, age, salary)
then there is a big difference between ("Jones", 25, 15300) and ("Jones",
15300, 25). But the records
{name="Jones", age=25, salary=15300}
and
{name="Jones", salary=15300, age=25}
are equal. A record is enclosed in braces {. . . }; each field has the form label =
expression.
Records are appropriate when there are many components. Let us record five
fundamental facts about some Kings of England, and note ML’s response:
val henryV =
{name = "Henry V",
born = 1387,
crowned = 1413,
died = 1422,
quote = "Bid them achieve me and then sell my bones"};
> val henryV =
> {born = 1387,
> died = 1422,
> name = "Henry V",
> quote = "Bid them achieve me and then sell my bones",
> crowned = 1413}
> : {born: int,
> died: int,
> name: string,
> quote: string,
> crowned: int}
ML has rearranged the fields into a standard order, ignoring the order in which
they were given. The record type lists each field as label : type, within braces.
Here are two more Kings:
val henryVI =
{name = "Henry VI",
born = 1421,
crowned = 1422,
died = 1471,
quote = "Weep, wretched man, \
\ I’ll aid thee tear for tear"};
34 2 Names, Functions and Types
val richardIII =
{name = "Richard III",
born = 1452,
crowned = 1483,
died = 1485,
quote = "Plots have I laid..."};
The quote of henryVI extends across two lines, using the backslash, newline,
backslash escape sequence.
Record patterns. A record pattern with fields label = variable gives each vari-
able the value of the corresponding label. If we do not need all the fields, we
can write three dots (...) in place of the others. Here we get two fields from
Henry V’s famous record, calling them nameV and bornV :
val {name=nameV , born=bornV , ...} = henryV ;
> val nameV = "Henry V" : string
> val bornV = 1387 : int
Often we want to open up a record, making its fields directly visible. We can
specify each field in the pattern as label = label , making the variable and the
label identical. Such a specification can be shortened to simply label . We open
up Richard III:
val {name,born,died ,quote,crowned } = richardIII ;
> val name = "Richard III" : string
> val born = 1452 : int
> val died = 1485 : int
> val quote = "Plots have I laid..." : string
> val crowned = 1483 : int
To omit some fields, write (...) as before. Now quote stands for the quote of
Richard III. Obviously this makes sense for only one King at a time.
Record field selections. The selection #label gets the value of the given label
from a record.
#quote richardIII ;
> "Plots have I laid..." : string
#died henryV - #born henryV ;
> 35 : int
Different record types can have labels in common. Both employees and Kings
have a name, whether "Jones" or "Henry V". The three Kings given above
have the same record type because they have the same number of fields with the
same labels and types.
2.9 Records 35
Here is another example of different record types with some labels in com-
mon: the n-tuple (x1 , x2 , . . . , xn ) is just an abbreviation for a record with num-
bered fields:
{1 = x1 , 2 = x2 , . . . , n = xn }
Yes, a label can be a positive integer! This obscure fact about Standard ML is
worth knowing for one reason: the selector #k gets the value of component k
of an n-tuple. So #1 selects the first component and #2 selects the second. If
there is a third component then #3 selects it, and so forth:
#2 ("a","b",3,false);
> "b" : string
Partial record specifications. A field selection that omits some of the fields
does not completely specify the record type; a function may only be defined
over a complete record type. For instance, a function cannot be defined for all records
that have fields born and died , without specifying the full set of field names (typically
using a type constraint). This restriction makes ML records efficient but inflexible. It
applies equally to record patterns and field selections of the form #label . Ohori (1995)
has defined and implemented flexible records for a variant of ML.
Declaring a record type. Let us declare the record type of Kings. This abbrevi-
ation will be useful for type constraints in functions.
type king = {name : string,
born : int,
crowned : int,
died : int,
quote : string};
> type king
We now can declare a function on type king to return the King’s lifetime:
fun lifetime(k : king) = #died k - #born k ;
> val lifetime = fn : king -> int
Either way the type constraint is mandatory. Otherwise ML will print a message
like ‘A fixed record type is needed here.’
lifetime henryV ;
> 35 : int
lifetime richardIII ;
> 33 : int
36 2 Names, Functions and Types
Exercise 2.8 Does the following function definition require a type constraint?
What is its type?
fun lifetime({name,born,crowned ,died ,quote}) = died - born;
Exercise 2.9 Discuss the differences, if any, between the selector #born and
the function
fun borna t({born}) = born;
The infix status of a name concerns only its syntax, not its value, if any. Usually
a name is made infix before it has any value at all.
Similarly, times has precedence 7 (like * in ML) and constructs a string con-
taining a * sign.
infix 7 times;
fun (a times b) = "(" ˆ a ˆ "*" ˆ b ˆ ")";
> val times = fn : string * string -> string
"m" times "n" times "3" plus "i" plus "j" times "k";
> "((((m*n)*3)+i)+(j*k))" : string
The operator pow has higher precedence than times and associates to the right,
which is traditional for raising to a power. It produces a # sign. (ML has no
operator for powers.)
infixr 8 pow ;
fun (a pow b) = "(" ˆ a ˆ "#" ˆ b ˆ ")";
> val pow = fn : string * string -> string
"m" times "i" pow "j" pow "2" times "n";
> "((m*(i#(j#2)))*n)" : string
Many infix operators have symbolic names. Let ++ be the operator for vector
addition:
infix ++;
fun ((x 1,y1) ++ (x 2,y2)) : vec = (x 1+x 2, y1+y2);
> val ++ = fn : (real * real) * (real * real) -> vec
Keep symbolic names separate. Symbolic names can cause confusion if you
run them together. Below, ML reads the characters +˜ as one symbolic name,
then complains that this name has no value:
1+˜3;
> Unknown name +˜
1+ ˜3;
> ˜2 : int
38 2 Names, Functions and Types
Infix status can be revoked. If ⊕ is an infix operator then the directive nonfix⊕
makes it revert to ordinary function notation. A subsequent infix directive can
make ⊕ an infix operator again.
Here we deprive ML’s multiplication operator of its infix status. The attempt
to use it produces an error message, since we may not apply 3 as a function. But
* can be applied as a function:
nonfix *;
3*2;
> Error: Type conflict...
*(3,2);
> 6 : int
When a function is called, the argument is substituted for the function’s formal
parameter in the body. The evaluation rules differ over when, and how many
times, the argument is evaluated. The formal parameter indicates where in the
body to substitute the argument. The name of the formal parameter has no other
significance, and no significance outside of the function definition.
This value is substituted into the body of f , which then can be evaluated. Pattern-
matching is a minor complication. If f is declared by, say
then substitute the corresponding parts of E ’s value for the pattern variables x , y
and z . (A practical implementation performs no substitutions, but instead binds
the formal parameters in the local environment.)
Consider how ML evaluates sqr (sqr (sqr (2))). Of the three function calls,
only the innermost call has a value for the argument. So sqr (sqr (sqr (2))) re-
duces to sqr (sqr (2 × 2)). The multiplication must now be evaluated, yielding
sqr (sqr (4)). Evaluating the inner call yields sqr (4 × 4), and so forth. Reduc-
40 2 Names, Functions and Types
tions are written sqr (sqr (4)) ⇒ sqr (4 × 4). The full evaluation looks like this:
sqr (sqr (sqr (2))) ⇒ sqr (sqr (2 × 2))
⇒ sqr (sqr (4))
⇒ sqr (4 × 4)
⇒ sqr (16)
⇒ 16 × 16
⇒ 256
Now consider zero(sqr (sqr (sqr (2)))). The argument of zero is the expression
evaluated above. It is evaluated but the value is ignored:
zero(sqr (sqr (sqr (2)))) ⇒ zero(sqr (sqr (2 × 2)))
..
.
⇒ zero(256)
⇒0
Such waste! Functions like zero are uncommon, but frequently a function’s
result does not depend on all of its arguments.
ML ’s evaluation rule is known as call-by-value because a function is always
given its argument’s value. It is not hard to see that call-by-value corresponds to
the usual way we should perform a calculation on paper. Almost all program-
ming languages adopt it. But perhaps we should look for an evaluation rule
that reduces zero(sqr (sqr (sqr (2)))) to 0 in one step. Before such issues can be
examined, we must have a look at recursion.
fact(4) ⇒ 4 × fact(4 − 1)
⇒ 4 × fact(3)
⇒ 4 × (3 × fact(3 − 1))
⇒ 4 × (3 × fact(2))
⇒ 4 × (3 × (2 × fact(2 − 1)))
⇒ 4 × (3 × (2 × fact(1)))
⇒ 4 × (3 × (2 × (1 × fact(1 − 1))))
⇒ 4 × (3 × (2 × (1 × fact(0))))
⇒ 4 × (3 × (2 × (1 × 1)))
⇒ 4 × (3 × (2 × 1))
⇒ 4 × (3 × 2)
⇒4×6
⇒ 24
The computer will not apply such laws unless we force it to. The function facti
keeps a running product in p, which initially should be 1:
fun facti (n,p) =
if n=0 then p else facti(n-1, n *p);
> val facti = fn : int * int -> int
Compare the evaluation for facti (4, 1), shown in Figure 2.2, with that of fact(4).
The intermediate expressions stay small; each multiplication can be done at
once; storage requirements remain constant. The evaluation is iterative — also
termed tail recursive. In Section 6.3 we shall prove that facti gives correct
results by establishing the law facti (n, p) = n! × p.
Good compilers detect iterative forms of recursion and execute them effi-
ciently. The result of the recursive call facti (n − 1, n × p) undergoes no further
computation, but is immediately returned as the value of facti (n, p). Such a tail
call can be executed by assigning the arguments n and p their new values and
then jumping back into the function, avoiding the cost of a proper function in-
vocation. The recursive call in fact is not a tail call because its value undergoes
further computation, namely multiplication by n.
Many functions can be made iterative by adding an argument, like p in facti .
Sometimes the iterative function runs much faster. Sometimes, making a func-
tion iterative is the only way to avoid running out of store. However, adding
2.12 Recursive functions under call-by-value 43
This may look plausible, but every call to badf runs forever. Observe the evalu-
ation of badf (0):
badf (0) ⇒ cond (true, 1, 0 × badf (−1))
⇒ cond (true, 1, 0 × cond (false, 1, −1 × badf (−2)))
..
.
Although cond never requires the values of all three of its arguments, the call-
by-value rule evaluates them all. The recursion cannot terminate.
Conditional and/or. ML’s boolean infix operators andalso and orelse are
not functions, but stand for conditional expressions.
The expression E1 andalso E2 abbreviates
if E1 then E2 else false.
44 2 Names, Functions and Types
These operators compute the boolean and/or, but evaluate E2 only if necessary.
If they were functions, the call-by-value rule would evaluate both arguments.
All other ML infixes are really functions.
The sequential evaluation of andalso and orelse makes them ideal for
expressing recursive predicates (boolean-valued functions). The function pow-
oftwo tests whether a number is a power of two:
fun even n = (n mod 2 = 0);
> val even = fn : int -> bool
fun powoftwo n = (n=1) orelse
(even(n) andalso powoftwo(n div 2));
> val powoftwo = fn : int -> bool
This is the call-by-name rule. It reduces zero(sqr (sqr (sqr (2)))) at once to 0.
But it does badly by sqr (sqr (sqr (2))). It duplicates the argument, sqr (sqr (2)).
The result of this ‘reduction’ is
sqr (sqr (2)) × sqr (sqr (2)).
This happens because sqr (x ) = x × x .
Multiplication, like other arithmetic operations, needs special treatment. It
must be applied to values, not expressions: it is an example of a strict function.
To evaluate E1 × E2 , the expressions E1 and E2 must be evaluated first.
Let us carry on with the evaluation. As the outermost function is ×, which is
strict, the rule selects the leftmost call to sqr . Its argument is also duplicated:
(sqr (2) × sqr (2)) × sqr (sqr (2))
A full evaluation goes something like this.
sqr (sqr (sqr (2))) ⇒ sqr (sqr (2)) × sqr (sqr (2))
⇒ (sqr (2) × sqr (2)) × sqr (sqr (2))
⇒ ((2 × 2) × sqr (2)) × sqr (sqr (2))
⇒ (4 × sqr (2)) × sqr (sqr (2))
⇒ (4 × (2 × 2)) × sqr (sqr (2))
..
.
Does it ever reach the answer? Eventually. But call-by-name cannot be the
evaluation rule we want.
The call-by-need rule (lazy evaluation) is like call-by-name, but ensures that
each argument is evaluated at most once. Rather than substituting an expression
into the function’s body, the occurrences of the argument are linked by pointers.
If the argument is ever evaluated, the value will be shared with its other occur-
rences. The pointer structure forms a directed graph of functions and arguments.
As a part of the graph is evaluated, it is updated by the resulting value. This is
called graph reduction.
Figure 2.3 presents a graph reduction. Every step replaces an occurrence of
sqr (E ) by E × E , where the two E s are shared. There is no wasteful duplica-
tion: only three multiplications are performed. We seem to have the best of both
worlds, for zero(E ) reduces immediately to 0. But the graph manipulations are
expensive.
46 2 Names, Functions and Types
⇒ ⇒
sqr ×
sqr sqr
sqr sqr
2 2
⇒ ⇒
× ×
× ×
sqr ×
2 2
⇒ ⇒
× × 256
× 16
4
2.13 Call-by-need, or lazy evaluation 47
A comparison of strict and lazy evaluation. Call-by-need does the least possible
evaluation. It may seem like the route to efficiency. But it requires much book-
keeping. Realistic implementations became possible only after David Turner
(1979) applied graph reduction to combinators. He exploited obscure facts
about the λ-calculus to develop new compilation techniques, which researchers
continue to improve. Every new technology has its evangelists: some people
are claiming that lazy evaluation is the way, the truth and the light. Why does
Standard ML not adopt it?
Lazy evaluation says that zero(E ) = 0 even if E fails to terminate. This flies
in the face of mathematical tradition: an expression is meaningful only if all its
parts are. Alonzo Church, the inventor of the λ-calculus, preferred a variant (the
λI -calculus) banning constant functions like zero.
Infinite data structures complicate mathematical reasoning. To fully under-
stand lazy evaluation, it is necessary to know some domain theory, as well as
the theory of the λ-calculus. The output of a program is not simply a value,
but a partially evaluated expression. These concepts are not easy to learn, and
many of them are mechanistic. If we can only think in terms of the evaluation
mechanism, we are no better off than the procedural programmers.
Efficiency is problematical too. Sometimes lazy evaluation saves enormous
48 2 Names, Functions and Types
amounts of space; sometimes it wastes space. Recall that facti is more efficient
than fact under strict evaluation, performing each multiplication at once. Lazy
evaluation of facti (n, p) evaluates n immediately (for the test n = 0), but not
p. The multiplications accumulate; we have a space leak (Figure 2.4).
Most lazy programming languages are purely functional. Can lazy evaluation
be combined with commands, such as are used in ML to perform input/output?
Subexpressions would be evaluated at unpredictable times; it would be impos-
sible to write reliable programs. Much research has been directed at combining
functional and imperative programming (Peyton Jones and Wadler, 1993).
The Greatest Common Divisor of two integers is by definition the greatest inte-
ger that divides both. Euclid’s Algorithm is correct because the divisors of m
and n are the same as those of m and n − m, and, by repeated subtraction, the
same as the divisors of m and n mod m. Regarding its efficiency, consider
Euclid’s Algorithm dates from antiquity. We seldom can draw on 2000 years of
expertise, but we should aim for equally elegant and efficient solutions.
Recursion involves reducing a problem to smaller subproblems. The key to
efficiency is to select the right subproblems. There must not be too many of
them, and the rest of the computation should be reasonably simple.
x 10 = (x 5 )2 = (x × x 4 )2 = (x × (x 2 )2 )2
Note how mod tests whether the exponent is even. Integer division (div ) trun-
cates its result to an integer if k is odd. The function power embodies the equa-
tions (for n > 0)
x1 = x
x 2n = (x 2 )n
x 2n +1 = x × (x 2 )n .
Exercise 2.12 Write the computation steps for power (2.0, 29).
Exercise 2.13 How many multiplications does power (x , k ) need in the worst
case?
Exercise 2.14 Why not take k = 0 for the base case instead of k = 1?
50 2 Names, Functions and Types
F0 = 0
F1 = 1
Fn = Fn −2 + Fn −1 . for n ≥ 2
F8 = F6 + F7 = F6 + (F5 + F6 ),
it computes F6 twice.
Each Fibonacci number is the sum of the previous two:
The special name it, by referring to the previous pair, helps us demonstrate the
function:
nextfib (0,1);
> (1, 1) : int * int
nextfib it;
> (1, 2) : int * int
nextfib it;
> (2, 3) : int * int
nextfib it;
> (3, 5) : int * int
It quickly computes (F29 , F30 ), which previously would have required nearly
three million function calls:
2.15 Fibonacci numbers 51
fibpair 30;
> (514229, 832040) : int * int
Let us consider in detail why fibpair is correct. Clearly fibpair (1) = (F0 , F1 ).
And if, for n ≥ 1, we have
fibpair (n) = (Fn −1 , Fn ),
then
fibpair (n + 1) = (Fn , Fn −1 + Fn ) = (Fn , Fn +1 ).
We have just seen a proof of the formula fibpair (n) = (Fn −1 , Fn ) by mathe-
matical induction. We shall see many more examples of such proofs in Chap-
ter 6. Proving properties of functional programs is often straightforward; this is
one of the main advantages of functional languages.
The function fibpair uses a correct and fairly efficient algorithm for comput-
ing Fibonacci numbers, and it illustrates computing with pairs. But its pattern
of recursion wastes space: fibpair builds the nest of calls
nextfib(nextfib(· · · nextfib(0, 1) · · · )).
To make the algorithm iterative, let us turn the computation inside out:
fun itfib (n, prev , curr ) : int =
if n=1 then curr (*does not work for n=0*)
else itfib (n-1, curr , prev +curr );
> val itfib = fn : int * int * int -> int
Exercise 2.16 Show that the number of steps needed to compute Fn by its
recursive definition is exponential in n. How many steps does fib perform?
Assume that call-by-value is used.
There are faster methods of computing square roots, but ours is respectably fast
and is a simple demonstration of recursion.
introot 123456789;
> 11111 : int
it *it;
> 123454321 : int
introot 2000000000000000000000000000000;
2.16 Integer square roots 53
Exercise 2.18 Code this integer square root algorithm using iteration in a pro-
cedural programming language.
Local declarations
Reducing the fraction n/d to least terms, where n and d have no com-
mon factor, involves dividing both numbers by their GCD.
fun fraction (n,d ) = (n div gcd (n,d ), d div gcd (n,d ));
But this is a contorted way of giving gcd (n, d ) the name com. ML allows the
declaration of names within an expression:
fun fraction (n,d ) =
let val com = gcd (n,d )
in (n div com, d div com) end;
> val fraction = fn : int * int -> int * int
let D in E end
created is visible only inside the let expression. Then the expression E is
evaluated, and its value returned.
Typically D is a compound declaration, which consists of a list of declara-
tions:
D1 ; D2 ; . . . ; Dn
The effect of each declaration is visible in subsequent ones. The semicolons are
optional and many programmers omit them.
Nested function declarations. Our square root function is still not ideal. The
arguments a and acc are passed unchanged in every recursive call of findroot.
They can be made global to findroot for efficiency and clarity.
A further let declaration nests findroot within sqroot. The accuracy acc is
declared first, to be visible in findroot; the argument a is also visible.
fun sqroot a =
let val acc = 1.0E˜10
2.18 Hiding declarations using local 55
fun findroot x =
let val nextx = (a/x + x ) / 2.0
in if abs (x -nextx ) < acc *x
then nextx else findroot nextx
end
in findroot 1.0 end;
> val sqroot = fn : real -> real
When not to use let. Consider taking the minimum of f (x ) and g(x ). You
could name these quantities using let:
let val a = f x
val b = g x
in
if a<b then a else b
end
Exercise 2.20 Above we have used local to hide the function itfib. Why not
simply nest the declaration of itfib within fib? Compare with the treatment of
findroot and sqroot.
Exercise 2.21 Using let, we can eliminate the expensive squaring operation
in our integer square root function. Code a variant of introot that maps n to its
integer square root k , paired with the difference n − k 2 . Only simple multipli-
cations and divisions are needed; an optimizing compiler could replace them by
bit operations.
evaluates the expressions E1 , . . ., En and then declares the identifiers Id1 , . . ., Idn
to have the corresponding values. Since the declarations do not take effect until
all the expressions are evaluated, their order is immaterial.
Here we declare names for π, e and the logarithm of 2.
val pi = 4.0 * Math.atan 1.0
and e = Math.exp 1.0
and log2 = Math.ln 2.0;
> pi = 3.141592654 : real
> e = 2.718281828 : real
> log2 = 0.693147806 : real
> val five = "BONG BONG BONG BONG BONG " : string
This is, of course, a silly thing to do! But it illustrates that the declarations occur
at the same time. Consecutive declarations would give one and three identical
bindings.
π 1 1 1 1 1
= 1 − + − ··· + − ···
4 3 5 7 4k + 1 4k + 3
By mutual recursion, the final term of the summation can be either positive or
negative:
fun pos d = neg(d -2.0) + 1.0/d
and neg d = if d >0.0 then pos(d -2.0) - 1.0/d
else 0.0;
> val pos = fn : real -> real
> val neg = fn : real -> real
Mutually recursive functions can often be combined into one function with the
help of an additional argument:
fun sum(d ,one) =
58 2 Names, Functions and Types
Now sum(d ,1.0) returns the same value as pos(d ), and sum(d ,˜1.0)
returns the same value as
neg(d ).
For each of the labels, F , G and H , declare mutually recursive functions. The
argument of each function is a tuple holding all of the variables.
fun F (x ,y,z ) = G(x +1,y,z )
and G(x ,y,z ) = if y<z then F (x ,y,z ) else H (x ,x +y,z )
and H (x ,y,z ) = if z >0 then F (x ,y,z -x ) else (x ,y,z );
> val F = fn : int * int * int -> int * int * int
> val G = fn : int * int * int -> int * int * int
> val H = fn : int * int * int -> int * int * int
Calling f (0, 0, 0) gives x , y and z their initial values for execution, and returns
the result of the procedural code.
f (0,0,0);
> (1, 1, 0) : int * int * int
Introduction to modules
An engineer understands a device in terms of its component parts, and
those, similarly, in terms of their subcomponents. A bicycle has wheels; a wheel
has a hub; a hub has bearings, and so forth. It takes several stages before we
reach the level of individual pieces of metal and plastic. In this way one can un-
derstand the entire bike at an abstract level, or parts of it in detail. The engineer
can improve the design by modifying one part, often without thinking about the
other parts.
Programs (which are more complicated than bicycles!) should also be seen as
consisting of components. Traditionally, a subprogram is a procedure or func-
tion, but these are too small — it is like regarding the bicycle as composed of
thousands of metal shapes. Many recent languages regard programs as consist-
ing of modules, each of which defines its own data structures and associated
operations. The interface to each module is specified separately from the mod-
ule itself. Different modules can therefore be coded by different members of a
project team; the compiler can check that each module meets its interface spec-
ification.
Consider our vector example. The function addvec is useless in isolation; it
must be used together with other vector operations, all sharing the same repre-
sentation of vectors. We can guess that the other operations are related because
their names all end with vec, but nothing enforces this naming convention. They
should be combined together to form a program module.
An ML structure combines related types, values and other structures, with a
uniform naming discipline. An ML signature specifies a class of structures by
listing the name and type (or other attributes) of each component.
Standard ML’s signatures and structures have analogues in other languages,
such as Modula-2’s definition and implementation modules (Wirth, 1985). ML
also provides functors — structures taking other structures as parameters — but
we shall defer these until Chapter 7.
ber has the form x + iy, where x and y are real numbers and i is a constant
postulated to satisfy i 2 = −1. Thus, x and y determine the complex number.
The complex number zero is 0 + i 0. The sum of two complex numbers con-
sists of the sums of the x and y parts; the difference is similar. The definitions of
product and reciprocal look complicated, but are easy to justify using algebraic
laws and the axiom i 2 = −1:
(x + iy) + (x 0 + iy 0 ) = (x + x 0 ) + i (y + y 0 )
(x + iy) − (x 0 + iy 0 ) = (x − x 0 ) + i (y − y 0 )
(x + iy) × (x 0 + iy 0 ) = (xx 0 − yy 0 ) + i (xy 0 + x 0 y)
1/(x + iy) = (x − iy)/(x 2 + y 2 )
Further reading. Penrose (1989) explains the complex number system in more
detail, with plenty of motivation and examples. He discusses the connections
between the complex numbers and fractals, including a definition of the Mandelbrot
set. Later in the book, the complex numbers play a central rôle in his discussion of
quantum mechanics. Penrose gives the complex numbers a metaphysical significance;
that might be taken with a pinch of salt! Feynman et al. (1963) give a more technical
but marvellously enjoyable description of the complex numbers in Chapter 22.
2.21 Structures
Declarations can be grouped to form a structure by enclosing them in
the keywords struct and end. The result can be bound to an ML identifier
using a structure declaration:
2.21 Structures 61
structure Complex =
struct
type t = real *real ;
val zero = (0.0, 0.0);
fun sum ((x ,y), (x 0 ,y 0 )) = (x +x 0 , y+y 0 ) : t;
fun diff ((x ,y), (x 0 ,y 0 )) = (x -x 0 , y-y 0 ) : t;
fun prod ((x ,y), (x 0 ,y 0 )) = (x *x 0 - y *y 0 , x *y 0 + x 0 *y) : t;
fun recip (x ,y) =
let val t = x *x + y *y
in (x /t, ˜y/t) end
fun quo (z ,z 0 ) = prod (z , recip z 0 );
end;
In two steps we form the sum a + i + 0.7, which equals 1 + i . Finally we square
that number to obtain 2i :
val b = Complex .sum(a,i);
> val b = (0.3, 1.0) : Complex.t
Complex .sum(b, (0.7, 0.0));
> (1.0, 1.0) : Complex.t
Complex .prod (it,it);
> (0.0, 2.0) : Complex.t
Observe that Complex .t is the same type as real ×real ; what is more confusing,
it is the same type as vec above. Chapter 7 describes how to declare an abstract
type, whose internal representation is hidden.
Structures look a bit like records, but there are major differences. A record’s
components can only be values (including, perhaps, other records). A structure’s
62 2 Names, Functions and Types
components may include types and exceptions (as well as other structures). But
you cannot compute with structures: they can only be created when the program
modules are being linked together. Structures should be seen as encapsulated
environments.
2.22 Signatures
A signature is a description of each component of a structure. ML re-
sponds to our declaration of the structure Complex by printing its view of the
corresponding signature:
structure Complex = ...;
> structure Complex :
> sig
> type t
> val diff : (real * real) * (real * real) -> t
> val prod : (real * real) * (real * real) -> t
> val quo : (real * real) * (real * real) -> t
> val recip : real * real -> real * real
> val sum : (real * real) * (real * real) -> t
> val zero : real * real
> end
The keywords sig and end enclose the signature body. It shows the types of all
the components that are values, and mentions the type t. (Some compilers dis-
play eqtype t instead of type t, informing us that t is a so-called equality
type.)
The signature inferred by the ML compiler is frequently not the best one for
our purposes. The structure may contain definitions that ought to be kept private.
By declaring our own signature and omitting the private names, we can hide
them from users of the structure. We might, for instance, hide the name recip.
The signature printed above expresses the type of complex numbers some-
times as t and sometimes as real × real . If we use t everywhere, then we obtain
a general signature that specifies a type t equipped with operators sum, prod ,
etc.:
signature ARITH =
sig
type t
val zero : t
val sum : t * t -> t
val diff : t * t -> t
val prod : t * t -> t
val quo : t * t -> t
end;
2.22 Signatures 63
The declaration gives the name ARITH to the signature enclosed within the
brackets sig and end. We can declare other structures and make them con-
form to signature ARITH . Here is the skeleton of a structure for the rational
numbers:
structure Rational : ARITH =
struct
type t = int *int;
val zero = (0, 1);
..
.
end;
Exercise 2.24 Declare a structure Real , matching signature ARITH, such that
Real .t is the type real and the components zero, sum, prod , etc., denote the
corresponding operations on type real .
• Weakly typed languages like Lisp and Prolog give programmers the
freedom they need when writing large programs.
• Strongly typed languages like Pascal give programmers security by re-
stricting their freedom to make mistakes.
Polymorphic type checking offers a new position: the security of strong type
checking, as well as great flexibility. Programs are not cluttered with type spec-
ifications since most type information is deduced automatically.
A type denotes a collection of values. A function’s argument type specifies
which values are acceptable as arguments. The result type specifies which values
could be returned as results. Thus, div demands a pair of integers as argument;
its result can only be an integer. If the divisor is zero then there will be no result
at all: an error will be signalled instead. Even in this exceptional situation, the
function div is faithful to its type.
ML can also assign a type to the identity function, which returns its argument
unchanged. Because the identity function can be applied to an argument of any
type, it is polymorphic. Generally speaking, an object is polymorphic if it can be
regarded as having multiple types. ML polymorphism is based on type schemes,
which are like patterns or templates for types. For instance, the identity function
has the type scheme α → α.
if E then E1 else E2
The constants 0 and 1 have type int. Therefore n=0 and n-1 involve integers,
so n has type int. Now n *p must be integer multiplication, so p has type int.
2.24 Polymorphic function declarations 65
Since p is returned as the result of facti , its result type is int and its argument
type is int × int. This fits with the recursive call. Having made all these checks,
ML can respond
If the types are not consistent, the compiler rejects the declaration.
This type is polymorphic because it contains a type variable, namely ’a. In ML,
type variables begin with a prime (single quote) character.
’b ’c ’we_band_of_brothers ’3
Let us write α, β, γ for the ML type variables ’a, ’b, ’c, because type variables
are traditionally Greek letters. Write x : τ to mean ‘x has type τ ,’ for instance
pairself : α → (α × α). Incidentally, × has higher precedence than →; the
type of pairself can be written α → α × α.
A polymorphic type is a type scheme. Substituting types for type variables
forms an instance of the scheme. A value whose type is polymorphic has in-
finitely many types. When pairself is applied to a real number, it effectively has
type real → real × real .
pairself 4.0;
> (4.0, 4.0) : real * real
Projection functions return a component of a pair. The function fst returns the
first component; snd returns the second:
fun fst (x ,y) = x ;
> val fst = fn : ’a * ’b -> ’a
fun snd (x ,y) = y;
> val snd = fn : ’a * ’b -> ’b
The type of fst is α × β → α, with two type variables. The argument pair may
involve any two types τ1 and τ2 (not necessarily different); the result has type τ1 .
Polymorphic functions can express other functions. The function that takes
((x , y), w ) to x could be coded directly, but two applications of fst also work:
fun fstfst z = fst(fst z );
> val fstfst = fn : (’a * ’b) * ’c -> ’a
fstfst pp;
> "Help!" : string
Its type, α → α, suggests that silly is the identity function. This function can
be expressed rather more directly:
fun I x = x ;
> val I = fn : ’a -> ’a
Further issues. Milner (1978) gives an algorithm for polymorphic type check-
ing and proves that a type-correct program cannot suffer a run-time type error.
2.24 Polymorphic function declarations 67
Damas and Milner (1982) prove that the types inferred by this algorithm are princi-
pal: they are as polymorphic as possible. Cardelli and Wegner (1985) survey several
approaches to polymorphism. For Standard ML, things are quite complicated.
Equality testing is polymorphic in a limited sense: it is defined for most, not all,
types. Standard ML provides a class of equality type variables to range over this re-
stricted collection of types. See Section 3.14.
Recall that certain built-in functions are overloaded: addition (+) is defined for in-
tegers and reals, for instance. Overloading sits uneasily with polymorphism. It com-
plicates the type checking algorithm and frequently forces programmers to write type
constraints. Fortunately there are only a few overloaded functions. Programmers cannot
introduce further overloading.