Elements of ML
Programming
ML97 Edition
Jeffrey D. Ullman
An Alan R. Apt Book
Prentice Hall
Upper Saddle River, New Jersey 07458
Ullman, Jeffrey D.
Elements of ML programming/ Jeffrey D. Ullman
ML97 Edition
p. cm.
“An Alan R. Apt Book”
Includes bibliographical references and index.
ISBN: 0-13-790387-1
1. ML (Computer program language). I. Title
CIP DATA AVAILABLE
Acquisitions editor: ALAN R. APT
Editor-in-chief: MARCIA HORTON
Production editor: IRWIN ZUCKER
Managing editor: BAYANI MENDOZA DE LEON
Director of production and manufacturing: DAVID W. RICCARDI
Cover director: JAYNE CONTE
Manufacturing buyer: JULIA MEEHAN
Editorial assistant: TONI CHAVEZ
© 1998, 1994 by Prentice-Hall, Inc.
Upper Saddle River, New Jersey 07458
All rights reserved. No part of this book may be
reproduced, in any form or by any means,
without permission in writing from the publisher.
The author and publisher of this book have used their best efforts in preparing this book. These efforts include the
development, research, and testing of the theories and programs to determine their effectiveness. The author and
publisher make no warranty of any kind, expressed or implied, with regard to these programs or the documentation
contained in this book. The author and publisher shall not be liable in any event for incidental or consequential damages
in connection with, or arising out of, the furnishing, performance, or use of these programs.
Printed in the United States of America
098765
ISBN 0-13-790387-1
Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico City
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Pearson Education Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro
Preface
I became interested in ML programming when I taught CS109, the introductory
Computer Science Foundations course at Stanford, starting in 1991. ML
was used by several of the instructors of this course, including Stu Reges and
Mike Cleron, to introduce concepts such as functional programming and type
systems. It was also used for the practical purpose of introducing a second
programming paradigm, other than the Pascal or C that students learned in
the introductory programming course. Reimplementing algorithms and data
structures in a significantly different language often is an aid to understanding
of basic data structure and algorithm concepts.
I first learned ML from the notes that Reges and Cleron had written for
their students. Initially, I was intrigued by the rule system, which gave me
much of the power of Prolog, a language with which I had worked for several
years. Yet ML did not introduce the semantic complexity that comes from the
use of unification and backtracking in Prolog. However, I soon discovered other
charms of ML: the type system, the use of exceptions, and the module system
for creating abstract datatypes, among others. From the Reges and Cleron
notes I also picked up the utility of giving the student a fast overview, stressing
the most commonly used constructs rather than the complete syntax.
In writing this guide to ML programming, I have thus departed from the
approach found in many books on the language. As an outsider, I had the
opportunity to learn the language from the standpoint of the typical program-
mer. I have tried to remember how things struck me at first, the analogies I
drew with conventional languages, and the concepts that I found most useful
in getting started. I hope that my selection is accurate, and that the book will
facilitate the reader’s transition from conventional languages to ML.
The Second Edition
You are reading the second edition of the book. The primary change between
the first and second editions is that the second conforms to the new language
standard called ML97. All major implementations of ML either have converted,
or are in the process of converting, to this standard. For the few matters that
are implementation-dependent, such as the choice of diagnostics, the second
edition, like the first, follows the Standard ML of New Jersey (SML/NJ) implementation.
SML/NJ is the work of Andrew Appel of Princeton University,
David MacQueen of Lucent/Bell Labs, and their colleagues.
The following is both a summary of the second edition and a guide to the
correspondence between the first and second editions.
Chapter 1 corresponds to the old Chapter 0.
Chapter 2 lays the groundwork for ML programming. Sections 2.1 through
2.4 correspond to the old Chapters 1 through 4, respectively.
Chapter 3 introduces functions in ML. The old Chapter 5 is now Sections
3.1 and 3.2, while the old Chapter 6 appears in Sections 3.3 and 3.6. Old
Chapter 7 has become Section 3.4, while Sections 3.5 and much of Section
3.6 are new.
Chapter 4 covers ML input and output. The old Chapter 9 has been split
among Sections 4.1, 4.2, and 4.4, while the new Section 4.3 comes from
the old Chapter 22.
In Chapter 5 we return to the subject of functions in ML, presenting a
number of advanced topics. Section 5.1 covers matches and patterns like
the old Chapter 19. Section 5.2 covers exceptions, from the old Chapters
8 and 20. Section 5.3 covers polymorphism as in the old Chapter 10.
Material on higher-order functions from the old Chapters 11 and 21 is
now split among Sections 5.4 through 5.6. The case study in Section 5.7
was originally part of the old Chapter 22.
Chapter 6 introduces datatypes. The old Chapter 12 is split between
Sections 6.1 and 6.2, while old Chapter 13 is now Sections 6.3 and 6.4.
Chapter 7 presents a number of advanced topics about data structures.
Section 7.1 covers record structures, the old Chapter 18. Material on
arrays, the old Chapter 16, is now in Sections 7.2 and 7.4. Old Chapter
17, about references, is in Sections 7.3 and 7.5.
Chapter 8 covers the ML module system. Old Chapter 14 is split among
Sections 8.1 through 8.3, and old Chapter 15 is now in Sections 8.4 and
8.5. The case study in Section 8.6 is new.
Finally, Chapter 9 attempts to summarize the entire language. Some
concepts not appearing elsewhere in the book are introduced as well.
Section 9.1, on infix operators, corresponds to the old Chapter 24. Sections
9.2 and 9.3 summarize what is called the “top-level environment,” the set
of features one has in ML without asking for them explicitly. Then Section
9.4 summarizes the “standard basis,” or capabilities one can obtain if one
calls for them explicitly. Some of the old Chapter 25 is spread among
Sections 9.2 through 9.4, but much of these sections is new for ML97.
Section 9.5 corresponds to the old Chapter 23 and covers some important
features found only in SML/NJ, involving the creation of executable files.
Section 9.6 concludes with syntax diagrams for the entire language and is
an updating of the old Chapter 26.
Features of the Book
The test of a language is not the best or most succinct examples of its use.
Rather, a language will only be adopted widely if it can handle everyday pro-
gramming chores well. Thus, I have considered in this book many of the most
common data structures, such as trees and hash tables, and many of the most
common algorithms, such as sorting or Gaussian elimination. I think the reader
will be impressed by how well ML handles these standard tasks that were se-
lected because of their ubiquity, not because they exhibit special features of the
language.
To focus the reader's attention, I have inserted boxes at various places in
the text. These boxes are interruptions from the main focus of the text, but
they are sufficiently important that I want to make sure they are noticed. On
the other hand, footnotes are also interruptions to the main thread, but they
are there only “for the record” rather than as an aid to understanding.
Exercises
Most of the sections have exercises at the end. In the text, we indicate that an
exercise or part of an exercise has a published solution by preceding the exercise
or part of an exercise by a star. You can find solutions to exercises with stars
at URL http://www-db.stanford.edu/~ullman/emlpsols/sols.html.
Exercises are graded by difficulty. Harder exercises are indicated by an
exclamation point in the margin, and a few of the hardest exercises have two
exclamation points.
Use of the Book
The book as a whole is a tutorial and reference for the person who wants
to program productively in ML. Although there are occasional references to
conventional languages like C or Pascal, I believe the book is sufficiently self-
contained that it could be used to teach ML as a first programming language.
When we teach students ML in the CS109 course at Stanford, the material
covered corresponds closely to what is in Chapters 2, 3, 4.1, 5, 6, and 7. The
book can be used as a supplement to a programming language concepts course,
in which case Chapter 8 would surely be included.
Acknowledgments
I would like to thank Andrew Appel and David MacQueen, both of whom
carefully critiqued the original edition. Matthias Blume was equally a boon
for the second edition. They are all three tied for the title of “world’s greatest
referee.”
I value a number of important pointers on ML from John Mitchell. Also,
Henry Bauer, Richard LeBlanc, Peter Robinson, and Jean Scholtz have my
appreciation for their work as referees of the first-edition manuscript.
Errata from the first edition were found and pointed out to me by Baoquan
Chen, Franklin Chen, Martin Erwig, Mark Girod, Naomichi Komuro, Hugh
McGuire, Jeffrey Oldham, David Richardson, and Daniel Yankelevich.
Finally, I appreciate the help from several members of the core ML commu-
nity that kept me informed of changes and encouraged me to keep this book on
track. I even got help on the design of the cover for the first edition (see the
following note), which has been carried over to the second edition as well.
Cover Art
Special thanks go to Luca Cardelli, who volunteered to create original art for
this book. The result is on the cover.
Supplementary Material on the Web
The book’s home page is www-db.stanford.edu/~ullman/emlp.html. There
you can find:
1. Solutions to starred exercises.
2. Code for the major programs in the book.
3. Errata.
4. Notes and exams from Stanford’s CS109 involving ML.
5. Links to ML documentation and resources.
J.D. U.
Stanford CA
September, 1997
Table of Contents
1 A Perspective on ML and SML/NJ
  1.1 Why ML?
  1.2 Standard ML of New Jersey
  1.3 Prerequisites for the Reader
  1.4 References and Web Resources
  1.5 Features of ML97
2 Getting Started in ML
  2.1 Expressions
    2.1.1 Constants
    2.1.2 Arithmetic Operators
    2.1.3 String Operators
    2.1.4 Comparison Operators
    2.1.5 Combining Logical Values
    2.1.6 If-Then-Else Expressions
    2.1.7 Exercises for Section 2.1
  2.2 Type Consistency
    2.2.1 Type Errors
    2.2.2 Coercion Between Integers and Reals
    2.2.3 Coercions Between Characters and Integers
    2.2.4 Coercions Between Strings and Characters
    2.2.5 Exercises for Section 2.2
  2.3 Variables and Environments
    2.3.1 Identifiers
    2.3.2 The Top-Level Environment
    2.3.3 An Assignment-Like Statement
    2.3.4 A View of ML Programming
    2.3.5 Exercises for Section 2.3
  2.4 Tuples and Lists
    2.4.1 Tuples
    2.4.2 Accessing Components of Tuples
    2.4.3 Lists
    2.4.4 List Notation and Operators
    2.4.5 Converting Between Character Strings and Lists
    2.4.6 Introduction to the ML Type System
    2.4.7 Exercises for Section 2.4
3 Defining Functions
  3.1 It’s Easy; It’s fun
    3.1.1 Function Types
    3.1.2 Declaring Function Types
    3.1.3 Function Application
    3.1.4 Functions With More Than One Parameter
    3.1.5 Functions that Reference External Variables
    3.1.6 Exercises for Section 3.1
  3.2 Recursive Functions
    3.2.1 Function Execution
    3.2.2 Nonlinear Recursion
    3.2.3 Mutual Recursion
    3.2.4 How ML Deduces Types
    3.2.5 Exercises for Section 3.2
  3.3 Patterns in Function Definitions
    3.3.1 Patterns as Function Parameters
    3.3.2 “As” You Like it: Having it Both Ways
    3.3.3 Anonymous Variables
    3.3.4 What Is and What Isn't a Pattern?
    3.3.5 How ML Matches Patterns
    3.3.6 A Subtle Pattern Bug
    3.3.7 Exercises for Section 3.3
  3.4 Local Environments Using let
    3.4.1 Defining Common Subexpressions
    3.4.2 Effect on Environments of let
    3.4.3 Splitting Apart the Value Returned by a Function
    3.4.4 Mergesort: An Efficient, Recursive Sorter
    3.4.5 Exercises for Section 3.4
  3.5 Case Study: Linear-Time Reverse
    3.5.1 Analysis of Simple Reverse
    3.5.2 ML's Representation of Lists
    3.5.3 A Reversal Function Using Difference Lists
    3.5.4 Analysis of Fast Reverse
    3.5.5 Exercises for Section 3.5
  3.6 Case Study: Polynomial Multiplication
    3.6.1 Representing Polynomials by Lists
    3.6.2 A Simple Polynomial-Multiplication Algorithm
    3.6.3 Analysis of Simple Multiplication
    3.6.4 Auxiliary Functions for a Faster Multiplication
    3.6.5 The Karatsuba-Ofman Algorithm
    3.6.6 Analysis of the Karatsuba-Ofman Algorithm
    3.6.7 Exercises for Section 3.6
4 Input and Output
  4.1 Simple Output
    4.1.1 The Print Function
    4.1.2 Printing Nonstring Values
    4.1.3 “Statement” Lists
    4.1.4 Statement Lists Versus Let-Expressions
    4.1.5 Exercises for Section 4.1
  4.2 Reading Input From a File
    4.2.1 Instreams
    4.2.2 Reading Characters From a File
    4.2.3 Reading Lines of a File
    4.2.4 Reading Complete Files
    4.2.5 Reading a Single Character
    4.2.6 Lookahead on the Input
    4.2.7 Closing Instreams
    4.2.8 Exercises for Section 4.2
  4.3 Output to Files
    4.3.1 Outstreams
    4.3.2 Closing Outstreams
    4.3.3 The output Command
    4.3.4 Exercises for Section 4.3
  4.4 Case Study: Summing Integers
    4.4.1 The Function startInt
    4.4.2 The Function finishInt
    4.4.3 The Function getInt
    4.4.4 The Function sumInts
    4.4.5 Eager Evaluation
    4.4.6 Exercises for Section 4.4
5 More About Functions
  5.1 Matches and Patterns
    5.1.1 Matches
    5.1.2 Using Matches to Define Functions
    5.1.3 Anonymous Functions
    5.1.4 Case Expressions
    5.1.5 If-Then-Else Expressions Revisited
    5.1.6 Exercises for Section 5.1
  5.2 Exceptions
    5.2.1 User-Defined Exceptions
    5.2.2 Expressions With Parameters
    5.2.3 Handling Exceptions
    5.2.4 Exceptions as Elements of an Environment
    5.2.5 Local Exceptions
    5.2.6 Exercises for Section 5.2
  5.3 Polymorphic Functions
    5.3.1 A Limitation on the Use of Polymorphic Functions
    5.3.2 Operators that Restrict Polymorphism
    5.3.3 Operators that Allow Polymorphism
    5.3.4 The Equality Operators
    5.3.5 Exercises for Section 5.3
  5.4 Higher-Order Functions
    5.4.1 Some Common Higher-Order Functions
    5.4.2 A Simple Map Function
    5.4.3 The Function reduce
    5.4.4 Converting Infix Operators to Function Names
    5.4.5 The Function Filter
    5.4.6 Exercises for Section 5.4
  5.5 Curried Functions
    5.5.1 Partially Instantiated Functions
    5.5.2 The ML Style of Function Application
    5.5.3 Exercises for Section 5.5
  5.6 Built-In Higher-Order Functions
    5.6.1 Composition of Functions
    5.6.2 The ML Operator o For Composition
    5.6.3 The “Real” Version of Map
    5.6.4 Folding Lists
    5.6.5 Exercises for Section 5.6
  5.7 Case Study: Parsing Expressions
    5.7.1 The Grammatical Structure of Arithmetic Expressions
    5.7.2 Structure of the Parsing Program
    5.7.3 Detailed Explanation of the Parser Code
    5.7.4 Exercises for Section 5.7
6 Defining Your Own Types
  6.1 Defining New Types
    6.1.1 Review of the ML Type System
    6.1.2 New Names for Old Types
    6.1.3 Parametrized Type Definitions
    6.1.4 Exercises for Section 6.1
  6.2 Datatypes
    6.2.1 A Simple Form of Datatype Declaration
    6.2.2 Using Constructor Expressions in Datatype Definitions
    6.2.3 Recursively Defined Datatypes
    6.2.4 Mutually Recursive Datatypes
    6.2.5 Exercises for Section 6.2
  6.3 Case Study: Binary Trees
    6.3.1 Binary Search Trees
    6.3.2 Lookup in Binary Search Trees
    6.3.3 Insertion into Binary Search Trees
    6.3.4 Deletion from Binary Search Trees
    6.3.5 Some Comments About Running Time
    6.3.6 Visiting All the Nodes of a Binary Tree
    6.3.7 Preorder Traversals
    6.3.8 Exercises for Section 6.3
  6.4 Case Study: General Rooted Trees
    6.4.1 A Datatype for Trees
    6.4.2 Summing the Labels of a General Tree
    6.4.3 Computing Sums Using Higher Order Functions
    6.4.4 Exercises for Section 6.4
7 More About ML Data Structures
  7.1 Record Structures
    7.1.1 Records and Their Types
    7.1.2 Extracting Field Values
    7.1.3 Tuples as a Special Case of Record Structures
    7.1.4 Patterns That Match Records
    7.1.5 Shorthands in Record Patterns
    7.1.6 Exercises for Section 7.1
  7.2 Arrays
    7.2.1 Why Do We Need Arrays?
    7.2.2 Array Operations
    7.2.3 Exercises for Section 7.2
  7.3 References
    7.3.1 The ref Type Constructor
    7.3.2 Obtaining the Value of a Ref-Variable
    7.3.3 Modifying Ref-Variables
    7.3.4 The While-Do Statement
    7.3.5 Exercises for Section 7.3
  7.4 Case Study: Hash Tables
    7.4.1 The Dictionary Operations
    7.4.2 How a Hash Table Works
    7.4.3 An Example of Hash Table Implementation
    7.4.4 Exercises for Section 7.4
  7.5 Case Study: Triangularization of a Matrix
    7.5.1 Creating and Initializing the Matrix
    7.5.2 Triangularization by Row Operations
    7.5.3 Exercises for Section 7.5
8 Encapsulation and the ML Module System
  8.1 Why Modules?
    8.1.1 Information Hiding
    8.1.2 Clustering Connected Elements
  8.2 Structures
    8.2.1 Signatures
    8.2.2 Restricting Structures Through Their Signatures
    8.2.3 Accessing Names Defined Within Structures
    8.2.4 Opening Structures
    8.2.5 Exercises for Section 8.2
  8.3 Functors
    8.3.1 Motivation for Functors
    8.3.2 Using Functors to Import Information
    8.3.3 More General Forms for Functor Parameters and Arguments
    8.3.4 Exercises for Section 8.3
  8.4 Sharings
    8.4.1 Sharing Specifications
    8.4.2 Substructures
    8.4.3 Sharing of Types
    8.4.4 Sharing of Substructures
    8.4.5 Exercises for Section 8.4
  8.5 ML Techniques for Hiding Information
    8.5.1 An Information-Hiding Problem
    8.5.2 Using Signatures to Hide Information
    8.5.3 Abstract Types
    8.5.4 Local Definitions
    8.5.5 Opaque Signatures
    8.5.6 Exercises for Section 8.5
  8.6 Case Study: Feedback Shift Registers
    8.6.1 Operation of a Feedback Shift Register
    8.6.2 A Functor to Create Random Number Generators
    8.6.3 Generating a Feedback Shift Register
    8.6.4 Exercises for Section 8.6
9 Summary of the ML Standard Basis
  9.1 The Infix Operators
    9.1.1 Precedence
    9.1.2 Precedence Levels in ML
    9.1.3 Associativity of Operators
    9.1.4 Creating New Infix Operators
    9.1.5 Infix Data Constructors
    9.1.6 Exercises for Section 9.1
  9.2 Functions in the Top-Level Environment
    9.2.1 Functions on Integers
    9.2.2 Functions on Reals
    9.2.3 Functions on Booleans
    9.2.4 Functions on Characters
    9.2.5 Functions on Strings
    9.2.6 Functions on Options
    9.2.7 Functions on References
    9.2.8 Functions on Lists
    9.2.9 Functions on Exceptions
    9.2.10 Functions Affecting Return Values
    9.2.11 Exercises for Section 9.2
  9.3 Top-Level Types and Exceptions
    9.3.1 Primitive Types
    9.3.2 Primitive Type Constructors
    9.3.3 Primitive Datatypes
    9.3.4 Top-Level Exceptions
    9.3.5 Exercises for Section 9.3
  9.4 Structures of the Standard Basis
    9.4.1 The Structure Int
    9.4.2 The Structure Word
    9.4.3 The Structures Real and Math
    9.4.4 The Structure Char
    9.4.5 The Structure String
    9.4.6 The Structure Substring
    9.4.7 The Structure List
    9.4.8 The Structure Array
    9.4.9 The Structure Vector
    9.4.10 The Structure OS
    9.4.11 The Structures Time and Timer
    9.4.12 What If I Lose a Name?
    9.4.13 Exercises for Section 9.4
  9.5 Additional Features of SML/NJ
    9.5.1 Exporting Functions
    9.5.2 Exporting the ML Environment
    9.5.3 Exercises for Section 9.5
  9.6 Summary of ML Syntax
    9.6.1 Lexical Categories
    9.6.2 Some Simplifications to the Grammatical Structure
    9.6.3 Expressions
    9.6.4 Matches and Patterns
    9.6.5 Types
    9.6.6 Declarations
    9.6.7 Signatures
    9.6.8 Structures
    9.6.9 Functors
    9.6.10 Programs
Index
Chapter 1
A Perspective on ML and
SML/NJ
In this preliminary chapter we shall introduce the reader to the history of ML
and the reasons for its existence and popularity. We shall also present the
mechanics of using a particular implementation of ML, called Standard ML
of New Jersey, which is the implementation used for examples in this book.
Finally, some general references on the language are given, including URL’s for
obtaining ML compilers and on-line documentation.
1.1 Why ML?
ML is a relatively new language that has some extremely interesting features.
Its designers incorporated many modern programming-language ideas, yet the
language is surprisingly easy to learn and use. In this section we shall enumerate
the most important of these features.
A Functional Language
ML is primarily a functional language, meaning that the basic mode of compu-
tation is the definition and application of functions. Functions can be defined
by the user as in conventional languages, by writing code for the function. But
it is also possible in ML to treat functions as values and compute new functions
from them with operators like function composition.
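As a small illustration of treating functions as values, the sketch below builds a new function from two existing ones using ML's composition operator o. The names double, increment, and doubleThenIncrement are made up for this example and do not come from the text.

```sml
(* Two ordinary functions, defined by writing code for them. *)
fun double (x : int) = 2 * x;
fun increment (x : int) = x + 1;

(* A new function computed from existing ones by composition:
   (f o g) x means f (g x). *)
val doubleThenIncrement = increment o double;

(* increment (double 3) = increment 6 = 7 *)
val seven = doubleThenIncrement 3;
```

Note that doubleThenIncrement is introduced with val rather than fun: it is the value of an expression, just as a number would be.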
Side-Effect Freedom
A consequence of the functional style is that computation proceeds by eval-
uating expressions, not by making assignments to variables. There are ways
to give expressions side-effects, which are operations that permanently change
the value of a variable or other observable object (e.g., by printing output).
However, side-effects are treated as necessary aberrations on the basic theme.
In contrast, languages like Pascal or C use statements with side-effects as a
matter of course. For example, a Pascal assignment like a := b+c has a side-effect,
since the value of variable a is changed after the assignment is executed.
In contrast, when ML evaluates an expression like b+c, it typically creates an
entirely new element with which to associate the result.
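The difference from assignment can be seen in a short sketch. The identifiers a, b, c, getA, and old are chosen only for illustration. Declaring a a second time does not overwrite the first a; it creates a new binding, and code that referred to the earlier binding still sees the earlier value.

```sml
val b = 3;
val c = 4;

(* Evaluating b + c yields a new value, 7, which is bound to a. *)
val a = b + c;

(* getA refers to the binding of a that exists at this point. *)
fun getA () = a;

(* This is a new binding for the name a, not an assignment to the old one. *)
val a = 100;

(* getA still sees the binding it was defined with. *)
val old = getA ();
```

Here old is 7, while the most recent binding of a is 100.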
Higher-Order Functions
ML supports higher-order functions — functions that take functions as argu-
ments — routinely and with great generality. In comparison, languages like
Pascal or C support functions as arguments only in limited ways.
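A minimal sketch of a higher-order function follows. The function applyTwice is a hypothetical example written for this purpose; map and size are functions available in the ML97 top-level environment.

```sml
(* applyTwice takes a function f and a value x, and applies f two times. *)
fun applyTwice f x = f (f x);

(* Passing an anonymous squaring function: (2*2)*(2*2) = 16. *)
val sixteen = applyTwice (fn n => n * n) 2;

(* map takes a function and applies it to every element of a list;
   size gives the length of a string. *)
val lengths = map size ["a", "bc", "def"];
```

The same applyTwice works for any function whose argument and result types agree, whether it operates on integers, strings, or lists.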
Polymorphism
ML supports polymorphism, which is the ability of a function to take arguments
of various types. For example, in Pascal or C we may have to create different
types with similar properties, such as “stack of integers,” “stack of reals,” “stack
of pairs of integers,” and so on. We would then have to define operations like
“push” and “pop” for each different type of stack. In ML, we can define one
notion of a stack, one push function, and one pop function, each of which works
no matter what type of elements our stacks have.
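One way to sketch such a stack is to represent it as an ML list. This representation, and the names push, pop, and EmptyStack, are assumptions made for the example rather than definitions from the text.

```sml
exception EmptyStack;

(* push and pop never inspect the elements, so ML gives them
   polymorphic types: push : 'a * 'a list -> 'a list, and so on. *)
fun push (x, s) = x :: s;

fun pop nil = raise EmptyStack
  | pop (x :: s) = (x, s);

(* The same two functions serve a stack of integers... *)
val ints = push (2, push (1, []));
val (topInt, _) = pop ints;

(* ...and a stack of strings. *)
val strs = push ("b", push ("a", []));
val (topStr, _) = pop strs;
```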
Abstract Data Types
ML supports abstract data types through:
1. An elegant type system,
2. The ability to construct new types, and
3. Constructs that restrict access to objects of a given type so all access is
through a fixed set of operations defined for that type.
An example is a type like “stack,” for which we might define the push and pop
operations and a few other operations as the only way the contents of a stack
could be read or modified.
These abstract data types, called structures, offer the power of “classes”
used in object-oriented programming languages like C++, Java, or Smalltalk.
They are considered very important for such programming goals as modularity,
encapsulation of concepts, and reuse of software. However, the ML notion of
a structure also includes and generalizes several other important ideas, such as
the libraries of functions provided in many languages and “friend” classes in
C++.
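The following is a rough sketch of how a stack might be packaged so that its contents can be reached only through a fixed set of operations. The names STACK and Stack and the choice of a list representation are assumptions for illustration. The sketch uses an ML97 opaque signature constraint (:>) so that code outside the structure cannot tell that a stack is really a list.

```sml
(* The signature lists the only operations clients may use. *)
signature STACK = sig
  type 'a stack
  exception Empty
  val empty : 'a stack
  val push : 'a * 'a stack -> 'a stack
  val pop : 'a stack -> 'a * 'a stack
  val isEmpty : 'a stack -> bool
end;

(* The structure supplies an implementation; ":>" hides the
   fact that stacks are represented by lists. *)
structure Stack :> STACK = struct
  type 'a stack = 'a list
  exception Empty
  val empty = []
  fun push (x, s) = x :: s
  fun pop [] = raise Empty
    | pop (x :: s) = (x, s)
  fun isEmpty s = null s
end;

val s = Stack.push (2, Stack.push (1, Stack.empty));
val (top, rest) = Stack.pop s;
```

With this constraint in place, an expression such as hd s is rejected by the type checker, because outside the structure the type int Stack.stack is not known to be a list.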
Recursion
ML strongly encourages recursion in preference to iterators like for-loops or
while-loops that are used commonly in Pascal or C. Recursion generally pro-
vides a cleaner expression for computational ideas, especially when coupled
with ML’s functional programming style. We shall learn a natural, recursive
style of programming similar to that used in Lisp or Scheme. However, iterative
constructs are available in ML for the times when that style is most appropriate.
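To make the contrast concrete, here is a sketch of the same computation written both ways. The function names sumList and sumListIter are hypothetical. The iterative version relies on references and a while-loop, features the book takes up much later.

```sml
(* Recursive style: the sum of an empty list is 0; otherwise add
   the head to the sum of the tail. *)
fun sumList nil = 0
  | sumList (x :: xs) = x + sumList xs;

(* Iterative style: two reference cells are updated inside a while-loop. *)
fun sumListIter xs =
  let
    val total = ref 0
    val rest = ref xs
  in
    while not (null (!rest)) do
      (total := !total + hd (!rest);
       rest := tl (!rest));
    !total
  end;

val ten = sumList [1, 2, 3, 4];
val alsoTen = sumListIter [1, 2, 3, 4];
```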
Rule-Based Programming
There is in ML an easy way to do rule-based programming, where actions are
based on if-then rules. The core idea is a pattern-action construct, where a value
is compared with several patterns in turn. The first pattern to match causes
an associated action to be executed. In this way, ML has much of the power
of Prolog and other languages that are thought of as “artificial intelligence
languages.”
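A minimal sketch of the pattern-action idea is shown below; the function describe is invented for this example. The argument is compared with each pattern from top to bottom, and the expression attached to the first matching pattern is evaluated.

```sml
(* Patterns are tried in order; the variable n matches any integer,
   so it acts as the catch-all rule. *)
fun describe 0 = "zero"
  | describe 1 = "one"
  | describe n = if n < 0 then "negative" else "many";

val d0 = describe 0;
val d5 = describe 5;
val dNeg = describe ~3;   (* ~ is ML's unary minus *)
```

Because the first match wins, the order of the rules matters: had the catch-all pattern n been listed first, the rules for 0 and 1 would never be reached.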
Strong Typing
ML is a strongly typed language, meaning that all values and variables have a
type that can be determined at “compile time” (i.e., by examining the program
but not running it). A value of one type cannot be given to a variable of
another type. For example, the integer value 4 cannot be the value of a real-valued
variable, even though the real 4.0 could be the value of that variable.
Many other languages allow confusion of types. For example, C allows a value
to change its type arbitrarily, through the “cast” mechanism, while Lisp and
Prolog do not try to constrain types in general.
Strong typing is a valuable debugging aid, since it allows many errors to
be caught by the compiler, rather than resulting in mysterious errors when the
program is run. Interestingly, although most other strongly typed languages
require a declaration of the type of every variable, ML tries hard to figure out
the unique type that each variable may have, and only expects a declaration
for a variable when it is impossible for ML to deduce its type.
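The points above can be sketched in a few declarations. The names x, y, square, and squareReal are illustrative only; real is the top-level function that converts an integer to a real.

```sml
val x : real = 4.0;

(* The next line, if uncommented, is rejected before the program runs,
   because 4 is an integer and not a real:
   val y : real = 4;
*)

(* An explicit conversion is required instead. *)
val y : real = real 4;

(* No declaration is needed here; with nothing else to go on,
   ML takes the arithmetic to be on integers. *)
fun square x = x * x;

(* A type annotation is needed only to ask for something else. *)
fun squareReal (x : real) = x * x;

val nine = square 3;
val sixteen = squareReal y;
```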
ML is not the only language to possess these features. For example, Lisp is
principally functional, supports higher-order functions, and promotes the use
of recursion. Prolog also promotes recursion and supports rule-based program-
ming naturally. Smalltalk and C++ offer powerful abstract-data-type facilities,
and so on. However, the combination of features found in ML offers the user a
great deal of programming ease. At the same time, ML allows one to use a full
palette of modern programming language concepts.
1.2 Standard ML of New Jersey
In this book, we shall assume that the SML/NJ, or “Standard ML of New
Jersey,” implementation of ML is used. SML/NJ was implemented by David
MacQueen of Lucent Bell Laboratories, Andrew Appel of Princeton University,
and their colleagues. It is available for most UNIX workstations, for PC’s
running LINUX, and in an experimental version (at the time of this book's
writing) for PC’s running Microsoft Windows. There is a Web site at Bell
Laboratories from which software and documentation may be downloaded; see
the references in Section 1.4. SML/NJ version 109.30, upon which this book is
based, has been implemented to conform to the recent ML97 standard.
Interactive Mode
To run SML/NJ in interactive mode, in response to the UNIX prompt type
sml
SML/NJ will respond with:
Standard ML of New Jersey ...
-
Here, as throughout this book, we shall use italic font to indicate ML’s re-
sponses, while text typed by the user will be in the “teletype” font, as sml
above.
The dash on the second line is ML’s prompt. The prompt invites us to type
an expression, and ML will respond with the value of that expression. We can
make definitions and enter expressions indefinitely, and SML/NJ will respond
to each with the resulting value.
To terminate an SML/NJ session, type <control>d.
Direct Program Execution
It is also possible to get SML/NJ to execute a program in a conventional way.
For example, if your ML program is in file foo, give the sml command with
that file as standard input:
sml < foo
Another option is to issue the sml command to UNIX, which gets us started
in interactive mode. Then, in response to the prompt, read and execute a file
foo that contains an ML program. We do so by typing to ML the expression
use "foo";
Any quoted UNIX path name can appear in place of "foo". This mode is handy
when we are debugging a program and want to read in its definitions and then
try them in interactive mode.
There is a third way to get an ML program to run using SML/NJ. One
can compile an ML program source file into a file that includes the SML/NJ
runtime system and thus can be executed directly without invoking command
sml. This mode of operation is discussed in Section 9.5.
What ML Gives You
When we invoke ML, we are given access to several resources. In ML, all
available capabilities — operators like + for addition, functions like sin, and
some very complex operators not present in other languages — are organized
into structures, such as Int and Real, as suggested in Fig. 1.1. The entire
collection of capabilities is called the standard basis.
[Figure 1.1: Organization of resources in ML (diagram showing the top-level environment above the structures Int, Real, String, ...)]
A structure in ML is akin to a library in most other languages. For example,
the structure called Int contains many functions useful for dealing with inte-
gers, such as the arithmetic and comparison operators, but also includes some
less typical operators. Thus, ML selects the most important operators from the
various structures and puts them in the top-level environment. These capabil-
ities are available when we invoke ML. The additional capabilities, those that
are found in the various structures but that are not part of the top-level environment,
are also accessible if we make a small amount of additional effort. We
shall describe how to access those capabilities not in the top-level environment
starting in Section 4.1.2.
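As a brief preview, and assuming an ML97 implementation with the standard basis, a capability inside a structure is named by writing the structure name, a dot, and then the name of the function. The identifiers on the left of each declaration are chosen only for this sketch.

```sml
(* + is in the top-level environment, so no structure name is needed. *)
val sum = 3 + 4;

(* These functions live in structures and are reached by qualified names. *)
val s = Int.toString 42;     (* the string "42" *)
val m = Int.max (3, 7);      (* the larger of two integers *)
val r = Real.fromInt 3;      (* the real 3.0 *)
```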
1.3 Prerequisites for the Reader
We assume the reader is familiar with programming in some conventional language
such as Pascal or C. Occasionally, as a matter of interest, we shall
compare ML constructs with those of Pascal or C, but familiarity with one or
both of these languages is not essential.
It is also assumed the reader is familiar with the process of writing and
debugging programs in a conventional language. We expect that the reader
has written at least a few recursive programs and has some comfort with that
style of programming. However, our first recursive examples will be covered
in sufficient detail that the style may be learned here. In addition, we assume
the reader is familiar with simple data structures and data structure concepts
such as records, pointers, lists, and trees. The author immodestly recommends
Foundations of Computer Science: C Edition by A. V. Aho and J. D. Ullman,
Computer Science Press, New York, 1995 for the reader who desires further
background on these subjects.
1.4 References and Web Resources
The original definition of Standard ML is from [3], which evolved into the book
[5]. An elaboration of this work is [4]. This version of ML is now given the
retronym ML90.
The recent revision, called ML97, is described in the book [6]. An important
part of the definition of ML97 is the standard basis, which is obtainable on-line
in [1].
The original paper on the Standard ML of New Jersey implementation assumed
in this book is [2]. There is an extensive resource library available on-line.
The root URL is:
http://cm.bell-labs.com/cm/cs/what/smlnj
A useful on-line document is
http://cm.bell-labs.com/cm/cs/what/smlnj/top-level-comparison.html
which explains the difference between earlier versions of SML/NJ and the cur-
rent, ML97-based versions starting with Version 109.24.
To obtain software to run SML/NJ on various hosts, start at
http://cm.bell-labs.com/cm/cs/what/smlnj/software.html
An important source for non-UNIX implementations of ML is
http://www.dina.kvl.dk/~sestoft/mosml.html
which is the Keldysh Institute of Applied Mathematics in Moscow. Their im-
plementation runs on PC’s and MAC's, as well as workstations, and largely
conforms to ML97 at the time this book was written.
1. Appel, A. W., N. Barnes, D. Berry, E. R. Gansner, L. George, L. Huelsbergen,
D. MacQueen, B. Monahan, C. Miller, J. H. Reppy, J. Thackray,
and P. Sestoft, The Standard ML Basis Library. Its URL is:
http://cm.bell-labs.com/cm/cs/what/smlnj/sml97.html
2. Appel, A. W. and D. B. MacQueen, “Standard ML of New Jersey,” International
Symposium on Programming Languages, Implementation, and
Logic, pp. 1-13, Springer-Verlag, 1991 is a technical article describing the
SML/NJ system.
3. Harper, R. M., D. B. MacQueen, and R. Milner, “Standard ML,” ECS-LFCS-86-2,
Laboratory for Foundations of Computer Science, Edinburgh
University, Dept. of CS, 1986.
4. Milner, R. and M. Tofte, Commentary on Standard ML, MIT Press, Cambridge
MA, 1991.
5. Milner, R., M. Tofte, and R. M. Harper, The Definition of Standard ML,
MIT Press, Cambridge MA, 1990.
6. Milner, R., M. Tofte, R. M. Harper, and D. B. MacQueen, The Definition
of Standard ML (Revised), MIT Press, Cambridge, MA, 1997.
1.5 Features of ML97
If you are familiar with the earlier version of ML called ML90, then you will
notice certain differences between ML90 and the version ML97 covered in this
book. If you are not familiar with ML90, then skip this section. The complete
list of changes is found in the ML97 source book, reference [6] above. However,
for the reader with ML experience, the following is an incomplete list of the
changes that are most likely to affect your programming.
1. There is a cleaner organization to features that are available in the top-level
basis and features that are available through a library structure.
2. Certain values with unknown type, such as nil, are no longer legal expressions.
However, polymorphic functions remain a feature of ML.
3. Input/output operators are now defined as part of the standard, rather
than being implementation-dependent.
4. Characters are now a separate type, different from strings of length 1.
5. Reals are no longer an equality type; i.e., you cannot test r = s for reals
r and s.
6. There is an unsigned integer type called word. Both words and ordinary
integers may be represented in hexadecimal, if we wish.
7. A datatype called option is provided to represent elements that are optionally
missing.
8. Requirements to specify types are reduced because there is a default type
(integer) for overloaded operators such as + or <.
Chapter 2
Getting Started in ML
In this chapter we shall introduce the reader to the simplest form of program-
ming in ML, where one types expressions to the ML system and receives back
values for these expressions. We shall learn how to construct expressions using
atomic types such as integers and strings. We shall also discuss expressions in-
volving lists and a simple form of record structure called tuples; lists and tuples
are both basic ML constructs. The reader will also see an important difference
between ML and most other languages: the ML rules regarding types of ex-
pressions allow the ML compiler to check at compile time for type errors that
in other languages can lead to mysterious run-time bugs.
2.1 Expressions
When we are in interactive mode, the simplest thing we can do is type an
expression in response to the ML prompt (-). ML will respond with the value
and its type.
Example 2.1: Here is an example of an expression that we may type and the
ML response.
1+2*3;
val it = 7 : int
Recall from Section 1.2 our convention that we use “teletype” font for things
we type and italic font for the response of the ML system. Here, we have typed
the expression 1 + 2 * 3, and ML responds that the value of variable it is 7,
and that the type of this value is integer. The variable it plays a special role
in ML. It receives the value of any expression that we type in interactive mode.
□
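The behavior of it can be sketched in a few lines. This is only an illustration, assuming an interactive session (or a file read with use); the names seven and greeting are ours and are not part of ML.

```sml
(* At top level, typing "exp;" behaves like "val it = exp;". *)
1 + 2 * 3;
val seven = it;        (* it currently holds the integer 7 *)

"hello";
val greeting = it;     (* it has been rebound, now to a string *)
```

Each new expression replaces the old value of it, so we copy it into another variable when we want to keep the value.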
Two useful points to observe from Example 2.1 are:
• An expression must be followed by a semicolon to tell the ML system that
the instruction is finished. If ML expects more input when a <return>
is typed, it will respond with the prompt = instead of -. The = sign is a
warning that we have not finished our input expression.
• The response of ML to an expression is:
1. The word val standing for “value,”
2. The variable name it, which stands for the previous expression,
3. An equal sign,
4. The value of the expression (7 in this example),
5. A colon, which in ML is the symbol that associates a value with its
type, and
6. An expression that denotes the type of the value. In our example,
the value of the expression is an integer, so the type int follows the
colon.
2.1.1 Constants
As in any other language, expressions in ML are composed of operators and
operands, and operands may be either variables or constants. At this point,
we have not yet discussed the way values may be assigned to variables, so it
does not make sense to use variables in expressions. However, syntactically,
variables present no surprises. You may think of Pascal identifiers (letters
followed by letters or digits) or the identifiers in your favorite language as names
for ML variables, although as we shall see in Section 2.3.1, ML identifiers differ
somewhat from identifiers in these languages.
ML provides as part of its top-level environment (see Section 1.2) a number
of types that are similar to those found in most languages. There is also a way
to get from the system some additional types. In this preliminary discussion,
we are going to introduce only the most commonly used atomic types and their
allowable values. The complete set of types is discussed in Section 9.3.
Integers
Integers are represented in ML as in other languages, with one exception involv-
ing the minus sign. A positive integer is a string of one or more digits, such as
0, 1284, or 11111111. A negative integer is formed by placing the unary minus
sign, which is the tilde (~), not a dash, in front of the digits, such as ~1234.
Integers may also be represented in hexadecimal notation, where the
characters 0x or 0X are followed by a string of hexadecimal digits. Recall that the
hexadecimal digits are 0 through 9 and A through F, with the letters standing for
“digits” with values 10 through 15 (in decimal), respectively. The hexadecimal
digits that are letters may be written in either upper or lower case.
Example 2.2: Here are the responses of the ML system to some expressions
that are hexadecimal integers.
0x1234;
val it = 4660 : int
Here, 1234 in hexadecimal, whose decimal value is
1 × 4096 + 2 × 256 + 3 × 16 + 4 = 4660
is converted to decimal in ML’s response. Notice that ML gives you a deci-
mal representation, regardless of whether you write the integer in decimal or
hexadecimal.
~0xaA;
val it = ~170 : int
Here, we notice that either a or A stands for the hexadecimal digit “10,” and
upper and lower case can be mixed. We also see that the negation symbol ~
may be used in hexadecimal integers. □
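The integer notations above can be checked with a short sketch. The variable names are ours, chosen only for illustration.

```sml
val a = 0x1234;     (* hexadecimal constant; same integer as 4660 *)
val b = ~0xaA;      (* ~ negates; upper and lower case digits may be mixed *)
val c = ~1234;      (* negative decimal constant uses the tilde, not a dash *)

(* Both notations denote ordinary values of type int. *)
val sameValue = (a = 4660) andalso (b = ~170);
```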
Reals
Reals are also represented conventionally, with the exception that minus signs
within reals are represented by ~. An ML constant of type real thus consists
of
1. An optional ~,
2. A string of one or more digits, and
3. One or both of the following elements:
(a) A decimal point and one or more digits.
(b) The letter E or e, an optional ~, and one or more digits.
As in other languages, the value of a real number is determined by taking the
number that appears before the E or e and multiplying it by 10 raised to the
power that is the integer that follows.
Example 2.3: Here are some examples of real numbers:
1. ~123.0 is the negative real that happens to have an integer value ~123.
2. 3E~3 has value .003.
3. 3.14e12 has value 3.14 × 10^12.
□
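As a sketch, the three constants of Example 2.3 can be bound to variables and examined. The names are ours. Because reals cannot be tested with =, we use a pair of <= comparisons instead.

```sml
val r1 = ~123.0;     (* optional ~, digits, decimal point and digits *)
val r2 = 3E~3;       (* exponent form with a negative exponent *)
val r3 = 3.14e12;    (* decimal point and exponent together *)

(* r2 and 0.003 denote the same real, tested without using = *)
val r2IsThreeThousandths = r2 <= 0.003 andalso 0.003 <= r2;
val r3IsLarge = r3 > 1.0E12;
```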
Booleans
There are two boolean values: true and false. ML is case-sensitive (unlike
some other languages such as Pascal or SQL that also use true and false as
boolean constants), so these constants must be written in lower case, never as
TRUE, False, or any other combination involving capitals.
Example 2.4: Here is what happens when we type a boolean value.
true;
val it = true : bool
Notice that the type of booleans is bool in the ML response. O
Strings
Values of type string are double-quoted character strings like "foo" or "R2D2".
Certain special characters are represented by sequences of characters, as in
the language C, where the backslash (\) serves as an escape character. The
principal ways to represent characters that cannot be typed on the keyboard,
or characters with a special meaning that would confuse the interpretation of
strings, are:
1. The two-character sequence \n is used for the “newline” character.
2. \t is used for the tab character.
3. \\ is used for the backslash character.
4. \" stands for the double-quote character, which otherwise would be
interpreted as the string ender.
5. A backslash followed by three decimal digits stands for the character
whose ASCII code is the number represented by those three digits, in
base 10. This convention allows us to type characters for which there is
no key on the keyboard. For example \007 is the “character” that rings
the bell on the console.
6. Those characters that are control characters can also be written by the
three-character sequence consisting of a backslash, the caret or uparrow
symbol ^, and a character whose ASCII code is in the range 64-95 (decimal),
i.e., the capital letters and the five characters [\]^_. The actual
character represented is determined by subtracting 64 from the ASCII
code for the character typed. For example, \^G stands for control-G
and is the same bell-ringing character that is represented by \007.
There are certain other escape sequences that are less commonly used or that
may not be supported by a given ML implementation. See the box on “Other
Character Codes.”
Example 2.5: The string "A\tB\tC\n1\t2\t3\n" is printed as
A       B       C
1       2       3
Here we see uses of the tab sequence \t and the newline sequence \n.
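The escape sequences can be explored with the following sketch. It assumes the built-in function size, which returns the length of a string and has not been introduced in the text so far; the variable names are ours.

```sml
val table = "A\tB\tC\n1\t2\t3\n";
val n = size table;          (* each escape sequence counts as one character *)

(* Three ways of writing the bell character, ASCII code 7 *)
val bell1 = "\007";
val bell2 = "\^G";
val bell3 = "\a";
val sameBell = (bell1 = bell2) andalso (bell2 = bell3);

(* A string containing two double-quotes and one backslash *)
val quoted = "say \"hi\"\\";
```

Strings, unlike reals, may be compared with =, which is why sameBell is a legal expression.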
If a string is too long to be written conveniently on a single line, we may
continue it over several lines. We make all but the last line end with a backslash,
and all but the first line begin with a backslash.
Example 2.6: The text of item 4 above could be written as a string extending
over three lines as follows:
"\\\" stands for the double-quote character, \
\which otherwise would be interpreted \
\as the string ender."
In the first line, the first quote is not part of the string but indicates that
a string follows. The first two backslashes represent the character \. The
third backslash and the quote represent the character " (the second character
of item 4 above). The backslash at the end of the first line indicates that the
string continues on the next line. Note that the space after the comma is shown
explicitly on the first line. If that space were missing, the represented string
would look like ...character,which....
In general,
* Any sequence of characters beginning and ending with the backslash and
containing between the backslashes only “whitespace” characters such as
blank, tab, and newline, is ignored in interpretation of strings.
In Example 2.6, we used sequences consisting of a backslash, a newline, and
another backslash to make a string break over several lines without the newlines
becoming part of the string.
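A minimal sketch of the continuation rule follows; the names long and short are ours. The backslash, newline, leading blanks, and backslash between the pieces are all ignored.

```sml
val long = "one, \
           \two, \
           \three";
val short = "one, two, three";

(* The two constants denote the same string. *)
val same = (long = short);
```

Note that the blank after each comma must appear before the closing backslash of its line, just as in Example 2.6.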
Characters
As in C, there is a distinction between a character string of length one and a
single character. ML provides a type char for characters. The representation
of character values in ML is somewhat unusual: the character # followed by a
character string of length one. That is, #"2" represents the character 2.
Example 2.7: Character a is represented by #"a". The tab character is
represented by #"\t". □
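The distinction between a character and a string of length one can be sketched as follows. We assume the built-in functions ord (character to ASCII code) and str (character to one-character string), which the text has not yet introduced; the variable names are ours.

```sml
val a = #"a";             (* type char *)
val s = "a";              (* type string, of length one *)
val tab = #"\t";          (* escape sequences work inside #"..." too *)

val codeOfA = ord a;      (* the ASCII code of the character *)
val asString = str a;     (* the character turned into a string *)
val same = (asString = s);
```

Writing a = s directly would be a type error, since one operand is a char and the other a string.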
Other Character Codes
ML also provides the following escape sequences: \a, \b, \v, \f, and \r
for the ASCII characters 7, 8, 11, 12, and 13, which are the bell-ringing
character, backspace, vertical tab, form feed, and carriage return, respec-
tively. In addition, the ML97 standard permits, but does not require, that
an implementation support an extended ASCII character set of up to 16
bits, such as the 16-bit character code used in the language Java. If an im-
plementation supports such an extended set (SML/NJ version 109.30 does
not), then one can represent such characters by the sequence \u followed
by four hexadecimal digits.
2.1.2 Arithmetic Operators
The arithmetic operators of ML are similar to those of Pascal or C. There are:
1. The low-precedence “additive” operators: +, -.
2. The high-precedence “multiplicative” operators: *, / (division of reals),
div (division of integers, rounding down toward minus infinity), and mod
(the remainder of integer division).
3. The highest precedence unary minus operator, ~.
However, note the following.
• A unary minus sign is always denoted by a tilde (~), never by a dash.
Thus, we write ~3+4 and 3-4, but never -3+4 or 3~4.
• ML is case-sensitive, so the operators mod and div must be written in
lower case.
• Associativity and precedence are like those of Pascal or C; higher precedence
operators are grouped with their operands first, and among operators of
equal precedence, grouping proceeds from the left. Grouping order can
be altered by parentheses in the usual manner.
Example 2.8: Here are some expressions and their responses from the ML
interpreter.
3.0 - 4.5 + 6.7;
val it = 5.2 : real
Note that grouping of equal precedence operators is from the left. This expression
is interpreted as (3.0 - 4.5) + 6.7, not 3.0 - (4.5 + 6.7), which has value
-8.2.
43 div (8 mod 3) * 5;
val it = 105 : int
All three operators div, mod, and * are of the same precedence, but the
parentheses force us to use the mod first, then group from the left. Since mod
calls for the remainder when its left argument is divided by the right, the value
of 8 mod 3 is 2. We thus evaluate (43 div 2)*5, or 105. □
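The grouping and rounding rules can be sketched as follows. The variable names are ours, and the examples with negative operands go slightly beyond Example 2.8 to illustrate rounding toward minus infinity.

```sml
val leftToRight = 3.0 - 4.5 + 6.7;     (* grouped as (3.0 - 4.5) + 6.7 *)
val rightFirst = 3.0 - (4.5 + 6.7);    (* parentheses change the grouping *)

val ex = 43 div (8 mod 3) * 5;         (* (43 div 2) * 5 *)

val down = ~7 div 2;                   (* ~4, not ~3: rounds toward minus infinity *)
val remainder = ~7 mod 2;              (* 1, since ~7 = 2 * ~4 + 1 *)
val unary = ~3 + 4;                    (* unary minus applies to 3 first *)
```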
2.1.3 String Operators
We may not apply the arithmetic operators to string operands. There is, however,
one operator that applies to strings and only to strings. The operator ^
stands for concatenation of strings; it has the precedence of an additive operator.
When we concatenate two strings s1 and s2, we get the string s1s2. That
is, the resulting string is a copy of string s1 followed by a copy of s2.
Example 2.9: Here are some examples of string concatenation.
"house" ^ "cat";
val it = "housecat" : string
"linoleum" ^ "";
val it = "linoleum" : string
Notice in the second example that "" represents the empty string, the string
with no characters. When we concatenate the empty string with any other
string, either on the left or right of the ^ operator, we get the other string as a
result. □
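A brief sketch of concatenation, with variable names of our own choosing:

```sml
val w = "house" ^ "cat";
val e1 = "" ^ "linoleum";       (* empty string on the left *)
val e2 = "linoleum" ^ "";       (* empty string on the right *)
val chained = "a" ^ "b" ^ "c";  (* groups from the left, like + and - *)
```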
2.1.4 Comparison Operators
The six comparison operators that we find in Pascal are also part of the ML
repertoire. These are =, <, >, <=, >=, and <>, representing, respectively, the
comparisons =, <, >, ≤, ≥, and ≠. They can be used to compare integers,
reals, characters, or strings, with one exception:
• Reals may not be compared using = or <>. The other four comparisons of
reals, such as <, are permitted, however.
In the case of characters, c1 < c2 means “lexicographically precedes”; that
is, the character code for c1 is less than the character code for c2. Similarly, <=
means “equals or lexicographically precedes,” and so on.
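The comparisons described so far can be sketched as follows; the names are ours, and each of the five values is true.

```sml
val i = 3 <> 4;             (* integers may use all six comparisons *)
val r = 1.5 < 2.0;          (* < is allowed on reals; = and <> are not *)
val c1 = #"a" < #"b";       (* code 97 precedes code 98 *)
val c2 = #"Z" < #"a";       (* in ASCII, capitals precede lower-case letters *)
val c3 = #"a" <= #"a";      (* equals or precedes *)
```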
For strings, < is lexicographic order, just as < in Pascal or strcmp in C.
That is, if s1 and s2 are strings, then s1 < s2 if either
1. s1 is a proper prefix of s2, or
Why Can’t We Test Reals for Equality?
The policy that forbids testing r = s in ML, when r and s are real quantities,
is motivated by the fact that all machines perform real arithmetic only
approximately. Thus, in some circumstances, two real-valued expressions
that are theoretically equal could turn out, because of rounding error, to
be unequal in the machine. If you definitely want to test whether r = s,
you can test both r <= s and s <= r.

1<2 orelse 3>4;
val it = true : bool
ML does not evaluate the second condition (3 > 4), since the first being true
is sufficient to guarantee that the whole expression is true. Remember that the
result of a comparison is a boolean, so it makes sense to connect two comparisons
by a logical operation such as orelse.
In the following expression:
1<2 andalso 3>4;
val it = false : bool
it is necessary to evaluate both conditions. Had the first condition been false,
then there would have been no need to check the second, because the whole
expression could only be false.
Because the symbol not has such high precedence, we must be careful to
group its argument properly. Here is an example.
Example 2.12: The expression not 1<2 is grouped as (not 1)<2, which
makes no sense and is a type error in ML. We would have to write not (1<2),
although the simpler expression 1>=2 would do as well.
Incidentally, one might wonder why it matters whether or not the second
operand of a logical operation is evaluated, if the result of the entire expression
cannot depend on that operand. The reason is that in some special cases,
an ML expression can have a side-effect, which is an action whose effect does
not disappear after the expression is evaluated. The most common example of
a side-effect is when something inside an expression causes information to be
printed or read. We have not yet seen any ML operator that has a side-effect,
and indeed it is in the ML style to avoid side-effects normally. However, side-
effects are possible, as we shall see in Section 4.1 and elsewhere. When they
occur, it is essential that we understand the conditions under which part of an
expression will not be evaluated and its side-effects consequently not performed.
Remember to use andalso and orelse, never and and or, for the logical
operations. There is no special meaning for or in ML, but and has another
meaning entirely, having nothing to do with logical operations.
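One way to observe that the second operand is skipped, without using any printing operator, is to make the second operand an expression that could not be evaluated safely. The sketch below assumes that integer division by zero raises an exception (called Div in the standard basis), a feature not yet discussed in the text; the variable names are ours.

```sml
(* 1 div 0 would raise the exception Div if it were ever evaluated. *)
val a = (1 < 2) orelse (1 div 0 = 0);     (* first operand true: second skipped *)
val b = (1 > 2) andalso (1 div 0 = 0);    (* first operand false: second skipped *)

(* not needs parentheses around a comparison *)
val c = not (1 < 2);
```

Neither of the first two declarations raises an exception, because the division is never performed.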
2.1.6 If-Then-Else Expressions
ML lets us use conditional expressions of the form if E then F else G. We
compute the value of this expression by first evaluating expression E, which
must have a boolean value. If that value is true, then we evaluate expression
F (and never evaluate G); the value of F becomes the value of the entire if-
then-else expression. If the value of E is false, then we evaluate only G, which
becomes the value of the entire expression.
Example 2.13: Consider the following conditional expression:
if 1<2 then 3+4 else 5+6;
val it = 7 : int
We begin by evaluating the expression between the if and then. In this case,
the expression 1 < 2 evaluates to true. Thus, we evaluate the second expres-
sion, 3+4. The result, 7, is the value of the entire expression. We do not
evaluate the expression 5+ 6, and if in its place there were an expression with
side-effects, those side-effects would not be executed.
Here are a few important points about conditional expressions.
• The conditional, or if-then-else operation, is one of the rare operations
that takes more than two operands. There is, however, a similar three-operand
(ternary) operator in C, using the characters ? and : in place
of then and else (nothing in place of if).
• If-then-else forms an expression. It is not a control-flow construct that
groups statements together, as we find in most languages.
• There is no if ... then construct in ML. Such an expression does not
have a value when the condition is false. This point emphasizes the difference
between if-then-else as an expression form and as a control-flow
construct. There is no harm in having a control-flow construct if-then,
since it simply executes no statements if the condition is false. However,
an if-then expression might return no value at all and thus could not be
used inside larger expressions.
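Because if-then-else is an expression, it can appear wherever a value is expected. The following sketch, with names of our own choosing, uses conditionals as operands of * and ^.

```sml
val v = if 1 < 2 then 3 + 4 else 5 + 6;

(* A conditional used as an operand of a larger arithmetic expression *)
val nested = 10 * (if v > 5 then 1 else 0) + 2;

(* Both branches must have the same type, here string *)
val label = "v is " ^ (if v mod 2 = 0 then "even" else "odd");
```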
Case Sensitivity in ML
ML is case-sensitive, and operators whose names are composed of letters
are written with lower-case letters only. For example, we must be careful
to write not, andalso, if, mod, and so on.
There might appear to be an exception concerning letters used in the
expression of certain constants. For instance, we saw that either E or e
may be used in real constants, and hexadecimal integers can be introduced
with either 0x or 0X. In fact, the hexadecimal digits themselves can be
written in either upper or lower case. However, this phenomenon is not an
exception to case-sensitivity. The ML standard simply allows alternative
forms of expression of certain constants.
2.1.7 Exercises for Section 2.1
Exercise 2.1.1: What is the response of ML to the following expressions?
*a) 1+2*3
b) 5.0-4.2/1.4
*c) 11 div 2 mod 3
d) "foo"^"bar"^""
*e) 3>4 orelse 5<6 andalso not (7<>8)
f) if 6<10 then 6.0 else 10.0
*g) 0XAB+123
h) 0xab<123
Exercise 2.1.2: The following ML “expressions” have errors in them. Explain
what is wrong with each.
*a) 8/4
b) if 2<3 then 4
*c) 1<2 and 5>3
d) 6+7 DIV 2
*e) 4.+3.5
f) 1.0<2.0 or 3>4
* g) Baan
h) 123.
*i) 1.0 = 2.0
Exercise 2.1.3: Write a string that when printed creates the displayed text on
lines (3)~(5) of Example 2.6. You may assume that the indentation of the lines
is made by a single tab character. Your string should be written over several
lines so there are no more than 80 characters appearing on any one line.
Exercise 2.1.4: Express:
*a) E orelse F
b) E andalso F
as if-then-else expressions. Incidentally, in ML, expressions formed with the
symbols orelse and andalso are actually shorthands for these if-then-else ex-
pressions.
2.2 Type Consistency
Having seen some of the important building blocks of expressions, we must now
learn what can go wrong when we use expressions built from these operators.
ML assigns a unique type to every expression. Operators also have partic-
ular types that they require their operands to have. Certain operators take
operands of one particular type only. Examples are /, which requires operands
of type real, div, which requires operands of type integer, and ^, which requires
operands of type string. Others, like + or *, can take arguments of different
types, e.g., two integers or two reals. As we shall see shortly, it is not possible
to mix operands of integer and real types, as it is in C or most other languages.
However, ML provides certain operators for converting from a value of one type
to an “equivalent” value of another type. We shall also learn a number of these
“coercion” operators in this section.
Let us again remind the reader that there is a purpose to this seeming inflex-
ibility on the part of ML. It enables the ML compiler to type-check programs
completely. Thus, no program that can run at all can have a type error. The
advantage to the programmer is that what could be a run-time bug in another
language’s program is caught by the ML compiler.
2.2.1 Type Errors
As we saw in Example 2.1, when an operator is given operands of the proper
type, it responds with the result. However, when one or both operands are of
the wrong type, we get an error message. The nature of error messages depends
on the particular implementation. We shall use the responses from SML/NJ
version 109.30 in examples.¹
Example 2.14: The operator + can take either integer or real arguments.
However, both operands must be the same type. When the types of the
operands are the same, ML attributes the same type to the result, for instance:
1 + 2;
val it = 3 : int
1.0 + 2.0;
val it = 3.0 : real
On the other hand, when the operands are of mixed type, we get an error
message, as shown in Fig. 2.1. Let’s see what ML is telling us. The first line of
the response says that the operator expects operands of types other than what
it saw. The second line of the response tells us that the operator + expects
an “operand” whose type is a pair of integers. Although + can apply to either
integers or reals, the fact that the left argument 1 is an integer suggests that
integer addition was meant here.
1 + 2.0;
Error: operator and operand don't agree [literal]
operator domain: int * int
operand: int * real
in expression:
+: overloaded((1 : int),2.0)
Figure 2.1: A type error and its diagnostic message
The * operator in the expression int * int is not multiplication, but rather
an operator that applies to types and produces a product type, that is, the type
of a pair, triple, or so on. In particular, int * int is the type of any pair
of integers, for example, of the pair (1,2). This response makes us aware of
a rather rigid view ML has of operators and operands. Strictly speaking, all
operators in ML are unary, that is, they take a single argument. A binary
(two-argument) operator like + is perceived by ML as taking a single argument
that is a pair. In most situations there is no problem with viewing a binary
operator as if it had two operands, but there are some differences that we shall
address in Section 5.5.
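The view of + as a function of a single pair can be made visible with the keyword op, which the text has not yet introduced; we use it here only as an illustrative sketch, and the variable names are ours.

```sml
val pair = (1, 2);           (* a value of type int * int *)

val viaInfix = 1 + 2;        (* the usual two-operand notation *)
val viaPair = op + pair;     (* op lets + be applied to the pair directly *)
```

Both declarations compute the same sum, which is why the error message of Fig. 2.1 describes the operand of + as a pair.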
The third line of the response tells us what ML saw as the operand of the
operator, namely a pair whose first component (the left operand) is an integer
but whose second component (the right operand) is a real. The final two lines
indicate the expression in which the error occurred. The only additional nuance
is that the operator and operand are shown in the conventional ML prefix form,
with the operator + appearing in front of its operand, which is the pair (1,2.0).
□
¹However, SML/NJ also returns a line and column number locating the point at which it
detected the error. We do not show this response since it is rarely meaningful out of context.
In the fifth line of Fig. 2.1, we note the use of the term “overloaded” in
reference to +. An operator is overloaded if it can apply to two or more different
types, as + can. Notice that the fact + is defined for two integers or two reals
does not mean that it can be applied to one of each. Similar comments apply
to overloaded operators like -, *, <, and the other comparison operators.
If we were to use an operand of the wrong type with an operator that is not
overloaded, we get an error message similar to Fig. 2.1, but without the word
“overloaded.” An example follows.
Example 2.15: The expression
#"a" ^ "bc";
applies the nonoverloaded operator ^, which concatenates strings, to a character
and a string. The error message would look like:
Error: operator and operand don't agree [literal]
operator domain: string * string
operand: char * string
in expression:
^ (#"a","bc")
□
Another type of error involves applying an operator, overloaded or not, to
operands at least one of which has a type inappropriate for the operator.
Example 2.16: The division operator / applies only to reals, as we learned
in Section 2.1.2. Here is what happens when this operator is misused.
1/2;
Error: overloaded variable not defined at type
symbol: /
type: int
Our first observation is that the error message talks about the symbol / and
its application to the type int, which we know is improper. However, what is
the “overloaded variable” in the first line of the error message? ML thinks of /
as a variable. As we shall see in Section 2.3.1, / is a legitimate identifier for a
variable in ML, unlike most languages, where variable identifiers are restricted
to letters and digits plus perhaps a few other symbols. Although we said that
/ applies only to “reals,” an implementation of ML may support several kinds
of reals, such as single- and double-precision numbers. Thus, / might indeed
be defined for several different types. □
Another place where type mismatches may occur through carelessness is in
an if-then-else expression. The rules regarding types for this expression are:
• The expression following if must have boolean type.
• The expressions following then and else can be of any one type, but they
must be of the same type.
Example 2.17: Figure 2.2 shows what happens when the types of the expres-
sions following then and else disagree. Here, one is a character and one is a
string.
if 1<2 then #"a" else "bc";
Error: types of rules don’t agree [tycon mismatch]
earlier rule(s): bool -> char
this rule: bool -> string
in rule:
false => "bc"
Figure 2.2: A mismatch between the then and else parts
Obviously, ML is telling us something about finding a string (i.e., "bc")
when it expected a character to match the character #"a" that followed the
then. But what's this about “rules”? The explanation lies in the fact that the if-
then-else expression is really a shorthand for a more general kind of expression:
the case expression. We shall cover the case expression in Section 5.1.4
For the moment, let us just note that ML’s view of the if-then-else is that
it involves two “rules,” each of which takes a boolean value and produces a
value of some one type. The first of these rules associates the boolean value
true with the character #"a". This rule expresses the principle that if the
condition is true, we use the value of the expression that follows the then.
The second rule associates the boolean value false with the value following
the else, namely "bc" in this case. However, ML expects to find another
character-valued expression following else, which it will then associate with
false in the second rule. ML is unhappy that it has found a string-valued
expression, because ML will not tolerate groups of rules that produce values of
different types. □
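To make the two “rules” concrete, here is a sketch that writes a well-typed conditional both ways. The case syntax is not covered until later in the book, so this is only a preview; the names viaIf and viaCase are ours.

```sml
val viaIf = if 1 < 2 then #"a" else #"b";

(* The same computation written with explicit rules.  Each rule maps a
   boolean to a char, so the two rules agree in type. *)
val viaCase = case 1 < 2 of
                  true => #"a"
                | false => #"b";
```

Replacing #"b" by "bc" in either form produces the type error of Fig. 2.2, since the rules would then disagree.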
• The word “tycon” in the first line of response in Fig. 2.2 is short for “type
constructor,” that is, a way of constructing types from simpler types.
Rules in the sense used in Fig. 2.2 are actually of a function type, mapping
booleans to some other type. We discuss function types in Section 3.1.1.
Applying Functions, ML Style
ML offers us a diction for applying a function or operator to an argument
that may be unfamiliar to some: f x means “apply function f to argument
x,” just as f(x) does in C or most other languages. Since there is no
harm in putting parentheses around an argument, we have used the more
conventional style, writing real(1) instead of the preferred ML style:
real 1. By adhering to the more familiar style, with parentheses, we hope
to focus attention on the more significant issues of ML, without adding
to the “newness” of the language. However, as the book progresses, we
shall gradually shift to the ML style of omitting the parentheses around
the argument of a function whenever appropriate.
2.2.2 Coercion Between Integers and Reals
Sometimes we have a reason to convert (coerce) a value of one type to an
“equivalent” value of another type. Thus ML provides certain built-in functions
that do the conversion for us. Perhaps the clearest case is when we want to
convert an integer to a real with the same value. The function real lets us do
just that.
Example 2.18: Applied to an integer, real produces the equivalent real value
as:
real(4);
val it = 4.0 : real
As another instance, we can fix Example 2.14, where we tried to add an
integer and a real, if we first apply real to the integer.
real(1) + 2.0;
val it = 3.0 : real
shows a correct version of this addition. Of course, there is no point in writing
real(1) instead of 1.0, but if we replaced 1 by an integer-valued variable, we
would have no choice but to convert the variable by applying the operator real
to it. □
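The case of an integer-valued variable can be sketched as follows. The names n, half, and same are hypothetical, chosen only for this illustration.

```sml
val n = 7                    (* an integer-valued variable *)
val half = real(n) / 2.0     (* real(n) is 7.0, so half is 3.5 *)
val same = real 7 + 0.5      (* ML style without parentheses: (real 7) + 0.5 *)
```

Writing n / 2.0 without the call to real would be rejected, since / expects two reals.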
When we try to convert a real to an integer, it is not so clear which integer
we want, since the real may not equal any integer. ML provides four coercion
operators: floor, ceil (ceiling), round, and trunc (truncate). Each produces
the integer with the same value when given a real that happens to be an in-
teger; for instance, 4.0 is converted to 4 by each of these four functions. In
general, given a real number r, floor produces the greatest integer that is no
larger than r, and ceil produces the smallest integer no less than r. Function
round produces the closest integer, with 0.5 raised to the next highest integer,
regardless of whether the real is positive or negative. The trunc function drops
digits to the right of the decimal point.
Example 2.19: Figure 2.3 shows the effect of these four operators on positive
and negative real numbers. We include the special case of a half-integer (3.5
and -3.5), noticing that rounding occurs upward. We also include typical cases
where the rounding is to the closest integer. Notice that floor and trunc do
the same thing on positive numbers, but trunc agrees with ceil on negative
numbers. Remember that -3 is “larger” than -3.5 and -4 is “smaller.” □
   x   | floor(x) | ceil(x) | round(x) | trunc(x)
  3.5  |    3     |    4    |    4     |    3
 -3.5  |   -4     |   -3    |   -3     |   -3
  3.4  |    3     |    4    |    3     |    3
 -3.6  |   -4     |   -3    |   -4     |   -3
Figure 2.3: Effect of real-to-integer coercion operators
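The two non-tie rows of Fig. 2.3 can be checked with a short sketch like the one below; the variable names are illustrative. Recall that ML writes unary minus as ~. The half-integer rows are left out here because the treatment of ties by round can differ between ML systems (the Standard ML Basis Library describes round as rounding ties to the nearest even integer).

```sml
(* Row 3.4 of Fig. 2.3 *)
val f1 = floor(3.4)      (* 3 *)
val c1 = ceil(3.4)       (* 4 *)
val r1 = round(3.4)      (* 3 *)
val t1 = trunc(3.4)      (* 3 *)

(* Row -3.6 of Fig. 2.3 *)
val f2 = floor(~3.6)     (* ~4 *)
val c2 = ceil(~3.6)      (* ~3 *)
val r2 = round(~3.6)     (* ~4 *)
val t2 = trunc(~3.6)     (* ~3 *)
```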
2.2.3 Coercions Between Characters and Integers
We convert from characters to integers, just as in Pascal, using the ord function
(which, however, must be lower case in ML). The result of applying ord to a
character is the integer code for that character. Normally, the character will
be one of the ASCII characters, and ord will return the ASCII code for that
character.
Example 2.20:
ord(#"a");
val it = 97 : int
ord(#"a") - ord(#"A");
val it = 32 : int
The latter example computes the difference between the ASCII codes for lower-
case a and capital A. This result is no coincidence. Every lower-case letter has
an ASCII code that is 32 more than its corresponding capital letter. □
Similarly, we can convert integers in the range 0 to 255 to characters. The
function chr performs this task as:
chr(97);
val it = #"a" : char
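Combining ord and chr with the offset of 32 noted in Example 2.20 gives a simple way to change the case of a letter. The following is a minimal sketch that assumes ASCII letters; the names lower, upper, and back are illustrative.

```sml
val lower = #"q"
val upper = chr(ord(lower) - 32)    (* #"Q": capital letters are 32 lower in ASCII *)
val back  = chr(ord(upper) + 32)    (* #"q" again *)
```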
2.2.4 Coercions Between Strings and Characters
If we have a character, we can convert it to a string of length one with the
operator str. That is,
str(#"a");
val it = "a" : string
However, conversion from strings to characters is not so straightforward.
Part of the problem is that we have to deal with strings that are not of length
one. ML provides several ways to make the conversion where it makes sense.
For example, we shall see the explode operator in Section 2.4.5, which converts
a string to a list of characters.
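The conversions in both directions might look like this. Note that String.sub is a Basis Library function that the text has not introduced; it is used here only as one possible way to pick a single character out of a string, and the variable names are illustrative.

```sml
val s     = str(#"a") ^ "bc"        (* "abc": a char turned into a string, then concatenated *)
val first = String.sub("abc", 0)    (* #"a": the character at position 0 *)
val chars = explode("abc")          (* [#"a", #"b", #"c"] *)
```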
2.2.5 Exercises for Section 2.2
Exercise 2.2.1: Write expressions to make each of the following conversions.
* a) Convert 123.45 to the next lower integer.
b) Convert -123.45 to the next lower integer.
c) Convert 123.45 to the next higher integer.
* d) Convert -123.45 to the next higher integer.
* e) Convert #"Y" to an integer.
f) Convert 120 to a character.
*! g) Convert #"N" to a real.
! h) Convert 97.0 to a character.
i) Convert #"2" to a string.
Exercise 2.2.2: The following expressions contain type errors. What are the
errors and how might we fix them?
* a) ceil(4)
b) if true then 5+6 else 7.0
* c) chr(256)
d) chr(~1)
* e) ord(3)
f) chr(#"a")
g) if 0 then 1 else 2
* h) ord("a")
2.3 Variables and Environments
In most languages, such as C or Pascal, computing takes place in an environment
consisting of a collection of “boxes,” usually called variables. Variables have
names and hold values. The name of a box is an identifier, which is a string of
characters (typically, letters and digits) that the language allows as the name of
a variable. There is usually a type associated with a variable, and the contents
of a “box” can be any value of the appropriate type. Pascal, C, and most other
languages allow variables of types integer, real, and many other types.
At any given time the set of values stored in the variables’ boxes constitutes
the store. In conventional languages, computation proceeds by side-effects,
that is, by changing the store. One of the interesting things about ML is
that it is impossible for the store to change, with a few exceptions, such as
arrays and references, that we shall introduce in Chapter 7. Rather, ML does
its computing by adding to the environment new value bindings, which are
associations between identifiers and values. The above brief overview of this
section is heady material, so let's start again from the beginning.
2.3.1 Identifiers
Identifiers are character strings with certain restrictions. Most languages allow
identifiers that are letters followed by any number of letters and digits. ML
allows these too, along with many other strings that are not identifiers in most
other languages. In ML, identifiers fall into two classes: alphanumeric and
symbolic. There is no difference in their use, with the exception of type vari-
ables, as described below, which are alphanumeric identifiers beginning with an
apostrophe.
Alphanumeric Identifiers
The alphanumeric class of identifiers consists of strings formed by
1. An upper case or lower case letter or the character ' (called apostrophe
or “prime”), followed by
2. Zero or more additional characters from the set given in (1) plus the digits
and the character _ (underscore).
However, identifiers beginning with the apostrophe ' are type variables. They
can only refer to types and cannot be bound to ordinary values.
Example 2.21: The following are examples of alphanumeric identifiers:
abc
X29a
Number_of_Hamburgers_Served
a'b'c
The following is a legal alphanumeric identifier: 'a. However, it cannot be
bound to values like 3, 4.5, "six", or any of the values we normally think of
as the values of variables. It can only be bound to a type. In fact, ML often
chooses the identifier 'a to represent the type of something whose value can be
of any type. For instance, 'a might in some contexts be given the type integer
as its “value.” Note that being bound to the type integer is quite different from
being bound to a particular integer like 3. □
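To see the alphanumeric identifiers of Example 2.21 in use, one might bind each of them to a value. The values and the name total are arbitrary choices made for this sketch.

```sml
val abc = 1
val X29a = 2
val Number_of_Hamburgers_Served = 3
val a'b'c = 4                       (* apostrophes are allowed after the first letter *)
val total = abc + X29a + Number_of_Hamburgers_Served + a'b'c    (* 10 *)

(* A declaration such as  val 'a = 3  would be rejected, because an
   identifier beginning with an apostrophe is a type variable. *)
```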
Symbolic Identifiers
Of all the characters we can type with a conventional keyboard, there are only
ten that cannot appear as part of some sort of identifiers. These ten characters
are the three kinds of pairs of parentheses (round, square, and curly), double
quote, period, comma, and semicolon. That is, the only characters that always
stand alone and cannot be part of an identifier are
( ) [ ] { } " . , ;
Of course the “white space” characters — blank, tab, and newline — also are
not part of identifiers. These do not have a meaning by themselves, but they
serve to separate the elements of a program.
The remaining 20 keyboard characters that cannot appear in alphanumeric
identifiers can be used to form symbolic identifiers. To be precise, the set of
characters for symbolic identifiers is
+ - / * < > = ! @ # $ % ^ & ` ~ \ | ? :
Many of these symbols by themselves are names of operators. For example, we
have seen the use of +, *, and several others. ML interprets the identifier + as a
special function that adds either two reals or two integers. More precisely, ML
initially binds the identifier + to the addition function. Similarly, ML initially
binds any other symbolic identifier that stands for an operator to the function
implementing that operator.
We are free in ML to form our own identifiers from strings of the 20 charac-
ters listed above. These identifiers might be used to name new operators that
we define, but they can also be used routinely to name integers, reals, and so
on.
Example 2.22: The following are legal symbolic identifiers: $$$, >>>=, and
!@#%. However, !@a is not a legal identifier because it mixes the characters !
and @ (which may only appear in symbolic identifiers) with the character a
(which can only be part of an alphanumeric identifier). □
Exercise Care Using Symbolic Identifiers
We advise against using symbolic identifiers to represent values of types
such as integers or strings. Besides looking strange, they often cause trou-
ble because they must be surrounded by white space to prevent them
from “attaching” to operators like + and forming unintended identifiers
that confuse the ML system and cause an error. That is, although << and
a are both legal identifiers, we must write << +a or << + a to add them.
Should we write <<+a, we get the error message:
Error: unbound variable or constructor: <<+
If we use symbolic identifiers as program variables at all, they should be
used as functions, so the above lexical confusion will not occur.
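The white-space issue can be tried out with a sketch such as the following, which binds a few symbolic identifiers to integers (something the text advises against in real programs). The values and the names sum1 and sum2 are illustrative.

```sml
val $$$ = 3
val >>>= = 4             (* the identifier is >>>= ; the separate = belongs to val *)
val << = 10
val a = 5

val sum1 = << + a        (* 15: white space keeps << and + apart *)
val sum2 = $$$ + >>>=    (* 7 *)

(* Without the space, <<+a would be read as the identifier <<+ followed by a,
   giving the "unbound variable or constructor" error quoted in the text. *)
```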
2.3.2 The Top-Level Environment
When we invoke ML, we are given the top-level environment (see Fig. 1.1) in
which to work. In this environment, the identifiers that have meaning to the
ML system are bound to these meanings. The entire contents of the top-level
environment is enumerated in Sections 9.2 and 9.3.
In Fig. 2.4 we suggest some of these identifiers. Environments will be rep-
resented as a table, with a left column for identifiers and a right column for
the associated value. At the bottom of Fig. 2.4 is the top-level environment.2
We see an entry for the identifier ^ to represent the function that concatenates
strings, and we see another named floor that represents the floor function
discussed in Section 2.2.2. There are other entries for all the operators and
functions that we have learned and those we have yet to learn.
In addition, we have shown some other identifiers to which common values
have been bound as additions to the top-level environment. In particular, we
have the identifier foo bound to the integer 3, an identifier bar with value
equal to the integer 490, and an identifier pi that is bound to the real number
3.14159.
2.3.3 An Assignment-Like Statement
It is possible to add an identifier to the current environment and bind it to a
value. To do so we use a “statement” called a val-declaration, whose simplest
form is
2 The top-level environment will always be at the bottom of environment diagrams, which
grow upward as new value bindings are made. The term “top-level” is thus unfortunate,
but we hope the reader will find the convention of adding new bindings on top of old ones
intuitively appealing.
Identifier | Value
foo        | 3
bar        | 490
pi         | 3.14159
^          | function to concatenate strings           (top-level environment)
floor      | function to compute the floor of a real   (top-level environment)
Figure 2.4: The top-level environment and some added user variables
val <identifier> = <expression>
That is, we use the keyword val, the identifier for which we wish to create a
value binding, an equal sign, and an expression that gives the value we wish to
associate with that identifier.
Example 2.23: Here is an example of how the identifier pi shown in Fig. 2.4
might have been added to the environment.
val pi = 3.14159;
val pi = 3.14159 : real
Notice that in response to the val-declaration of variable pi, ML responds with
the value of pi rather than with the value of it, as was the case in all previous
examples. Otherwise, the response to a val-declaration is the same as the
response to an expression.
• In general, responses to val-declarations tell us the identifiers that have
been bound to values and what those values are.
We might next define an identifier radius as:
val radius = 4.0;
val radius = 4.0 : real
3 The val-declaration is actually considerably more general, and in place of a single identifier
we can have arbitrary “patterns.” The matter is discussed further in Section 3.3.4.
Some Points About ML “Assignment”
• Remember to use the keyword val to cause a value binding to occur.
Assignment statements like x = y or x := y, familiar from
other languages, are errors in ML (with one exception, discussed in
Section 7.3.3).
• It is tempting to think of the equal-sign in a val-declaration as equivalent
to := in Pascal or = in C. However, these assignment operators
from other languages cause side-effects, namely the change in the
value stored in the place named on the left of the assignment operator.
In ML, the val-declaration causes a new entry in the environment
to be created, associating what is to the left of the equal sign with
the value to the right of the equal-sign. Example 2.24 illustrates this
point.
Now we have some variables, namely pi and radius, that we can use along
with constants to form expressions. For instance, we can write an expression
that is the familiar formula for the area of a circle:
pi * radius * radius;
val it = 50.26544 : real
Similarly, we could introduce another identifier, say area, and use a val-
declaration to give it a value.
val area = pi * radius * radius;
val area = 50.26544 : real
Note that in the above example, the expression supplying the value itself in-
volves variables and operators. In previous examples the “expression” was a
single constant. □
2.3.4 A View of ML Programming
We now have a rudimentary view of what ML programs look like. They are
sequences of definitions, such as the val-declaration that associates values with
identifiers (which are loosely the same as “program variables”). So far, we don’t
have any really interesting assignments to make; we can only bind values of a
basic type (e.g., real or string) to identifiers, and we can ask for the value of an
expression involving these identifiers and constants, by typing that expression.
In Chapter 3 we shall see how to give identifiers values that are functions and
how to apply functions to values in order to compute new values. When these
functions are recursive, we shall find ourselves programming in a mode that
gives us all the power of other programming languages, yet has a distinctive
flavor of its own.
It is natural to think of a val-declaration as an assignment, and often we shall
not go wrong if we do so. However, there is a subtle but important difference
in the way ML views what happens in response to a val-declaration. The next
example illustrates some of that difference.
Example 2.24: Suppose that after issuing the val-declarations of Example
2.23 we “redefine” radius to be equal to 5.0 by:
val radius = 5.0;
val radius = 5.0 : real
We might imagine that the entry in the environment for radius has had its
value changed from 4.0 to 5.0. However, the proper ML view is suggested in
Fig. 2.5. Below the top entry is the environment that existed before radius
was “assigned” 5.0. We do not show all the identifiers that ML defines for us
(e.g., +), but we concentrate on those we have defined: pi, radius, and area.
radius    5.0
area      50.2654   |
radius    4.0       | Existing before
pi        3.14159   | val radius = 5.0
Figure 2.5: The environment after redefining radius
The topmost entry in Fig. 2.5 is an addition to the environment that results
from the new val-declaration. We have shown in the current environment
two entries that are named by the identifier radius, but only the most recent
(upper) one is visible at this time. If we are running ML in interactive mode
and simply entering a sequence of val-declarations, then the earlier declaration
of radius cannot again become accessible through the current environment.
When we discuss functions and their effect on the environment in Section 3.2.1,
we shall see that it is sometimes possible to access a “buried” value binding
such as the lower entry for radius, just as it is in conventional languages such
as C or Pascal. □
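One observable consequence of this view is that a value computed from the old binding is not affected by the new one. The sketch below replays the declarations of Examples 2.23 and 2.24; the name newArea is an addition made for illustration.

```sml
val pi = 3.14159
val radius = 4.0
val area = pi * radius * radius       (* computed with radius = 4.0 *)

val radius = 5.0                      (* adds a new entry; the old one is hidden, not changed *)
val newArea = pi * radius * radius    (* computed with radius = 5.0 *)

(* area still has the value computed from 4.0: it was bound to a value,
   not to a formula that is re-evaluated when radius is rebound. *)
```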
Identifiers Do Not Have Fixed Types
Note that when creating an entry with an old name, as we did in Exam-
ple 2.24, there is no restriction that the new value be of the same type as
the old value. We could just as well have defined radius to be an integer
in Example 2.24, for instance:
val radius =
However, we then could not have used this variable radius in expressions
like pi * radius * radius, because of the type mismatch.
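If radius were rebound to an integer, the area formula could still be used by coercing with real, as in Section 2.2.2. In this sketch the integer value 5 is chosen purely for illustration.

```sml
val pi = 3.14159
val radius = 4.0
val radius = 5                                 (* same name, now an int *)
val area = pi * real(radius) * real(radius)    (* coerce before mixing with the real pi *)
```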
2.3.5 Exercises for Section 2.3
Exercise 2.3.1: Tell whether each of the following character strings is (i) an
alphanumeric identifier suitable for ordinary (nontype) values, (ii) a symbolic
identifier, (iii) an identifier that must represent a type as a value, or (iv) not
an identifier of ML.
* a) The7Dwarves
b) 7Dwarves
* c) SevenDwarves,The
d) 'SnowWhite'
*e) aceb
f) hurrah!
*g) #1
h) 7123
Exercise 2.3.2: Show the effect on the environment of making the following
sequence of val-declarations. Which variables are now accessible?
val a= 3;
val b = 98.6;
val a = "three";
val c = a^str(chr(floor(b)));
2.4 Tuples and Lists
So far we have seen five types that ML values may have: integer, real, string,
character, and boolean. Most languages start with a similar collection of types
and build more complex types with a set of operators called type constructors,
which are dictions allowing us to define new types from simpler types. For
example, Pascal has, among other type constructors,
1. The record. . .end notation to build record types, whose fields may be of
any type,
2. The ^ operator to build a type whose values are pointers to values of some
simpler type, and
3. The array constructor that defines an array type, given a type for elements
and an index type.
ML also has a number of ways to define new types, including datatype
constructions discussed in Section 6.2 that go beyond what we find in C, Pascal,
or most other languages. However, the simplest and possibly most important
ways of constructing types in ML are notations for forming tuples, which are
similar to record types in Pascal or C, and for forming lists of elements of a
given type. In this section we shall learn these notations and also cover the
most important operations associated with these types.
2.4.1 Tuples
A tuple is formed by taking a list of two or more expressions of any types, sepa-
rating them by commas, and surrounding them by round parentheses. Thus, a
tuple looks something like a record, but the fields are named by their position
in the tuple rather than by declared field names.
Example 2.25: In the following val-declaration we assign to variable t a tuple
whose first component is the integer 4, whose second component is the real 5.0,
and whose third component is the string "six".
val t = (4, 5.0, "six");
val t = (4, 5.0, "six") : int * real * string
Let’s try to understand the ML response. It repeats the fact that the value
of t is the one we just gave it, which should be no surprise. However, it uses
terminology we have not seen before in an ML response, as it describes the
type of t. Recall from Section 2.2.1 that the type int * real * string is
a product type. Its values are tuples that have three components. The first
component is an integer, the second is a real, and the third component is a
string. The operator * has a different meaning when applied to types than it
does when applied to integer or real values. Here * has nothing to do with
multiplication, but indicates tuple formation. □
In general, a product type is formed from two or more types T_1, T_2, ..., T_k
by putting *’s between them, as T_1 * T_2 * ... * T_k. Values of this type are
tuples with k components, the first of which is of type T_1, the second of type
T_2, and so on. Example 2.25 showed a case where k = 3, T_1 is int, T_2 is real,
and T_3 is string.
Example 2.26: Here are some further examples of tuples and their types.
1. (1,2,3,4) is of type int * int * int * int.
2. (1,(2,3.0)) is of type int * (int * real).
3. (1) is of type int. Strictly speaking, it is not a tuple, just a parenthesized
integer.
In (2) the tuple has two components, the first of which is an integer. The second
component is itself a tuple with two components: an integer and a real. This
grouping is reflected in the type description. □
The * operator applied to types is not an associative operator. For example,
int * (int * real) is not the same type as (int * int) * real. The latter
type describes tuples of two components, the first of which is a pair of integers
and the second of which is a single real. For example, ((1,2),3.0) is a value of
type (int * int) * real. Neither is the same as the type int * int * real,
which describes “flat” tuples like (1,2,3.0).
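The three groupings can be written down with explicit type annotations, which ML checks against the values. The names flat, left, and right are illustrative.

```sml
val flat  : int * int * real   = (1, 2, 3.0)      (* three components *)
val left  : (int * int) * real = ((1, 2), 3.0)    (* a pair, then a real *)
val right : int * (int * real) = (1, (2, 3.0))    (* an int, then a pair *)
```

Swapping any two of the annotations, for instance giving flat the type (int * int) * real, should be rejected as a type mismatch.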
2.4.2 Accessing Components of Tuples
Given a tuple or a variable whose value is a tuple, we can get any particular
component, say the ith, by applying the function #i.
Example 2.27: In Example 2.25, identifier t was bound to the tuple value
(4, 5.0, "six")
Now we can obtain its components. For example:
#1(t);
val it = 4 : int
#3(t);
val it = "six" : string
It is an error to apply a function like #4 that designates a component number
higher than the number of components the tuple has. □
Tuples can be likened to records whose field names are the numbers 1, 2, ....
In truth, tuples as we have defined them are a special, simplified case of a more
general record-structure construct that does allow the programmer to specify
names for fields. However, the tuple is adequate and quite convenient for most
purposes. We defer the more general case of record structures to Section 7.1.
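Component selectors can also be composed to reach inside nested tuples. A brief sketch, with illustrative names:

```sml
val t = (4, 5.0, "six")
val first = #1(t)               (* 4 *)
val third = #3(t)               (* "six" *)

val nested = (1, (2, 3.0))
val inner = #1(#2(nested))      (* 2: first component of the second component *)
```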
2.4.3 Lists
ML provides a simple notation for lists whose elements are all of the same type.
We take a list of elements, separate them by commas, and surround them with
square brackets.
Example 2.28: The list of three integers 1, 2, 3 is represented in ML by
[1,2,3]. The response of ML to an expression that is this constant value is
[1,2,3];
val it = [1,2,3] : int list
The response to our list expression is informative. In addition to the usual
repetition of the value in the expression, it assigns the list the type int list,
which is ML’s way of saying “list of integers.” □
• In general, “T list” is the type of a list of elements each of which is of
type T.
Example 2.29: In our second example, the list has a single element that is of
type string.
["a"];
val it = ["a"] : string list
The type attributed to the list expression is string list, or “list of strings.”
The fact that there is only one string in the list is irrelevant. The square brackets
differentiate the expression "a", which is of type string, from the expression
["a"], which is a list of strings that happens to have only one string on the
list. □
Example 2.30: Finally, here is an example where we erroneously try to mix
the types of elements of a list. We tried to write a list of three characters, but
we forgot the pound sign on the last one, so it became a string of length one,
instead of a character.
[#"a", #"b", "c"];
Error: operator and operand don't agree [tycon mismatch]
  operator domain: char * char list
  operand: char * string list
  in expression:
    #"b" :: "c" :: nil
We shall explain the error message after we have learned some of the notation
of lists in Section 2.4.4. □
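For comparison, here are two well-typed versions of the list from Example 2.30, one with every element a character and one with every element a string. The names chars and strings are illustrative.

```sml
val chars   = [#"a", #"b", #"c"]    (* char list: every element has the pound sign *)
val strings = ["a", "b", "c"]       (* string list: every element is a string of length one *)
```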
2.4.4 List Notation and Operators
In this section we shall learn several operators that involve lists. These include
notation for the empty list, the head and tail of a list, “cons” or construction
of a list from a head and tail, and concatenation of lists.
The Empty List
The empty list, or list of no elements, is represented in ML by either the name
nil or by a pair of brackets, [].
Head and Tail
Any list besides the empty list is composed of a head, which is the first element,
and a tail, which is the list of all elements but the first, in the same order.
Example 2.31: If L is the list [2,3,4], then the head of L is 2, and the tail
of L is the list [3,4]. If M is the list [5], then the head of M is 5, and the
tail of M is the empty list, or nil. □
We can get the head or tail of a list by applying the function hd or tl to
the list, respectively. The following restates Example 2.31 in a sequence of ML
expressions.
Example 2.32: Suppose we define lists L and M by the val-declarations
val L = [2,3,4];
val L = [2,3,4] : int list
val M = [5];
val M = [5] : int list
Now we can get the head and tail of each of these lists as follows.
hd(L);
val it = 2 : int
tl(L);
val it = [3,4] : int list
hd(M);
val it = 5 : int
tl(M);
val it = [] : int list
In the last of these expressions, ML describes the type of nil as int list. It
is possible for nil to be of any list type. In this case, since it is the tail of an
integer list, it is appropriate to assign it this type. □
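Because the tail of a list is itself a list, hd and tl can be composed to reach elements further along. A small sketch with illustrative names follows. Note that in the Basis Library, applying hd or tl to the empty list raises an exception rather than returning a value.

```sml
val L = [2, 3, 4]
val second = hd(tl(L))         (* 3: the head of the tail *)
val rest   = tl(tl(L))         (* [4] *)
val none   = tl(tl(tl(L)))     (* the empty list, of type int list *)
```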
Concatenation of Lists
While hd and tl take apart lists, there are also two operators that construct
lists: concatenation and cons. We consider each in turn.
The concatenation operator for lists, which is @, takes two lists whose ele-
ments are the same type and produces one list consisting of the elements of the
first list followed by the elements of the second. Thus
[1,2]@[3,4];
val it = [1,2,3,4] : int list
• Do not interchange the ^ operator, which is concatenation of strings, with
the @ operator, which is concatenation of lists.
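The two concatenation operators side by side, in a short sketch with illustrative names:

```sml
val nums = [1, 2] @ [3, 4]     (* [1,2,3,4]: @ joins two lists *)
val word = "ab" ^ "cd"         (* "abcd": ^ joins two strings *)
val same = [1, 2] @ nil        (* [1,2]: appending the empty list changes nothing *)
```

Writing "ab" @ "cd" or [1,2] ^ [3,4] would be rejected as a type mismatch.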
Cons
The cons operator, represented by a pair of colons (::), takes an element (the
head) and a list of elements of the same type as the head, and produces a
single list whose first element is the head and whose remaining elements are the
elements of the tail. Thus
2::[3,4];
val it = [2,3,4] : int list
The precedence of the :: and @ operators is below that of the additive
operators such as +, but above that of the comparison operators like <. Most
unusual is that these operators are right-associative, meaning that they group
from the right instead of the left as do most operators we have seen.
Example 2.33: Especially important about right-associativity of these oper-
ators is the interpretation of a cascade of cons operators, like
1::2::3::nil
This expression is grouped from the right, as 1::(2::(3::nil)). Expression
3::nil represents the list with head 3 and an empty tail, that is, [3]. Next,
2::[3] is the list whose head is 2 and whose tail is the list whose only element
is 3; this list is [2,3]. Similarly, the entire expression denotes the list [1,2,3].
Notice that when we have a sequence of cons operators, only the last operand
must be a list, such as nil in the example above. The other operands must be
elements. It would not make sense to group an expression like
1::2::nil
from the left, as (1::2)::nil, because 1::2 is a type mismatch. That
is, when the cons operator sees the left operand 1, it expects that the type of
the tail will be int list. Since the type of 2 is int, not int list, it is not
possible to apply :: to this pair of operands. □
The Types of Heads and Tails
• Remember that the types of the head and tail are different. If the
type of the head is T, then the type of the tail is “list of T,” or
T list in ML.
• Similarly, the cons operator :: takes a first argument that is of some
type T, and a second argument that is of type T list.
• On the other hand, the operator @ takes two arguments of type
T list for some type T.
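The right-associative grouping of :: and @ can be confirmed with a sketch like this one, in which every binding denotes the same list. The names a through d are illustrative.

```sml
val a = 1 :: 2 :: 3 :: nil           (* grouped from the right automatically *)
val b = 1 :: (2 :: (3 :: nil))       (* the same grouping made explicit *)
val c = 1 :: 2 :: [3]                (* the last operand may be any int list *)
val d = [1] @ [2] @ [3]              (* @ needs a list on both sides *)

(* val bad = (1 :: 2) :: nil   would be rejected, since 2 is not an int list *)
```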
Example 2.34: Let us reprise Example 2.30 and consider the meaning of the
error message that we saw there. We repeat the relevant part of Example 2.30
in Fig. 2.6.
   [#"a", #"b", "c"];
1) Error: operator and operand don’t agree [tycon mismatch]
2)   operator domain: char * char list
3)   operand: char * string list
4)   in expression:
5)     #"b" :: "c" :: nil
Figure 2.6: Error message from Example 2.30
ML parses lists from the back (i.e., the right end). It starts off assuming
a list is empty, i.e., nil. When it sees the last element, "c", it “conses” that
element with the list following, i.e., "c" :: nil to get a list of one element,
["c"]. Evidently, the type of this list is string list, since its one element is
a string.
Now, ML tries to attach the next-to-last element, #"b", as the head of a
list whose tail is ["c"]. But there is a type mismatch in the resulting list
#"b" :: ["c"]. That is, since the head is a character, the expected domain
of the operator :: is char * char list, i.e., a pair consisting of a character
(the head) and a list of characters (the tail).4 That is what line (2) of the error
message is telling us. However, as line (3) states, the actual type of the pair to
which the :: operator was applied is char * string list, i.e., a head that is a
character and a tail that is a list of strings. Lines (4) and (5) of the error message
confirm that the problem occurs in the expression #"b" :: "c" :: nil. □
2.4.5 Converting Between Character Strings and Lists
In ML, strings and lists are different types. However, there is a great similarity
between a string and a list of characters, and it is possible to convert between
the two representations using the built-in functions explode and implode. The
first of these takes a string and converts it to the list of characters appearing
in that string, in order.
Example 2.35: Here are two examples of the use of explode.
explode("abcd");
val it = [#"a", #"b", #"c", #"d"] : char list
explode("");
val it = [] : char list
Notice in the second example that "" is the empty string, which when exploded
yields an empty list of characters.
The function implode takes a list whose elements are characters and con-
catenates all the characters together to form a single string.
Example 2.36: Here are three examples of imploding lists.
implode([#"a",#"b",#"c",#"d"]);
val it = "abcd" : string
implode(nil);
val it = "" : string
implode(explode("xyz"));
val it = "xyz" : string
The second example points out that we can implode the empty list and get the
empty string. The third illustrates that implode and explode are inverses of
one another, and the effect of explode followed by implode on any string is to
return the string itself. □
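Converting to a list and back makes list operations available on strings. The sketch below uses tl from Section 2.4.4 and also rev, a top-level Basis Library function for reversing a list that this section does not cover; the variable names are illustrative.

```sml
val s = "abcd"
val roundTrip = implode(explode(s))         (* "abcd": explode then implode is the identity *)
val dropFirst = implode(tl(explode(s)))     (* "bcd": the string without its first character *)
val reversed  = implode(rev(explode(s)))    (* "dcba" *)
```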
The Type of the Empty List
Notice that in the second case of Example 2.35, ML deduced that the
empty list was of type char list, even though there are no elements in
the list. ML knows it is an empty list of characters, because explode
always returns a list of characters.
In general, the type of the empty list is 'a list, i.e., a list of elements
of any one type. Recall from Example 2.21 that identifiers beginning with
a quote mark denote types. Thus, 'a list is ML’s way of saying “any-
type list.” However, when the empty list appears as a value, the ML
system must be able to discover a concrete type for its elements — the
type that the elements would have if there were any elements. We shall
have more to say about the need to resolve types in Section 5.3.1.
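One way to see the empty list take on different concrete types is to annotate it, or to let the context decide. This is only a sketch with illustrative names.

```sml
val noInts    : int list    = []     (* empty list of integers *)
val noStrings : string list = nil    (* empty list of strings; nil and [] are interchangeable *)

val one   = 1 :: noInts              (* [1] *)
val word  = "a" :: noStrings         (* ["a"] *)
val chars = explode("")              (* empty, and typed char list by explode *)
```

Consing 1 onto noStrings would be rejected, even though both empty lists contain no elements.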
A third operator, similar to implode, works on lists of strings instead of
lists of characters. If L is a list of strings, then concat (L) produces the string
that is the concatenation of all the strings on L, in order.
Example 2.37: Here is an example of concat applied to a list of strings.
concat (["ab","cd","e"]);
val it = "abcde" : string
□
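To relate concat to the operators seen earlier, here is a hedged sketch that builds the same string three ways. The variable names are illustrative only.

```sml
(* Three ways to build the string of Example 2.37. *)
val viaConcat = concat(["ab","cd","e"]);
val viaCaret = "ab" ^ "cd" ^ "e";
val viaImplode = implode(explode("ab") @ explode("cd") @ explode("e"));
```

All three values are equal; concat is simply the most convenient form when the pieces are already collected in a list.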
2.4.6 Introduction to the ML Type System
Every programming language has a type system, that is, a collection of types
for its values and variables and a way of expressing those types. We have not
seen nearly all of the ML type system yet, but it is useful to observe the way
types and their representations are constructed. The type system of ML is
constructed from a basis of elementary types by applying certain type
constructors recursively. A type constructor is an operator that builds new types from
simpler ones. Here is what we have seen so far of the ML type system.
BASIS: We have seen the elementary types int, real, bool, char, and string.
INDUCTION: We have seen two type constructors:
1. The product-type constructor builds the types of tuples. If T1, T2, ..., Tn
are types, then T1 * T2 * ... * Tn denotes the type of a tuple whose
ith component has type Ti, for i = 1, 2, ..., n.
2. The list-type constructor list builds list types from element types. If T
is a type, then T list is the type of lists each of whose elements is of
type T.
Footnote: Recall that all binary operators in ML are perceived as applying to a single pair, rather
than to two arguments.
We may apply these type constructors in any order, as many times as we like,
to build new types of increasingly complex structure. In type expressions, list
is of higher precedence than *. Thus, we may need to use parentheses to group
operands properly.
Example 2.38: Here are some examples of constructed types and a typical
value for each.
1. Type expression int list is the type of a list of integers. It is the appropriate type
for values such as [1,2,3].
2. Type expression string * int list * int is the type for a tuple with
three components, whose types are respectively a string, a list of integers,
and a single integer. A typical value of this type is ("ab", [1,2,3], 4).
Note that in type expressions, list has higher precedence than *, so this
type expression is properly parsed string * (int list) * int, rather
than (string * int) list * int.
3. Type expression (int * int) list list is the type of a list of lists of
pairs of integers. An appropriate value for this type is
[[(1,2),(3,4)], [(5,6)], nil]
The list consists of three elements. The first element is the list consisting
of the pairs (1,2) and (3,4). The second element is the list with only one
element: (5,6). The third element is the empty list.
□
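The constructed types of Example 2.38 can be checked by attaching each type to a value with a colon, as in the sketch below. The value names, and the fourth declaration, are our own additions for illustration.

```sml
val ints : int list = [1,2,3];
(* list binds tighter than *, so this is string * (int list) * int. *)
val triple : string * int list * int = ("ab", [1,2,3], 4);
val nested : (int * int) list list = [[(1,2),(3,4)], [(5,6)], nil];
(* Parentheses are needed to get a list of pairs. *)
val pairs : (string * int) list = [("a",1), ("b",2)];
```

If a constraint did not match the value, ML would reject the declaration, so writing the type out is a quick way to test one's reading of a type expression.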
2.4.7 Exercises for Section 2.4
Exercise 2.4.1: What are the values of the following expressions?
* a) #2(3,4,5)
b) hd([3,4,5])
* c) tl([3,4,5])
d) explode("foo")
* e) implode([#"f", #"o", #"o"])
f) "co" ‘1
* g) ["c","o"]@["b","o","l"]
h) concat(["e","a","t"])
Exercise 2.4.2: What is wrong with each of the following expressions? If
possible, suggest an appropriate correction.
* a) #4(3,4,5)
b) hd([])
*! c) #1(1)
d) explode(["bar"])
* e) implode("b")
f) ["r"]::["a","e"]
* g) tl([])
h) 1@2
* i) concat([#"a",#"b"])
Exercise 2.4.3: Give the types of the following expressions.
* a) (1.5, ("3", [4,5]))
b) [[1,2], nil, [3]]
* c) [(2,3.5), (4,5.5), (6,7.5)]
d) ( "), Cni2, (1,2,3])
*! Exercise 2.4.4: Are (1,2) and (1,2,3) the same type? Are [1,2] and
[1,2,3] the same type?
Exercise 2.4.5: Give examples of appropriate values for each of the following
type expressions. Do not use the empty list as the value for any list component.
*a) int list list list
b) (int * char) list
* c) string list * (int * (real * string)) * int
d) ((int * int) * (bool list) * real) * (real * string)
*e) (bool * int) * char.
!f) real * int list list list list.
! Exercise 2.4.6: Using two of the operators we have learned in this section, it
is possible to convert a string of length one into the character of that string.
Show how to accomplish this transformation.
Chapter 3
Defining Functions
Now we know everything there is to know about ML, except how to program!
In this chapter we shall learn about defining and using functions. Essentially
all programming in ML is conducted by the definition of functions and the
application of these functions to arguments. As we shall see, ML uses functions
in places where more traditional languages use iteration (e.g., while-loops).
3.1 It’s Easy; It’s fun
The keyword fun introduces function definitions. In this section we shall see
the simplest form of function definitions, which are essentially single expressions
that are evaluated for the arguments of the function whenever the function is
called. Later sections discuss the more common forms of function definition,
involving the matching of arguments to patterns, and the use of temporary defi-
nitions in functions. We defer to Chapter 5 some of the more advanced concepts
regarding ML functions, such as polymorphic functions (those that can take ar-
guments of different types), higher-order functions (those that take functions
as arguments or produce functions as results), and Currying of higher-order
functions (writing a function so new functions may be created by instantiating
one of its arguments).
The simplest form of function declaration is
fun <identifier>(<parameter list>) = <expression>;
That is, the keyword fun is followed by the name of the function, a list of the
parameters for that function, an equal-sign, and an expression involving the
parameters. This expression becomes the value of the function when we give
the function arguments to correspond to its parameters.
Example 3.1: Let us define a function upper that converts a lower-case letter
to the corresponding upper-case letter.¹ To do so, we need to know that the
ASCII code for an upper-case letter is always 32 less than that of the lower-case
version. The function upper will perform the following steps:
1. A given character is converted to an integer,
2. 32 is subtracted, and
3. The result is converted back to a character.
Here is the definition of function upper:
fun upper(c) = chr(ord(c)-32);
val upper = fn : char -> char
There are a number of observations we should make about the function
upper. First, it has one parameter c. The value of upper is computed by the
expression chr(ord(c)-32). We discuss the ML response to the definition of
upper in Section 3.1.1.
We may use function upper to convert lower-case letters, just as we would
in most languages, by applying upper to the desired letter. For instance, we
can convert #"a" to #"A" by:
upper(#"a");
val it = #"A" : char
ML responds to expression upper(#"a") as it would to any expression, by
assigning its value to it and telling the value. □
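The function upper shifts every character by 32, even characters that are not lower-case letters. A hedged sketch of a more careful version follows; safeUpper is a hypothetical name of our own, and Char.toUpper is the library function from the Char structure, assumed to be available in an ML97 system.

```sml
fun upper(c) = chr(ord(c) - 32);

(* Hypothetical helper: convert only characters in the range a..z,
   and return every other character unchanged. *)
fun safeUpper(c) =
    if #"a" <= c andalso c <= #"z" then upper(c) else c;

val letter = safeUpper(#"a");        (* a lower-case letter is converted *)
val other = safeUpper(#"?");         (* anything else is left alone *)
val fromLibrary = Char.toUpper(#"a");
```

On lower-case letters the three functions agree; they differ only in what happens to other characters.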
3.1.1 Function Types
Notice from Example 3.1 how ML represents function types. The response to
the definition of function upper was
val upper = fn : char -> char
In general, when a function is defined, ML does not respond with the value
of that function, which is hard to express other than by repeating the defini-
tion of the function. Rather, it responds with the type of the function. The
specification of the function type has the form:
fn : <domain type> -> <range type>
That is, the response to a function definition consists of:
1. The keyword fn.
2. A colon.
3. The type of the parameter(s), called the domain type for the function.
This type is char in Example 3.1. ML regards each function as having
one parameter, but the type of this parameter can be a product type. So
in practice there can be any number of parameters for a function.
4. The symbol ->.
5. The type of the result of the function, that is, the range type for the
function. In Example 3.1, the range type is also char, but it is common
for the domain and range types to differ. ML views each function as
returning a single value, but since this value may be a tuple, in effect a
function can return several items.

¹ There is actually a function toUpper that is available in a library of functions that ML calls
the Char “structure.” We shall cover structures in Section 8.2 and the particular structure
Char in Section 9.4.4. Function toUpper also has an inverse function, toLower, which converts
upper-case letters to their corresponding lower-case letters. Moreover, each of the functions
toUpper and toLower leaves intact those characters that are not lower- or upper-case letters,
respectively, which our simple function upper does not do.

Function Parameters and Arguments
In this book we call the variables to which a function is applied in its
definition the parameters, while the expressions to which the function is
applied in a function call are arguments. In other literature, one sometimes
sees the terms “formal parameters” and “actual parameters” where we use
“parameters” and “arguments.”

The operator -> is another way to construct types, just like * and the word
list. If T1 and T2 are types, then T1 -> T2 is the type of functions with domain
type T1 and range type T2, that is, functions which take an argument of type
T1 and return a result of type T2.
Operator -> is right-associative, so T1 -> T2 -> T3 is interpreted as
T1 -> (T2 -> T3)
and is the type of a function whose parameter is of type T1 and whose result
is itself a function; that function has domain type T2 and range type T3. The
notion of a function producing a function as a value may seem strange, but
these “higher order functions” are an integral part of ML programming that we
shall examine starting in Section 5.4.
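The two kinds of function type discussed above can be compared in a short sketch. The function names are hypothetical, and the fn expression used in adder is a construct that the text introduces properly only later, so treat this as a preview rather than as the recommended style.

```sml
(* Domain is a product type: int * int -> int. *)
fun addPair(x:int, y:int) = x + y;

(* Result is itself a function: int -> int -> int,
   which ML reads as int -> (int -> int). *)
fun adder(x:int) = fn (y:int) => x + y;

val add5 = adder(5);     (* a function of type int -> int *)
val result = add5(10);
```

Applying adder to one argument yields a new function, which is exactly what the right-associative reading of -> predicts.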
3.1.2 Declaring Function Types
It might surprise the reader that we never had to declare the type of the param-
eter c in the function upper of Example 3.1 or the type of the value returned
by this function. ML deduced that these types are both char because of what
it knows about the functions ord and chr. In general ML does not require
declarations for types, although you are free to declare the type if you wish.
We shall have more to say about how ML deduces types in Section 3.2.4, and
there we shall get a better idea of when we can rely upon ML to deduce types
for us.

Don’t Confuse fun With fn
The response in Example 3.1 uses the keyword fn, which should not be
confused with fun, even though both are short for “function.” We use
fun to introduce a declaration of a particular identifier to be a certain
function, while fn is used in ML to introduce a value that has a function
type.

The most common situation in which we have to declare a type is when ML
would use the default rule for an arithmetic or comparison operator to deduce
that certain variables were of integer type, and yet we want these variables to
be of some other type on which the operator can be used. If we need to, we can
follow any variable or expression by a colon and a type. The effect is to declare
that variable or expression to have that type. Recall that the colon symbol is
also used in ML responses to connect values with their types.
Example 3.2: Our next example is a function that squares reals.
fun square(x:real) = x*x;
val square = fn : real -> real
The function square has one parameter, x. By following parameter x with
a colon and the type real, we declare to ML that the parameter of function
square is of type real. ML then infers that the expression x*x represents real
multiplication, and therefore the value returned by square is of type real.
It is necessary to indicate that x is real somewhere. Otherwise, ML will use
the default type, integer, for x, resulting in a function that can square integers
but not reals:
fun square(x) = x*x;
val square = fn : int -> int
We could have attached the :real to any or all of the three occurrences of
x in the definition of square in Example 3.2. For example
fun square(x) = (x:real)*x
is a possibility. However:
• We must be careful to parenthesize the arguments of the colon operator,
for example, (x:real), because the colon has lower precedence than
the arithmetic or comparison operators.
Example 3.3: Some care must be exercised in how we specify the types of
variables in a function definition. Here is an example of a surprising error that
can occur if we do not group a variable with its type properly.
fun square(x) = x:real * x;
Error: unbound type constructor: x
Here, because * has higher precedence than :, ML has tried to “multiply”
real by the third of the three x’s before applying the operator :. That is not
as strange as it seems. ML knows real is a type, and * applied to types forms
a product type. That is, ML is trying to form a type consisting of pairs whose
first component is of type real and whose second component is of type x. But
it doesn’t know about any type named x, so it complains. The solution, which
we used following Example 3.2, is to parenthesize the x:real so ML will group
its operators as we intend. □
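As a sketch of the safe placements of the type constraint, the three definitions below all yield a function of type real -> real. The names squareA, squareB, and squareC are ours, chosen so the definitions can coexist.

```sml
(* Constraint on the parameter. *)
fun squareA(x:real) = x*x;
(* Constraint on the first operand, parenthesized. *)
fun squareB(x) = (x:real)*x;
(* Constraint on the second operand, parenthesized. *)
fun squareC(x) = x*(x:real);
```

In each case the parentheses keep the colon from absorbing the * that follows it, which is the error illustrated in Example 3.3.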
3.1.3 Function Application
As an example of the use of the square function, suppose we have defined the
variables pi and radius to have values 3.14159 and 4.0, as in Example 2.23.
Then we can write
pi*square(radius);
val it = 50.2654 : real
In this example, function application looks just like it does in Pascal or most
languages; a function is applied to a list of arguments, with parentheses around
the argument list. However, as we discussed in the box on “Applying Functions,
ML Style” in Section 2.2.3, formally, the ML syntax for function application is
simply a pair of expressions standing next to one another, with no intervening
punctuation. That is, F E requires the expression F to be evaluated and
interpreted as a function. Then, expression E is evaluated and function F is
applied to the value of E.
Example 3.4: We could have computed the area of a circle by
pi * square radius;
val it = 50.2654 : real
Function application has higher precedence than any of the arithmetic opera-
tors, so the above expression first applies function square to argument radius,
and the result is multiplied by pi. □
In principle, it doesn’t matter whether or not we put parentheses around a
simple argument (i.e., an argument without operators); that is, f x and f(x)
are treated the same by ML. However, we advise using the parentheses. Not
only do parentheses make the syntax of function application look more familiar,
but sometimes they prevent an error such as failure to put parentheses around
an operand and its type, to which the operand is connected by the : symbol.
Example 3.5: To underscore the point that function application has higher
precedence than the common operators, consider the sequence of statements
below, ending in an application of the function square from Example 3.2.
val x = 3.0;
val y = 4.0;
square x+y;
val it = 13.0 : real
We see from the value produced that ML has grouped the function application
(square x)+y; that is, function square is applied to x before the addition with
y takes place. If we want to square the sum of x and y, then we are required
to use parentheses,
square(x+y);
val it = 49.0 : real
as we would in most other languages. □
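The two groupings of Example 3.5 can be placed side by side as named values. This is a minimal sketch; applyFirst and addFirst are illustrative names only.

```sml
fun square(x:real) = x*x;
val x = 3.0;
val y = 4.0;
(* Function application binds tighter than +: (square x) + y. *)
val applyFirst = square x + y;
(* Parentheses force the addition to happen first. *)
val addFirst = square(x + y);
```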
3.1.4 Functions With More Than One Parameter
We can define a function that has any number of parameters. Normally, we put
parentheses around the list of parameters or arguments, both in the function
definition and use. The effect is to combine the list of arguments into a tuple,
which formally is a single argument but which we may treat as if there were
several arguments. It is also possible to write multiparameter functions without
parentheses; see Section 5.5.
Example 3.6: Figure 3.1 is another example of a function; it produces the
largest of three real numbers. It begins by comparing parameters a and b in
line (2). If a is larger, it returns as a result the larger of a and c at lines (3)
and (4). If b is larger, then in lines (6) and (7) it returns the larger of b and c.
□
Example 3.6 brings up a number of important points about ML types.
• Notice that in Fig. 3.1 ML deduces that b and c are reals, even though
only a was declared. One way to make this deduction is to use the fact
that the if ... then ... else operator must have the same type in both
branches. We shall discuss type deduction further in Section 3.2.4.
(1) fun max3(a:real,b,c) = (* maximum of three reals *)
(2)     if a>b then
(3)         if a>c then a
(4)         else c
(5)     else
(6)         if b>c then b
(7)         else c;
val max3 = fn : real * real * real -> real
Figure 3.1: Function computing the maximum of its three arguments
Comments
Now that we can write programs of more than one line, we shall have
reason to comment our code. The proper way to do so is shown in line
(1) of Fig. 3.1. The pair of characters (* introduces a comment, which
continues, even across lines, until the matching sequence of two characters
*) is encountered. This convention is similar to most implementations
of Pascal, but in ML it is possible to nest pairs of (*...*), just as
parentheses are nested.
• Also notice that the type of max3 is that of a function that takes a triple of real
numbers as its argument and produces a real. That type is shown in the
response as real * real * real -> real.
• In type expressions * takes precedence over ->. Thus in the type expression
above, the domain type is real * real * real, and the range type
is real.
• If we did not declare the type of any variable in function max3, then ML
would assume the variables compared by > have the type integer, the
default type for operator >.
• One advantage of the ML view that functions have only one parameter
is that a variable whose value is of the appropriate product type can
be defined and used as the argument of a multiparameter function. For
example,
val t = (1.0,2.0,3.0);
max3(t);
is correct ML and produces the value 3.0.
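The points above about tuples as arguments and results can be sketched together. The function max3 is the one of Fig. 3.1; minMax is a hypothetical companion of our own that returns a pair, illustrating how a function can in effect return several items.

```sml
fun max3(a:real,b,c) =     (* maximum of three reals, as in Fig. 3.1 *)
    if a>b then
        if a>c then a else c
    else
        if b>c then b else c;

(* Hypothetical helper: returns the pair (smaller, larger). *)
fun minMax(a:real,b) = if a<b then (a,b) else (b,a);

val t = (1.0,2.0,3.0);
val biggest = max3(t);            (* a triple used as the single argument *)
val ordered = minMax(5.0,2.0);    (* a pair returned as the single result *)
val smallest = #1(ordered);
```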
3.1.5 Functions that Reference External Variables
The functions illustrated so far use only their parameters in computing their
result. Sometimes, we wish to write a function that uses previously defined
variables in its body. As in most other languages, the use of a variable x in
the definition of a function f “freezes” x as far as the function f is concerned.
That is, subsequent redefinitions of x will not affect the function f. A simple
example will illustrate the rule.
Example 3.7: Consider the following sequence of steps:
(1) val x = 3;
(2) fun addx(a) = a+x;
(3) val x = 10;
(4) addx(2);
val it = 5 : int
A picture of the changes to the environment is shown in Fig. 3.2. At line (1)
we create a variable x and give it the value 3. When at line (2) the function
addx is defined, its definition uses x. We suggested by an arrow in Fig. 3.2 that
the definition of addx refers to this value binding for x. Remember that, as
we suggested in Section 2.3.4, the value binding for x, after being added to the
environment at line (1), never changes. Thus, we can be sure that the definition
of addx will always use 3 as the value of x.²
Now, in line (3) we create a new variable, also named x. As we see in Fig. 3.2,
the new binding for x goes above the old binding for x and the definition of
addx. However, the definition of addx does not change; it continues to refer to
the value of x that pertained when the definition of addx was made. Thus, we
see that in line (4), addx(2) results in the value 5, not 12, because the value of
x in the definition of addx is still 3.
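To make the contrast concrete, the sketch below repeats the steps of Example 3.7 and then defines a second function after the new binding for x exists. The names addx10, first, and second are our own.

```sml
val x = 3;
fun addx(a) = a + x;      (* refers to the binding x = 3 *)
val x = 10;               (* a new binding; the old one is unchanged *)
val first = addx(2);      (* still uses x = 3 *)
fun addx10(a) = a + x;    (* defined later, so it refers to x = 10 *)
val second = addx10(2);
```

The two functions have identical bodies, yet they produce 5 and 12 respectively, because each refers to the binding for x that was current when it was defined.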
3.1.6 Exercises for Section 3.1
Exercise 3.1.1: Write functions to compute the following:
* a) The cube of a real number x
b) The smallest of the three components of a tuple of type int * int * int.
* c) The third element of a list. The function need not behave properly if
given an argument that is a list of length 2 or less.
² You may therefore wonder why we would want to write addx as we did. Indeed,
fun addx(a) = a+3 would be a simpler way to write the same function. There are some
good reasons to write functions that refer to external variables, however. For example, in
Section 8.1.2, we consider a collection of functions that all use the same external variable. If
we change this variable, all the functions change in a coordinated way.
x       10                    Added at line (3)
addx    definition of addx    Added at line (2); refers to the binding x = 3
x       3                     Added at line (1)
        Previous environment
Figure 3.2: Changes to the environment in Example 3.7
d) The reverse of a tuple of length 3.
* e) The third character of a character string. Your function need not behave
well on strings of length less than 3. Hint: Use explode and your function
from Exercise 3.1.1(c).
f) Cycle a list once. That is, given a list [a1, a2, ..., an], produce the list
[a2, a3, ..., an, a1].
! Exercise 3.1.2: Write functions to do the following.
* a) Given three integers, produce a pair consisting of the smallest and largest.
b) Given three integers, produce a list of the three in sorted order,
c) Round a real number to the nearest tenth.
! d) Given a list, return that list with its second element deleted. Your function
need not behave well on lists of length shorter than 2.
Exercise 3.1.3: Suppose we execute the following sequence of definitions:
val a =
fun f(b) = a*b;
val b=
fun g(a) = a+b;
Give the value of the following expressions:
* a) f(4)
When and Where Function Definitions Occur
The behavior of ML regarding the value of variables used in a function
definition is essentially the same as the policy followed by C or Pascal, or
most other languages. For example, in C a function’s definition can refer
to static variables defined prior to the function definition in whatever file
the function definition appears. That is, the variables a function definition
can use depends on where the definition appears in a C program.
It might appear that which variables are usable by a function defini-
tion in ML depends on when the function is defined. That is, the function
can refer to previously defined variables. However, the apparent difference
is caused by the fact that we have been thinking of ML programming as
done in interactive mode, where steps are entered one at a time. If we think
of what we type in interactive mode as a single file of program elements,
then we see that ML follows the same rule as C: you may use variables
that are located above the function definition in the file. This rule applies
exactly if we write ML programs in files and execute them using use, as
we discussed in Section 1.2. There are, however, two differences between
ML and C in this regard:
1. In C, the value of a variable can change, thus changing what the
function does; in ML the value cannot change.
2. In ML, it is possible to make several declarations for the same iden-
tifier, external to any function. In C, that would be considered an
illegal redefinition.
* b) f(4)+b
c) g(5)
d) g(5)+a
* e) f(g(6))
f) g(f(7))
3.2 Recursive Functions
It is possible, and indeed frequently necessary, for ML functions to be recursive,
that is, defined in terms of themselves, either directly or indirectly. In fact, re-
cursive functions in ML substitute for most of the iterations such as while-loops
or for-loops that one finds in C, Pascal, and most other languages. Looping
statements, while present in ML (see Section 7.3.4), are awkward and generally
discouraged.
When writing recursive functions, we must be careful that if a recursive
function calls itself, it does so with an argument that is, in some sense chosen by
the programmer, smaller than its own argument. For example, if the argument
is an integer i, we could safely call the function with argument i - 1 or any
integer smaller than i. If the argument is a list L, we could call the function on
the tail of the list or any shorter list.
Normally, a recursive function consists of
1. A basis, where for sufficiently small arguments we compute the result
without making any recursive call, and
2. An inductive step, where for arguments not handled by the basis, we call
the function recursively, one or more times, with smaller arguments.
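The basis and inductive step can be sketched for the two kinds of argument just mentioned, an integer and a list. Both functions are hypothetical illustrations of ours, and null is assumed to be the built-in test for the empty list.

```sml
(* Integer argument: the recursive call uses n-1, which is smaller than n.
   Assumes n >= 0; a negative argument would never reach the basis. *)
fun sumTo(n) =
    if n = 0 then 0              (* basis *)
    else n + sumTo(n-1);         (* inductive step *)

(* List argument: the recursive call uses the tail, a shorter list. *)
fun len(L) =
    if null(L) then 0            (* basis *)
    else 1 + len(tl(L));         (* inductive step *)
```

In both functions the argument of the recursive call moves strictly toward the basis, which is what guarantees that the recursion ends.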
In this section we shall learn about writing simple recursions. We then in-
troduce two extensions: nonlinear recursion, where the recursive function calls
itself several times, and mutual recursion, where several functions are defined
recursively in terms of each other. We begin with a simple example of a recur-
sion.
Example 3.8: Let us write a function reverse(L) that produces the reverse
of the list L.³ For example, reverse([1,2,3]) produces the list [3,2,1].
BASIS: The basis is the empty list; the reverse of the empty list is the empty
list.
INDUCTION: For the inductive step, suppose L has at least one element. Let
the first or head element of L be h, and let the tail or remaining elements of L
be the list T. Then we can construct the reverse of list L by reversing T and
following it by the element h.
For instance, if L is [1,2,3], then h = 1, T is [2,3], the reverse of T is
[3,2], and the reverse of T concatenated with the list containing only h is
[3,2]@[1], or [3,2,1].
(1) fun reverse(L) =
(2)     if L = nil then nil
(3)     else reverse(tl(L)) @ [hd(L)];
val reverse = fn : 'a list -> 'a list
Figure 3.3: A recursive function to reverse a list
³ ML actually has a built-in function rev that performs this operation.

When Does a Function Need to Know Its Type?
Given our discussion in Section 3.1.2, it might surprise you to find that
in Example 3.8 it was not necessary for the particular type of elements
to be deduced by the ML compiler. The difference between Example 3.8
and previous examples of functions that work on parameters of only one
type is that some functions use an overloaded operator such as + or < that
requires us to tell ML what type its operands have (or to use the default
type for the operator). In Example 3.8, there is no overloaded operator,
and thus, we were able to avoid specifying the types of elements of the
list. We shall discuss in Section 5.3 more about when a function needs to
know the type of its operands and when it can be “polymorphic,” working
on values of various types.

In Fig. 3.3 we see the ML definition of reverse that follows the basis and
inductive step described above. Lines (2) and (3) are the expression that forms
the body of the function definition. In line (2) we handle the basis case: the
reverse of the empty list is the empty list. Line (3) covers the inductive step, and
we should appreciate how succinctly and naturally it does so. The subexpression
reverse(tl(L)) takes the tail of the given list and reverses it, recursively. We
then concatenate this new list with the head element, which is obtained by
subexpression hd(L).
• In order to concatenate the reversed tail with the head element, we must
place square brackets around the head element, as [hd(L)]. Remember
that the concatenation operator @ requires two lists as its arguments. If
we were to omit the square brackets, we would be concatenating a list
and an element, leading to a type mismatch.
The response to the definition of reverse in Fig. 3.3 illustrates an interesting
point. Unlike our previous examples of functions, ML cannot tell exactly what
the types of the argument and result are. It can only deduce that these types are both
lists of elements of the same type. It calls the element type 'a,⁴ and it calls the
argument and result types 'a list. The type of reverse is then a function
from 'a lists to 'a lists.
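A hedged sketch of how the element type stays open: reverse is the function of Fig. 3.3, while reverse2 is a hypothetical variant of ours that tests for the empty list with the built-in function null instead of the comparison L = nil. Because reverse2 makes no equality comparison, it is assumed to accept lists of any element type, including reals.

```sml
(* As in Fig. 3.3: the comparison L = nil needs elements that admit equality. *)
fun reverse(L) =
    if L = nil then nil
    else reverse(tl(L)) @ [hd(L)];

(* Hypothetical variant: null makes no equality comparison. *)
fun reverse2(L) =
    if null(L) then nil
    else reverse2(tl(L)) @ [hd(L)];

val ints = reverse([1,2,3]);
val strings = reverse2(["a","b"]);
val reals = reverse2([1.5,2.5]);
```

The same definition serves integer, string, and real lists without any type declaration, which is what the type variable in the response expresses.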
3.2.1 Function Execution
Whenever a function is called, its arguments are evaluated, and an addition to
the environment is created that associates the resulting values with the param-
eters of the function. This style of argument passing is known as call-by-value.
⁴ Recall that identifiers beginning with a quote are variables denoting types. Actually, the
type variable used by ML in this example is ''a (i.e., two quotes before the a). There is a
subtle distinction between 'a and ''a, which we shall discuss in Section 5.3.4. Before then,
we shall use only 'a, 'b, and so on as variables denoting types.
It is the same as the manner by which arguments are passed to functions and
procedures in C, and the manner in which non-var parameters are handled in
Pascal.
When the function is executed, we place on top of the old environment
entries that bind the parameters of the function to their associated values. If
the function is recursive, new additions are built on top of the old ones for
each recursive call. Each addition binds the parameters of the function to the
argument values. These bindings intercept any reference to the parameters,
thus distinguishing themselves from the entries with the same identifiers in
levels below. When a function completes and returns its value, its addition to
the environment goes away, but the returned value is available for use in the
expression being evaluated.
Example 3.9: Suppose we are in an environment that has the definition of
the function reverse from Example 3.8. If we call
reverse([1,2,3])
then we add to the environment an entry for parameter L and its value. We
show this first step above the line in Fig. 3.4.
L          [1,2,3]                  Added in call to reverse([1,2,3])
reverse    definition of reverse    Environment before call
Figure 3.4: Environment after the initial call to reverse
With this value of L as argument, the condition of line (2) in Fig. 3.3 is
false; that is, L is not nil. Thus, we must evaluate the expression on line (3),
which requires us to evaluate reverse(tl(L)) or reverse([2,3]). Thus we
set up another call to reverse, adding to the environment a new binding for L
that associates L with the value [2,3].
In a similar manner, the new call to reverse causes us to make another call,
with L bound to [3], and an addition to the environment is set up with this
binding. Again a recursive call to reverse is necessary, and in the fourth call L
is bound to nil. The additions to the environment for all four calls are stacked
one above the other as suggested in Fig. 3.5. At this point, the identifier L
refers to the top binding, with value nil.
L          nil                      Added in call to reverse(nil)       (current view)
L          [3]                      Added in call to reverse([3])
L          [2,3]                    Added in call to reverse([2,3])
L          [1,2,3]                  Added in call to reverse([1,2,3])
reverse    definition of reverse    Environment before call
Figure 3.5: Additions to the environment when four calls to reverse are made
Now when we evaluate the body of reverse, the test of line (2) is satisfied,
because L has the value nil. The value nil is returned and used in place of
reverse(tl(L)) by the call below it, that is, by reverse([3]), to produce
its own answer on line (3). After the return, the top entry for L in Fig. 3.5
disappears, exposing the appropriate value of L, namely [3]. Since hd([3])
is 3, the result produced by reverse([3]) is the empty list concatenated with
[3], or just [3].
Now, the addition to the environment for reverse([3]) goes away, and its
result is used by the call below it: reverse([2,3]). That, in turn, produces
[3,2] as a result and its addition to the environment goes away, leaving the
environment that was originally shown in Fig. 3.4. However, the corresponding
call, reverse([1,2,3]), now receives the value [3,2], returned from above, to
use in place of reverse(tl(L)) in line (3). Thus the original call to reverse
is able to produce its value, [3,2,1]. At this point, all bindings for L have
disappeared. □
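The returns traced in Example 3.9 can be replayed one at a time, as in the sketch below. The names r0 through r3 are ours; each stands for the value returned by one of the four calls, from the innermost outward.

```sml
fun reverse(L) =
    if L = nil then nil
    else reverse(tl(L)) @ [hd(L)];

val r0 = reverse(nil : int list);   (* fourth call: the basis returns nil *)
val r1 = r0 @ [3];                  (* value of reverse([3]) *)
val r2 = r1 @ [2];                  (* value of reverse([2,3]) *)
val r3 = r2 @ [1];                  (* value of reverse([1,2,3]) *)
```

Each step appends the head that was waiting in the environment entry just exposed, so r3 equals the result of the original call.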
3.2.2 Nonlinear Recursion
The form of recursion illustrated in Examples 3.8 and 3.9 is relatively simple.
Each call either results in one recursive call with a smaller argument, or we
reach the basis case and there is no need for a recursion. Now we shall examine
a function where the recursion involves more than one recursive call.
The function combinations of m things out of n or “n choose m,” usually
written $\binom{n}{m}$, is the number of ways we can pick a set of m things out of n
distinct things. For example, two aces out of the four aces in a card deck can
be picked in six possible ways. That is, we can pick any of the four aces first
and any of the three remaining aces second. That looks like 12 ways, but in
fact we have picked each set in two different orders. For example, the aces of
spades and hearts could be picked spade-then-heart or heart-then-spade.
In general, $\binom{n}{m} = n!/((n-m)!\,m!)$, where $x!$ ($x$ factorial) is the product of
all the integers from 1 up to $x$. For instance,
$$\binom{4}{2} = 4!/(2!\,2!) = (4 \times 3 \times 2 \times 1)/(2 \times 1 \times 2 \times 1) = 6$$
Intuitively, $n!/(n-m)!$, which equals $n \times (n-1) \times \cdots \times (n-m+1)$, is the
number of ways we can select among n things for the first choice, then among
the n-1 remaining things for the second choice, and so on for m choices. We
must divide this number by m! because each set of m elements will have been
selected in m! different orders.
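The closed formula can be tried out directly in ML. The following is only a sketch: the names fact and choose are illustrative and do not come from the text, and the code assumes 0 <= m <= n and values small enough that the factorials do not overflow the integer type.

```sml
(* fact and choose are hypothetical helper names used for illustration *)
fun fact(n) = if n <= 1 then 1 else n * fact(n-1);

(* n choose m by the formula n!/((n-m)! m!); assumes 0 <= m <= n *)
fun choose(n, m) = fact(n) div (fact(n-m) * fact(m));

choose(4, 2);   (* the four-aces example: 24 div (2*2) *)
```

Because every intermediate factorial is computed in full, this version is practical only for small n.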
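There is also a natural recursive way to define $\binom{n}{m}$. Here are the basis and
induction rules.
BASIS: There are two parts to the basis. If m = 0, then the number of ways to
pick 0 things out of n is 1: don't pick anything. Thus, $\binom{n}{0} = 1$ for any $n \ge 0$.
Also, if m = n, then there is one way to pick all n things out of n: pick them
all. Thus, $\binom{n}{n} = 1$ for all $n \ge 0$.
INDUCTION: If 0 < m < n, then $\binom{n}{m} = \binom{n-1}{m} + \binom{n-1}{m-1}$. That is, we
either reject the first thing and pick m of the remaining n-1 things, or take the
first thing and pick m-1 of the remaining n-1. We consider only $0 \le m \le n$, so
this basis and induction entirely define the function.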
Example 3.10: We can write a function comb(n,m) that computes $\binom{n}{m}$. The
code appears in Fig. 3.6. Line (2) handles the basis case, and line (3) implements
the inductive step. Note that the program will not behave well if the assumption
about n and m in the comment of line (1) is violated. We really should test for
violations, and there is an important mechanism, the “exception,” that allows
us to do so and still adhere to the principle that functions return a value of one
particular type invariably. We discuss exceptions in Section 5.2. □
(1) fun comb(n,m) = (* assumes 0 <= m <= n *)
(2)     if m=0 orelse m=n then 1
(3)     else comb(n-1,m) + comb(n-1,m-1);
val comb = fn : int * int -> int
Figure 3.6: Function to compute n choose m
The sequence of recursive calls initiated by a single use of function comb is
rather complex. For example, in the expression
comb(4,2);
val it = 6 : int
the initial call first calls comb(3,2) and later calls comb(3,1). However, be-
fore the latter call, comb(3,2) calls comb(2,2) and comb(2,1), and so on.
Figure 3.7 shows the structure of the calls as time progresses from left to right.
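(Indented outline of the call tree; calls made earlier appear higher.)
    comb(4,2)
        comb(3,2)
            comb(2,2)
            comb(2,1)
                comb(1,1)
                comb(1,0)
        comb(3,1)
            comb(2,1)
                comb(1,1)
                comb(1,0)
            comb(2,0)
Figure 3.7: Structure of recursive calls for the comb function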
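One way to get a feel for how quickly the tree of calls grows is to count its nodes. The sketch below mirrors the structure of comb; the name calls is hypothetical and not part of the text. It adds 1 for the current call to the counts for its two recursive calls.

```sml
(* comb as in Fig. 3.6; assumes 0 <= m <= n *)
fun comb(n,m) =
    if m=0 orelse m=n then 1
    else comb(n-1,m) + comb(n-1,m-1);

(* calls is an illustrative name: the number of calls to comb,
   including the initial one, made when evaluating comb(n,m) *)
fun calls(n,m) =
    if m=0 orelse m=n then 1
    else 1 + calls(n-1,m) + calls(n-1,m-1);

calls(4,2);   (* 11 calls are needed to produce the value 6 *)
```

Each basis call contributes 1 to the final sum, so the number of calls is always 2*comb(n,m) - 1.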
3.2.3 Mutual Recursion
Occasionally, one needs to write two or more functions that are mutually re-
cursive, meaning that each calls at least one other function in the group. Most
languages, such as Pascal or C, put some obstacles in the way of writing such
functions, but ML has a straightforward mechanism. We shall give an example
of mutually recursive functions, first showing the problem that arises if we are
not careful. Then, we shall show how ML lets us handle the problem.
Example 3.11: Suppose we want to write a function that takes a list L as
argument and produces a list consisting of alternate elements of L. There are
two natural versions of this function. One, which we call take(L), takes the
first element of L and alternate elements after that (i.e., the first, third, fifth,
and so on). The other, which we call skip(L), skips the first element and takes
alternate elements after that (i.e., the second, fourth, sixth, and so on). It is
convenient to define these two functions in terms of each other.
BASIS: If L is empty, both functions return the empty list.
INDUCTION: If L is not empty, take returns the head element of L followed
by the result of applying skip to the tail of L. On the other hand, skip returns
the result of applying take to the tail of L.
Figure 3.8 shows a failed attempt to define the functions take and skip.
At the third line of take, we assemble the result using the cons operator; the
head is the head of L, and the tail is the result of applying skip to the tail of
L. The problem is that at the third line, the function skip is not defined, even
though we intend to define skip immediately thereafter. Thus, ML responds
with an error message. Defining skip first would cause a similar error because
take is used in the third line of skip. □
fun take(L) =
    if L = nil then nil
    else hd(L)::skip(tl(L));
Error: unbound variable or constructor: skip
fun skip(L) =
    if L = nil then nil
    else take(tl(L));
Figure 3.8: Erroneous attempt to define mutually recursive functions
We can get ML to wait until it has seen both functions take and skip before
trying to interpret variables, if we use the keyword and between the function
definitions. The general form for defining n mutually recursive functions is
shown in Fig. 3.9. There we see the n definitions connected by and’s. There is
one use of fun at the beginning and one use of the semicolon, at the end.
• Do not confuse and, which is used to indicate mutual recursion, with
  andalso, which is the logical AND operator in ML.
• It is not necessary to use the and construct if there is no mutual
  recursion. If we define functions $f_1, f_2, \ldots, f_n$, and, for each i, in the definition
  of $f_i$ we use only functions that appear earlier on the list (that is,
  $f_1, f_2, \ldots, f_{i-1}$), then there is no mutual recursion.
Example 3.12: The correct definition of the functions take and skip from
Example 3.11 is shown in Fig. 3.10. Notice that the response from ML does not
fun
    <definition of first function>
and
    <definition of second function>
and
    ...
and
    <definition of nth function>
;
Figure 3.9: Form of a mutually recursive function definition
come until after both functions have been seen. Both are identified as functions
from lists to lists. The elements of the input and output lists of both functions
must be of one type ''a, but ML cannot identify the type.
fun
    take(L) =
        if L = nil then nil
        else hd(L)::skip(tl(L))
and
    skip(L) =
        if L = nil then nil
        else take(tl(L));
val take = fn : ''a list -> ''a list
val skip = fn : ''a list -> ''a list
Figure 3.10: Correct definition of mutually recursive functions
Here are two examples of the use of these functions.
take([1,2,3,4,5]);
val it = [1,3,5] : int list
skip([#"a",#"b",#"c",#"d",#"e"]);
val it = [#"b",#"d"] : char list
When we use the functions on particular lists, ML can figure out the type of
list elements from the argument. Hence, the type of the result list is reported
with each use: int list in the first case and char list in the second. □
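As a second, smaller illustration of the and construct, here is a sketch of two functions on nonnegative integers that are defined purely in terms of each other. The names isEven and isOdd are hypothetical and do not appear in the text; the functions assume their argument is at least 0.

```sml
(* isEven and isOdd are illustrative names; both assume n >= 0 *)
fun
    isEven(n) =
        if n = 0 then true
        else isOdd(n-1)
and
    isOdd(n) =
        if n = 0 then false
        else isEven(n-1);

isEven(10);
isOdd(7);
```

As with take and skip, neither function could be accepted on its own, because each mentions the other before that other has been defined.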
3.2.4 How ML Deduces Types
ML is quite good at discovering the types of variables, the types of function
parameters, and the types of values returned by functions. The subject of how
ML does so is quite complex, but there are a few observations we can make
that will cover most of the ways types are discovered. Knowing what ML can
do helps us know when we must declare a type and when we can skip type
declarations.
1. The types of the operands and result of arithmetic operators must all
agree. For example, in the expression (a+b)*2.0, we see that the right
operand of the * is a real constant, so the left operand (a+b) must also
be real. If the use of + produces a real, then both its operands are real.
Thus, a and b are real. They will also have a real value any other place
they are used, which can help make further type inferences.
2. When we apply an arithmetic comparison, we can be sure the operands
are of the same type, although the result is a boolean and therefore not
necessarily of the same type as the operands. For example, in the expres-
sion a<=10, we can deduce that a is an integer.
3. In a conditional expression, the expression itself and the subexpressions
following the then and else must be of the same type.
4. If a variable or expression used as an argument of a function is of a known
type, then the corresponding parameter of the function must be of that
type. Similarly, if the function parameter is of known type, then the
variable or expression used as the corresponding argument must be of the
same type.
5. If the expression defining the function is of a known type, then the function
returns a value of that type.
6. If no way to determine the type of a particular use of an overloaded
operator exists, then the type of that operator is defined to be the default,
for that operator, normally integer.
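A few tiny definitions may make these observations concrete. This is only a sketch; the function names are hypothetical, and the types given in the comments are what we would expect an ML97 system to report.

```sml
(* Rule 1: the constant 2.0 forces a and b to be real.
   Expected type: real * real -> real *)
fun scale(a, b) = (a + b) * 2.0;

(* Rule 2: comparison with the integer 10 forces a to be an integer.
   Expected type: int -> bool *)
fun atMostTen(a) = a <= 10;

(* Rule 3: the then-branch is an integer, so y must be one too.
   Expected type: bool * int * int -> int *)
fun pick(c, x, y) = if c then x + 1 else y;

(* Rule 6: nothing determines the type of +, so it defaults to integer.
   Expected type: int * int -> int *)
fun add(a, b) = a + b;
```

Adding a type annotation such as (a:real) to add would be the way to obtain the real version instead of the default.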
Example 3.13: Consider the function comb(n,m) in Fig. 3.6, which we repro-
duce here for convenience.
(1) fun comb(n,m) = (* assumes 0 <= m <= n *)
(2)     if m=0 orelse m=n then 1
(3)     else comb(n-1,m) + comb(n-1,m-1);
In line (2), we see that in one branch of the if-then-else the result is the integer
1. Thus, the expression on line (3) must also be of type integer, and the function
comb returns an integer value. In line (3) we also see the expressions n-1 and
m-1. Since one operand of each subtraction is the integer 1, the other operands,
n in one case and m in the other, must also be integers. Thus, both parameters
of the function are integers, or strictly speaking, the (one) parameter of the
function is of type int * int, that is, a pair of integers.
Another way we could have discovered that m and n are integers is to look
at line (2). We see m compared with integer 0, so m must be an integer. We also
see n compared with m, and since we already know m is an integer, we know the
same about n. □
3.2.5 Exercises for Section 3.2
Exercise 3.2.1: Write the following recursive functions.
* a) The factorial function that takes an integer n ≥ 1 and produces the
product of all the integers from 1 up to n. Your function need not work
correctly if the argument is less than 1.
b) Given an integer i and a list L, cycle L i times. That is, if
$L = [a_1, a_2, \ldots, a_n]$
then the desired result is $[a_{i+1}, a_{i+2}, \ldots, a_n, a_1, a_2, \ldots, a_i]$. You may use
the function cycle defined in Exercise 3.1.1(f).
* c) Duplicate each element of a list. That is, given the list $[a_1, a_2, \ldots, a_n]$,
produce the list $[a_1, a_1, a_2, a_2, \ldots, a_n, a_n]$.
d) Compute the length of a list.
e) Compute $x^i$, where x is a real and i is a nonnegative integer. This function
takes two parameters, x and i, and need not behave well if i < 0.
*! f) Compute the largest element of a list of reals. Your function need not
behave well if the list is empty.
*! Exercise 3.2.2: In the following function definition
fun foo(a,b,c,d) =
if a=b then c+1 else
    if a>b then c else b+d
it is possible to deduce that a, b, c, and d are all integers. Explain how ML
makes these deductions.
Exercise 3.2.3: Suppose we define a function f by a statement that begins
fun f(a:int, b, c, d, e) =...
There is a function length in the ML top-level environment that performs this function;
the exercise asks you to write the function as if it were not already available.
Tell what can be inferred about the types of b, c, d, and/or e if the body of the
function is each of the following if-then-else statements:
*a) if a<b+c then d else e.
b) if a<b then c else d.
*c) if a<b then b+c else d+e.
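!d) if a<b then b
fun <identifier>(<first pattern>) = <first expression>
  | <identifier>(<second pattern>) = <second expression>
  ...
  | <identifier>(<last pattern>) = <last expression>;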
The identifiers must all be the same (they are each the name of the function),
and the types of the values produced by the expressions on the right of the equal-
signs must all be the same. Likewise, the types of the patterns themselves must
be the same, but they can differ from the type of the values produced.
As with functions in general, the parentheses around the patterns are op-
tional. However, the juxtaposition of expressions representing application of a
function to its arguments has higher precedence than any of the usual opera-
tors. Thus it is wise to put parentheses around patterns that are more complex
than a single variable. Otherwise, we run the risk that only the first part of the
pattern will be treated as the function argument, and an error will result.
ML goes through the various patterns in the order that they appear until
it finds one that matches its argument. The first match determines the value
produced; other patterns are not considered. Thus, there can be overlap among
the various patterns.⁶
• It is also legal to fail to cover all possible cases with the forms. However,
  you will get the diagnostic
    Warning: match nonexhaustive
You should then be very sure that the function will be used only with
arguments that match one of the patterns.
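Both points can be seen in a pair of small definitions. This is a sketch with hypothetical function names: isZero has overlapping patterns, since the argument 0 matches both, and firstElement deliberately leaves the empty list uncovered.

```sml
(* Overlap: 0 matches both patterns, but the first match wins,
   so the second pattern is only reached for nonzero arguments. *)
fun isZero(0) = true
  | isZero(n) = false;

(* No pattern covers nil, so we expect a warning that the match is
   not exhaustive when this definition is entered. *)
fun firstElement(x::xs) = x;

isZero(0);
firstElement([7,8]);
```

Applying firstElement to an empty list is accepted by the type checker but fails at run time with the exception Match; exceptions are discussed in Section 5.2.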
Example 3.15: Let us reconsider the function reverse from Example 3.8.
There are two patterns for the argument L. If L is empty it matches the
pattern nil. If L is not empty, it will match the pattern x::xs. For instance,
if the list has a single element, x becomes that element and xs gets the value
nil. A nonempty list cannot match nil, and x::xs does not match the empty
list, because there is no head element to give a value to x. (It is not possible to
give x the value nil because x is an element, not a list). Thus, the following
definition works:
fun reverse(nil) = nil
|   reverse(x::xs) = reverse(xs) @ [x];
val reverse = fn : 'a list -> 'a list
Compare this definition with the equivalent definition in Fig. 3.3.⁷ Here,
x plays the role of hd(L) and xs plays the role of tl(L). The above function
operates by first checking if its argument is nil and returning nil if so. If the
argument is not nil, then we can match x::xs to the argument; x acquires the
value of the head and xs acquires the value of the tail.
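For a side-by-side comparison, the sketch below gives both styles under different, hypothetical names. The if-then-else version is written on the assumption that Fig. 3.3 tests L = nil and uses hd and tl, as the calls traced earlier in this chapter suggest.

```sml
(* Assumed shape of the earlier definition: explicit test, hd and tl.
   Expected type: ''a list -> ''a list, because of the test L = nil. *)
fun reverseIf(L) =
    if L = nil then nil
    else reverseIf(tl(L)) @ [hd(L)];

(* Pattern-based definition from this example.
   Expected type: 'a list -> 'a list. *)
fun reversePat(nil) = nil
  | reversePat(x::xs) = reversePat(xs) @ [x];

reverseIf([1,2,3]);
reversePat([1,2,3]);
reversePat([1.0,2.0]);   (* lists of reals are fine for reversePat *)
```

Both produce [3,2,1] on [1,2,3]. The difference in the reported types is that reverseIf requires elements that can be compared for equality, which is why a list of reals is shown only for reversePat.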
⁶ However, SML/NJ, as a default, treats completely redundant patterns, that is, patterns
that can never be reached when the argument has any value that will match the pattern, as
an error.
⁷ There is actually a small difference between the two functions we called reverse, con-
cerning the types of the elements that may form the lists being reversed. We shall address
this distinction in Section 5.3.4.
Names for List Components in Patterns
It is conventional to use a pair of identifiers like x for the head of a list
and xs (read “exes”) for the tail of the same list. However, beware using
a::as. Since as is a keyword in ML (see Section 3.3.2), you will get a
strange diagnostic and must find another variable to use in place of as.
    (no bindings)      added in call to reverse(nil)
    xs   nil           added in call to reverse([3])
    x    3
    xs   [3]           added in call to reverse([2,3])
    x    2
    xs   [2,3]         added in call to reverse([1,2,3])
    x    1
    L    [1,2,3]       initial environment
Figure 3.11: Binding values to the identifiers of a pattern
Figure 3.11 suggests the addition to the environment that occurs when
reverse(L) is called, where L has the value [1,2,3]. Notice at the last call,
to reverse(nil), there are no bindings for x or xs because the pattern nil
matches the argument, and we never even try to match the second pattern. All
these additions to the environment go away when the initial call to reverse
completes.
3.3.2 “As” You Like it: Having it Both Ways
It is possible to take a single value and at one time give the value to an identifier
and match the value with a pattern. In the match, variables mentioned in the
pattern acquire their own values. The form is
    <identifier> as <pattern>
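Before the longer example that follows, here is a minimal sketch of the form in use. The function name dupHead is hypothetical; it puts a second copy of the head on the front of a nonempty list, using L for the whole list and x for its head.

```sml
(* dupHead is an illustrative name.  In the second pattern, L is bound
   to the entire argument while x and xs are bound to its head and tail. *)
fun dupHead(nil) = nil
  | dupHead(L as x::xs) = x::L;

dupHead([1,2,3]);   (* yields [1,1,2,3] *)
```

Without as, the right side would have to rebuild the list as x::x::xs.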
Example 3.16: Let us write a function merge(L,M) that takes two lists of
integers, L and M, that are sorted lowest-first, and merges them. That is,
merge produces a single sorted list with all the elements of L and M. The
following recursive definition of merge works, assuming that the given lists are
sorted. Note that, although no types are specified, integer type is inferred for
all values because that is the default type for <.
BASIS: If L is empty, then the merge is M. If M is empty, the merge is L.
INDUCTION: If neither L nor M is empty, compare the heads of L and M. If
the head of L, say x, is smaller, then the sorted list is x followed by the merge
of the tail of L with all of M. Note that in this case, x is the smallest of all the
elements, so x followed by the merge of the other elements will be the proper
sorted list.
If instead, the head of M, say y, is at least as small as x, then the merge is
y followed by the merge of L and the tail of M. Since y belongs at the head of
the result, the complete list will be sorted.
(1) fun merge(nil,M) = M
(2) |   merge(L,nil) = L
(3) |   merge(L as x::xs, M as y::ys) =
(4)         if x<y then x::merge(xs,M)
(5)         else y::merge(L,ys);
Figure 3.14: Summing the elements of a list of lists
Line (1) of Fig. 3.14 covers the case where the list of lists is empty and the
sum is 0. Line (2) covers the case where there is a first element on the list,
but that element is itself the empty list. In this case, we can dispense with
the head and just sum the integers on the lists of the tail. Line (3) covers the
case where there is at least one element on the list that is the head of the list
of lists. We take x, which is the head of the head, and add to it the result of
applying sumLists to the list in which the element x has been removed from
the first list, but all other lists are the same. For instance, if the entire list is
[[1,2],[3,4]], then the recursive call's argument is [[2],[3,4]]. □
As we learn about constructors and the creation of our own datatypes in
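The three cases just described can be sketched as follows. This is a reconstruction from the description rather than a copy of Fig. 3.14, so the variable names (in particular YS for the tail of the list of lists) are assumptions.

```sml
(* Line (1): empty list of lists.
   Line (2): the first list is empty, so drop it.
   Line (3): remove the head x of the first list and add it to the sum
             of what remains.
   Variable names are illustrative. *)
fun sumLists(nil) = 0
  | sumLists(nil::YS) = sumLists(YS)
  | sumLists((x::xs)::YS) = x + sumLists(xs::YS);

sumLists([[1,2],[3,4]]);   (* 1 + 2 + 3 + 4 *)
```

The parentheses around x::xs in the third pattern are needed; without them the pattern would group as x::(xs::YS).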
Section 6.2, we find there are many other ways to construct data structures
besides lists (which are constructed by the cons operator ::) and tuples (which
are constructed by parentheses and commas). All datatypes make patterns of
their own. However, there are some other patterns that make sense but are
illegal in ML. For example, we might expect to be able to construct patterns
using the concatenation operator @ or arithmetic operators. The next example
indicates what happens when we try to do so.
Example 3.20: We might expect to be able to break a list into the last element
and the rest of the list. For instance, we might try to compute the length of a
list by:⁸
fun length(nil) = 0
|   length(xs@[x]) = 1 + length(xs);
Error: non-constructor applied to argument in pattern: @
Error: unbound variable or constructor: xs
However, as we can see, the pattern xs@[x] is not legal and triggers two error
messages. The first message complains that @ is not a legal pattern constructor.
⁸ SML does provide a built-in function length that gives the length of a list. It may be
implemented by expressing a nonempty list as x::xs and returning 1+length(xs).
The second message is caused by the fact that, because the pattern is flawed,
variable xs does not get bound to a value. Therefore, when we encounter it
later, in the expression length(xs), ML has no value to use for xs.
Incidentally, we get a similar pair of error messages if we try to use an
arithmetic operator to construct a pattern. For instance,
fun square(0) = 0
|   square(x+1) = 1 + 2*x + square(x);
is equally erroneous, even though it is based on a correct inductive definition
of $x^2$. □
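For contrast, the legal way to take a list apart in a pattern is with the cons operator, splitting off the first element rather than the last. The sketch below follows the implementation suggested in the footnote to this example; the name len is hypothetical and is chosen to avoid hiding the built-in length.

```sml
(* len is an illustrative name.  The pattern x::xs uses the
   constructor ::, which, unlike @, is allowed in patterns. *)
fun len(nil) = 0
  | len(x::xs) = 1 + len(xs);

len([10,20,30]);
```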
As a final example of a nonpattern, a real constant cannot appear in pat-
terns. For instance, the following function definition
fun f(0.0) = 0
|   f(x) = x;
is regarded as syntactically incorrect because a real number is not permitted in
a pattern.⁹
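When a special case for a particular real value is wanted, one possibility is to test for it in the body of the function instead of in a pattern. The sketch below assumes the standard basis function Real.== is available in the ML system being used; the name recip and its treatment of 0.0 are purely illustrative.

```sml
(* recip is a hypothetical example: the reciprocal of x, except that
   0.0 is mapped to 0.0.  Real.== is assumed to be provided by the
   system's basis library. *)
fun recip(x:real) =
    if Real.==(x, 0.0) then 0.0
    else 1.0 / x;

recip(4.0);
recip(0.0);
```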
3.3.5 How ML Matches Patterns
A pattern, like any expression, can be represented by a tree. The outermost,
or highest-level, operator is the root of the tree, and it has one child for each
operand. The child for an operand is, in turn, the root of a subtree for that
operand. The basis case, an expression or subexpression that is a single constant,
or variable, is represented by a node labeled by that constant or variable.
Example 3.21: Consider the pattern expression
    (x::y::zs, w)
This expression has as outermost operator the pair-forming operator, which
we shall represent by (,). Its left operand is the subexpression x::y::zs, and
the right operand is the subexpression w. The latter is represented by a single
node labeled w. The former is grouped x::(y::zs) and is represented by a
tree with root operator ::, left child x (a single node) and right child the
root of a tree representing subexpression y::zs. The entire expression tree is
shown in Fig. 3.15(a); for the moment, ignore the curved lines connecting it to
Fig. 3.15(b).
Similarly, Fig. 3.15(b) represents the expression ([1,2,3,4], 5). The root
operator is again the pairing operator (,), and the right child of the root rep-
resents constant 5. The left operand is the list [1,2,3,4]. We build lists as
⁹ The reason for this seemingly strange restriction is that ML does not allow equality tests
between reals; see Section 2.1.4. Without such a test it is impossible to tell whether a given
real constant matches the real constant in a pattern.
        (,)                     (,)
       /   \                   /   \
     ::     w                ::     5
    /  \                    /  \
   x    ::                 1    ::
       /  \                    /  \
      y    zs                 2    ::
                                  /  \
                                 3    ::
                                     /  \
                                    4    nil
        (a)                     (b)
Figure 3.15: Matching a pattern to an expression
expressions by using the cons operator. Note that the last tail must be nil, not
the list consisting of the last element, so there are n uses of the cons operator
in a list of length n. □
To match a pattern and an expression, we overlay the pattern’s tree and
the expression’s tree, starting, as a basis step, by matching the roots. For
the inductive step, if we have matched nodes N and M of the pattern and
expression respectively, then the children of N and M must also be matched in
order.
However, sometimes a match will be impossible, and the pattern-match fails.
This situation occurs when we try to match a pattern node that is labeled by an
operator or constant, and the matching node of the expression has a different
label.
Example 3.22: If we try to match the pattern x::xs with the expression
nil, we must match operator :: with constant nil at the respective roots, and
we fail. If we try to match pattern x::y::zs with [1] (or as an expression:
1::nil), we match the roots with operators :: successfully. However, at the
right children we must match the second :: from the pattern with nil from
the expression, and thus we fail. □
If we successfully match the pattern with the expression, then any identifiers
at the leaves of the pattern tree match nodes that represent subexpressions.
These subexpressions become the values associated with those identifiers.
Example 3.23: Consider again Fig. 3.15. The pattern in Fig. 3.15(a) suc-
cessfully matches the expression in Fig. 3.15(b); the curved lines indicate the
correspondence of the nodes. As a result, the node labeled x in the pattern
corresponds to the node labeled 1 in the expression, so x acquires the value
1. The pattern node labeled y corresponds to expression node 2, and pattern
node zs corresponds to the expression node representing expression 3::4::nil,
or equivalently, the list [3,4]. Finally, the pattern node w corresponds to the
expression node 5. □
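The same match can be observed directly by using the pattern of Fig. 3.15(a) on the left side of a val-declaration, a use of patterns mentioned again in Section 3.4. This is a sketch; since the pattern does not cover every possible list, a warning about a nonexhaustive binding is to be expected.

```sml
(* Match the pattern (x::y::zs, w) against the value ([1,2,3,4], 5).
   The pattern would fail on a list with fewer than two elements. *)
val (x::y::zs, w) = ([1,2,3,4], 5);

(* After the match: x is 1, y is 2, zs is [3,4], and w is 5. *)
(x, y, zs, w);
```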
3.3.6 A Subtle Pattern Bug
Often we wish to use an identifier with a special meaning like nil in our pat-
terns. At this point we have few such special words. But beginning in Sec-
tion 6.2 we shall see that words of this type, called “data constructors,” can be
created by the programmer and used in patterns. A subtle bug arises when we
misspell such a word: the misspelled word is usually a legal identifier and looks
like a pattern that matches anything. SML/NJ
treats completely redundant patterns as an error, but other ML compilers may
issue only a warning and allow the function to be used.
(1) fun reverse(niil) = nil
(2) |   reverse(x::xs) = reverse(xs) @ [x];
(3) Error: match redundant
(4)       niil => ...
(5) -->   x :: xs => ...
Figure 3.16: The reverse function with a misspelling
Example 3.24: In Fig. 3.16 is the reverse function of Example 3.15, in which
we have misspelled nil as niil on line (1). We see in lines (3) through (5) the
SML/NJ response. The system has detected the pattern niil will match any
argument, and therefore the pattern x::xs on line (2) can never be reached. The
single arrow at the beginning of line (5) indicates which pattern is redundant. □
3.3.7 Exercises for Section 3.3
Exercise 3.3.1: Write the following functions from previous exercises, using
two or more patterns in each.
* a) The factorial function of Exercise 3.2.1(a).
b) The function from Exercise 3.1.1(f) that cycles a list one position. If the
list is empty, return the empty list.
c) The function from Exercise 3.2.1(b) that cycles a list i times, where i, as
well as the list, is a parameter.
* d) The function from Exercise 3.2.1(c) that duplicates each element of a list.
e) The function from Exercise 3.2.1(e) that computes $x^i$.
* f) The function of Exercise 3.2.1(f) that computes the largest of a list of
reals.
Exercise 3.3.2: Write a function that flips alternate elements of a list. That
is, given a list $[a_1, a_2, \ldots, a_n]$ as argument, produce $[a_2, a_1, a_4, a_3, a_6, a_5, \ldots]$. If
n is odd, $a_n$ remains at the end.
Exercise 3.3.3: Write a function that, given a list L and an integer i, returns
a copy of L with the ith element deleted. If the length of L is less than i, return
L.
Exercise 3.3.4: Show the sequence of calls to sumLists (as defined in Fig.
3.14) and the bindings to variables of patterns that occur when we call
sumLists([[1,2],nil,[3]])
Exercise 3.3.5: Does the pattern of Fig. 3.15(a) match the following expres-
sions? If so, give the value bindings for each of the variables x, y, zs, and
w.
*a) (Cra","
b) (C"a" ">I ,4.8)
* 0) ((81, (6,71)
ner] [ra te")
Exercise 3.3.6: Draw trees as in Fig. 3.15 to show how the pattern
[((x,y),zs)]
matches the expression [((1,2),3)].
* Exercise 3.3.7: There is a recursive definition of the square of a nonnegative
integer: $0^2 = 0$ (basis), and $n^2 = (n-1)^2 + 2n - 1$ (inductive step for n > 0).
Write a recursive function that computes the square of its argument using this
inductive formula.
* Exercise 3.3.8: Write a function that takes a list of pairs of integers, and
orders the elements of each pair such that the smaller number is first. Use the
as construct, so you can refer to the pair as a whole when it is not necessary
to change it.
Exercise 3.3.9: Write a function that takes a list of characters and returns
true if the first element is a vowel and false if not. Use the wildcard symbol
_ whenever possible in the patterns.
!! Exercise 3.3.10: The simple rule for translating into “Pig Latin” is to take
a word that begins with a vowel and add "yay", while taking any word that
begins with one or more consonants and transferring them to the back before
appending "ay". For example, "able" becomes "ableyay" and "stripe" be-
comes "ipestray". Write a function that converts a string of letters into its
Pig-Latin translation. Hint: Use explode and the function from Exercise 3.3.9
that tests for vowels.
Exercise 3.3.11: Suppose we represent sets by lists. The members of the set
may appear in any order on the list, but we assume that there is never more
than one occurrence of the same element on this list. Write functions to perform
the following operations on sets.
* a) member(x,S) returns true if element x is a member of set S; that is, x
appears somewhere on the list representing S.
b) delete(x,S) deletes x from S. Remember that you may assume that x
appears at most once on the list for S.
* c) insert(x,S) puts x on the list for S if it is not already there. Remember
that in order to preserve the condition that there are no repeating elements
on a list that represents a set, we must check that x does not already
appear in S; it is not adequate simply to make x the head of the list.
Exercise 3.3.12: Write a function that takes an element a and a list L of lists
of elements of the same type as a and inserts a onto the front of each of the
lists on the list L. For example, if a = 1 and L is [[2,3],[4,5,6],nil], then
the result is [[1,2,3],[1,4,5,6],[1]].
* Exercise 3.3.13: Suppose sets are represented by lists as in Exercise 3.3.11.
The power set of a set S is the set of all subsets of S. A set of sets can
be represented in ML by a list whose elements are lists. For example, if S
is the set {1,2}, then the power set of S is {∅, {1}, {2}, {1,2}}, where ∅ is
the empty set. This power set can be represented in ML by the list of lists
[nil, [1], [2], [1,2]]. That is, the elements of the lists are themselves lists,
each representing one of the subsets of S. Write a function that takes a list
as argument, representing some set S, and produces the power set of S. Hint:
Recursively construct the power set for the tail of the list and use the function
from Exercise 3.3.12 to help construct the power set for the whole list.
* Exercise 3.3.14: Write a function that, given a list of reals $[a_1, a_2, \ldots, a_n]$,
computes
$$\prod_{i<j} (a_i - a_j)$$
That is, we compute the product of all differences between elements, with the
element appearing later on the list subtracted from the element appearing first.
If there are no pairs, the “product” is 1.0. Hint: Start by writing an auxiliary
function that, given $a$ and $[b_1, b_2, \ldots, b_n]$, computes $\prod_{j=1}^{n} (a - b_j)$.
Exercise 3.3.15: Write a function to tell whether a list is empty. That is,
return true if and only if the argument is an empty list.¹⁰
Exercise 3.3.16: Explain how ML deduces that the function sumPairs of
Example 3.18 has domain type (int * int) list.
3.4 Local Environments Using let
Sometimes we need to create some temporary values — that is, local variables —
inside a function. The proper way to do so is with a let ... in ... end
expression. A simplified form of this expression, where only val-declarations
are used, is shown in Fig. 3.17.
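let
    val <first variable> = <first expression>;
    val <second variable> = <second expression>;
    ...
    val <last variable> = <last expression>
in
    <expression>
end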
Figure 3.17: Simple form of the “let” construct
That is, following the keyword let is a list of one or more val-declarations,
just like those introduced in Section 2.3.3. These are followed by the keyword
in. Following in is an expression that may use the variables defined after let.
This expression may also use any other variables accessible in the environment
in which the function using let is defined, provided their identifiers are not
redefined by the temporary declarations between let and in. The keyword
end completes the expression. Here are a few important points to remember
about let expressions:
• Semicolons following the declarations are optional. We shall adopt Pascal
  style and follow each but the last by a semicolon.
• Just as for val-declarations in the top-level environment, don't forget to
  use the keyword val.
• We must not omit the keywords in and end, which are as essential as the
  let.
• In truth, the let expression is more general than is suggested by Fig. 3.17,
  and any “declaration” can appear where we have shown val-declarations.
  So far, we have not seen any other kinds of declarations besides
  val-declarations and function declarations (with the keyword fun). However,
  there are several others; for example, we shall meet exception declarations
  in Section 5.2. The complete syntax for declarations is in Fig. 9.19.
• As another generalization, a pattern may appear in place of a single
  identifier in any val-declaration. Also, more than one expression may appear
  after the in, although the utility of an expression list will not become
  apparent until we study side-effects in Section 4.1.3.
¹⁰ There is a built-in ML function null that does this task. We should not use this function
in the solution.
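Two of the generalizations listed above can be sketched briefly. The function names are hypothetical. The first function uses a pattern in a val-declaration to bind two local variables at once; the second places a function declaration, rather than a val-declaration, between let and in.

```sml
(* A pair pattern on the left of val binds s and d together. *)
fun sumTimesDiff(a, b) =
    let
        val (s, d) = (a + b, a - b)
    in
        s * d
    end;

(* A local function declaration; sq is visible only inside the let. *)
fun fourthPower(x:real) =
    let
        fun sq(y:real) = y*y
    in
        sq(sq(x))
    end;

sumTimesDiff(5, 3);   (* (5+3) * (5-3) *)
fourthPower(2.0);
```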
3.4.1 Defining Common Subexpressions
One use of a let expression is to allow us to use common subexpressions. The
following example illustrates the technique.
Example 3.25: Suppose we want to compute the hundredth power of a number
x. We could write the expression x*x* ... *x (100 x's) if we had the patience,
but it is less tedious and less prone to error if we write the function in
Fig. 3.18.
fun hundredthPower(x:real) =
    let
        val four = x*x*x*x;
        val twenty = four*four*four*four*four
    in
        twenty*twenty*twenty*twenty*twenty
    end;
val hundredthPower = fn : real -> real
hundredthPower(2.0);
val it = 1.26765060022823E30 : real
hundredthPower(1.01);
val it = 2.70481382942153 : real
Figure 3.18: Raising a number to the 100th power
In Fig. 3.18 we define two local variables, four and twenty (no jokes about
blackbirds, please). We first define four to be $x^4$, and then define twenty to
be four raised to the fifth power, or $x^{20}$. Finally, we use twenty in the final
expression after the keyword in, which is twenty raised to the fifth power, or
$x^{100}$.
We then see two uses of this function, first computing $2^{100}$, which is about
$10^{30}$, and then computing $(1.01)^{100}$. The latter value is close to $e = 2.718\ldots$,
as it must be because $e$ is the limit as $n$ goes to infinity of $(1 + 1/n)^n$. □
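To make the saving concrete, the multiplications in Fig. 3.18 can be counted. This is a small worked tally, not part of the original example:

```latex
\begin{align*}
\textit{four}   &= x \cdot x \cdot x \cdot x = x^{4}      && \text{3 multiplications} \\
\textit{twenty} &= (x^{4})^{5} = x^{20}                   && \text{4 multiplications} \\
\text{result}   &= (x^{20})^{5} = x^{100}                 && \text{4 multiplications}
\end{align*}
```

The function thus uses 3 + 4 + 4 = 11 multiplications, where the written-out product of 100 copies of x would use 99.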
3.4.2 Effect on Environments of let
When we enter a let expression, an addition to the current environment is
created, adding value bindings for all the identifiers defined between the let
and the in.
twenty   1048576.0     added for let-expression
four     16.0          added for let-expression
x        2.0           added on call to hundredthPower
...                    environment before call to hundredthPower
Figure 3.19: Additions to environment when hundredthPower is called
Example 3.26: In Fig. 3.19 we see the situation when the function of Fig. 3.18
is called. The first addition is for the function call; it is a binding for the
parameter x. The next additions are for the let expression and include bindings
for the local variables four and twenty. We have shown x bound to the value 2.0
in the call and the local variables bound to their consequent values. As always,
when the function call returns, the additions to the environment disappear.
However, the returned value is made available as the value of the function in
the environment that results after the return. □
Example 3.27: We can rewrite Fig. 3.18 to use x not only as the argument
of the function hundredthPower, but also as both local variables. The function
then appears as in Fig. 3.20. It behaves exactly like the function of Fig. 3.18.
However, the additional bindings in Fig. 3.21 each associate the variable x with
a value. □
fun hundredthPower(x:real) =
    let
        val x = x*x*x*x;
        val x = x*x*x*x*x
    in
        x*x*x*x*x
    end;
val hundredthPower = fn : real -> real
Figure 3.20: Repeat of Fig. 3.18 with x used for all variables
x        1048576.0     added for second val-declaration
x        16.0          added for first val-declaration
x        2.0           added on call to hundredthPower
...                    environment before call to hundredthPower
Figure 3.21: Additions to environment corresponding to Fig. 3.20
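A short sketch may help show that these stacked bindings for x are temporary. It assumes a top-level binding of x to 3.0, which is not in the original example. Each val x inside the function hides the previous x only until the call returns, after which the top-level x is visible again, unchanged.

```sml
val x = 3.0;                       (* assumed top-level binding of x *)

fun hundredthPower(x:real) =       (* the parameter x hides the top-level x *)
    let
        val x = x*x*x*x;           (* new x is (parameter x)^4 *)
        val x = x*x*x*x*x          (* new x is (previous x)^5, i.e. the 20th power *)
    in
        x*x*x*x*x                  (* the 100th power of the parameter *)
    end;

val y = hundredthPower(2.0);       (* 2.0 raised to the 100th power *)
val stillThree = x;                (* the top-level x is still 3.0 *)
```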
3.4.3 Splitting Apart the Value Returned by a Function
Another important use of let expressions is when the result of a function has
components or parts that we want to separate before we use them. In particular,
when the type of the value returned by a function is a tuple, we can get at the
components by a more general form of val-declaration than we suggested was
possible in Fig. 3.17. Instead of a single identifier following the word val, we
can have any pattern. For instance, if a function f returns a three-component
tuple, we could write
val (a,b,c) = f(...)
and have the three components of the result of f bound to variables a, b, and
c, respectively. This approach is often more convenient than writing
val x = f(...)
which associates the entire tuple with x, and then extracting the individual
components with #i operators in subsequent val-declarations such as
val a = #1(x);
Patterns for Lists of Length 1
Note that the way we express “list of length 1” as a pattern is to put
square brackets around a single identifier, like [a] in line (2) of Fig. 3.22.
Such a pattern can only match a list with a single element, and variable a
acquires that element as its value.
Another way to express “list of length 1” is with the pattern a::nil.
Again, a acquires the lone element as its value.
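The two ways of taking apart a tuple-valued result can be compared side by side. In this sketch the function stats and both callers are hypothetical, chosen only to have a three-component result; the first caller uses a pattern in the val-declaration and the second uses the #i operators.

```sml
(* hypothetical function returning a three-component tuple *)
fun stats(x:real, y:real) = (x+y, x*y, x-y);

(* components obtained by a pattern in the val-declaration *)
fun viaPattern(x, y) =
    let
        val (s, p, d) = stats(x, y)
    in
        s*p*d
    end;

(* whole tuple bound to t, components extracted with #i *)
fun viaSelectors(x, y) =
    let
        val t = stats(x, y);
        val s = #1(t);
        val p = #2(t);
        val d = #3(t)
    in
        s*p*d
    end;

val r1 = viaPattern(3.0, 2.0);     (* 5.0 * 6.0 * 1.0 = 30.0 *)
val r2 = viaSelectors(3.0, 2.0);   (* the same value *)
```

Both functions compute the same result; the pattern version needs one declaration where the selector version needs four.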
Example 3.28: Let us implement a function split(L) that takes a list L
and splits it into two lists. One list consists of the first element, third element,
fifth element, and so on; the other list consists of the second element, fourth
element, sixth element, and so on. This function has an important application.
In tandem with the function merge of Fig. 3.12, it lets us write a function
mergeSort that is an efficient sorter of lists. We shall cover mergeSort next,
in Section 3.4.4.
We want the function split to produce a pair of lists. The recursion consists
of two basis parts and an inductive part.
BASIS: If L is empty, then produce a pair of empty lists. If L has a single
element, the first list of the pair produced has that element and the second list
is empty.
INDUCTION: If the given list has two or more elements, let the first two elements
be a and b. Recursively split the remaining elements into a pair of lists
(M,N). The desired result is the pair of lists (a :: M, b :: N). That is, the first
list has head a and tail equal to the first of the returned lists, and the second
has head b and tail equal to the second of the returned lists.
An ML implementation of split is shown in Fig. 3.22. Line (1) implements
the first part of the basis: return a pair of empty lists in response to the empty
list. Line (2) implements the second part of the basis, where the given list has
length 1.
Lines (3) through (5) handle the inductive case. The pattern a::b::cs in
line (3) can only match a list with at least two elements; a acquires the first
element as value, b acquires the second, and cs acquires the list of the third
and subsequent elements as its value. In line (4), we apply split recursively
to the third and subsequent elements; the result is bound to the pair (M,N).
That is, M is bound to the first component of the result, which is the elements in
positions 3, 5, 7, and so on of the original list. N acquires the second component
of the return value, which is the elements in positions 4, 6, 8, and so on from
the original list.
(1) fun split(nil) = (nil,nil)
(2) | split([a]) = ([a],nil)
(3) | split(a::b::cs) =
        let
(4)         val (M,N) = split(cs)
        in
(5)         (a::M, b::N)
        end;
val split = fn : 'a list -> 'a list * 'a list
split([1,2,3,4,5]);
val it = ([1,3,5],[2,4]) : int list * int list
Figure 3.22: Splitting lists
Finally, in line (5) we construct the return value for the present call to
split. The first component has head a — that is, the first element of the given
list — followed by M, the list of all the other odd-position components. Thus,
the first component is the odd-position elements in order. Similarly, the second
component b::N is all the even-position elements. □
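The recursion in split can be followed on a small input. The sketch below repeats the function of Fig. 3.22 without the line labels and traces one call in a comment; the second call, on a list of strings, is an added illustration that the same function works at other element types.

```sml
fun split(nil) = (nil,nil)
  | split([a]) = ([a],nil)
  | split(a::b::cs) =
      let
          val (M,N) = split(cs)
      in
          (a::M, b::N)
      end;

(* Trace of split([1,2,3,4,5]):
     split([1,2,3,4,5])   a=1, b=2, cs=[3,4,5]
       split([3,4,5])     a=3, b=4, cs=[5]
         split([5])       basis, line (2): ([5],nil)
       returns (3::[5], 4::nil)     = ([3,5],[4])
     returns (1::[3,5], 2::[4])     = ([1,3,5],[2,4])      *)
val (odds, evens) = split([1,2,3,4,5]);
val (first, second) = split(["a","b","c","d"]);
```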
3.4.4 Mergesort: An Efficient, Recursive Sorter
We can combine the functions merge of Fig. 3.12 with split of Fig. 3.22 to sort
lists of integers. This algorithm is one of the simplest ways to sort n elements
in time proportional to n log n. We shall not develop the analysis of this
algorithm here, but we shall complete the specification of the algorithm in ML.
The idea behind the mergesort algorithm is expressed in the following induction.
BASIS: If the given list L is empty or consists of a single element, then L is
surely sorted already, so just return L.
INDUCTION: If L has at least two elements, split L to produce the (approxi-
mately) half-size lists M and N. Recursively mergesort M and N. Then merge
the sorted lists M and N to produce the sorted version of L.
The function mergeSort is shown in Fig. 3.23. It must be preceded by
the functions merge and split to form the complete implementation of the
mergesort algorithm. Incidentally, ML discovers that mergeSort works only on
integer lists because it uses merge, which we wrote to work only for integer lists.
Lines (1) and (2) implement the basis; the remaining lines are for the induc-
tive step. Line (4) splits the given list. Lines (5) and (6) sort the half-sized lists,
and the result is produced by merging the sorted lists in line (7). Incidentally,
we could also have combined some steps by eliminating lines (5) and (6) and
replacing line (7) by merge(mergeSort(N),mergeSort(M)).
(1) fun mergeSort(nil) = nil
(2) | mergeSort([a]) = [a]
(3) | mergeSort(L) =
        let
(4)         val (M,N) = split(L);
(5)         val M = mergeSort(M);
(6)         val N = mergeSort(N)
        in
(7)         merge(M,N)
        end;
val mergeSort = fn : int list -> int list
Figure 3.23: Mergesort
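For readers who want to run mergeSort, here is a complete sketch. The merge function of Fig. 3.12 is not reproduced in this section, so the version below is an assumed stand-in that merges two sorted integer lists; split and mergeSort follow Figs. 3.22 and 3.23.

```sml
(* assumed stand-in for merge of Fig. 3.12: merge two sorted int lists *)
fun merge(nil, M) = M
  | merge(L, nil) = L
  | merge(x::xs, y::ys) =
      if (x:int) < y then x::merge(xs, y::ys)
      else y::merge(x::xs, ys);

fun split(nil) = (nil,nil)
  | split([a]) = ([a],nil)
  | split(a::b::cs) =
      let
          val (M,N) = split(cs)
      in
          (a::M, b::N)
      end;

fun mergeSort(nil) = nil
  | mergeSort([a]) = [a]
  | mergeSort(L) =
      let
          val (M,N) = split(L);        (* two lists of about half the length *)
          val M = mergeSort(M);        (* the new M hides the unsorted M *)
          val N = mergeSort(N)
      in
          merge(M,N)
      end;

val sorted = mergeSort([5,2,8,1,9,3]);
```

Note that lines (5) and (6) of Fig. 3.23 reuse the names M and N, in the same way that Fig. 3.20 reuses x.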
3.4.5 Exercises for Section 3.4
Exercise 3.4.1: Write a succinct function to compute 210.
Exercise 3.4.2: Rewrite Fig. 3.22 so line (4) does not use a pattern in the
val-declaration. That is, replace line (4) by val x = split(cs), and obtain
the components of pair x as needed.
Exercise 3.4.3: Improve upon the power-set function of Exercise 3.3.13 by
using a let expression and computing the power set of the tail only once.
Exercise 3.4.4: Improve upon the function of Exercise 3.2.1(e), to compute
the maximum of a list of reals, by using a let expression. Hint: Compute the
maximum of the tail of the list first.
Exercise 3.4.5: Write a function to compute $x^{2^i}$ for real x and nonnegative
integer i. You should make only one recursive call in your function. Hint: Note
that we can start with x and apply the squaring operation i times. For example,
when i = 3, we compute $((x^2)^2)^2$.
Exercise 3.4.6: Write a version of sumPairs of Example 3.18 that sums each
component of the pairs separately, returning a pair consisting of the sum of the
first components and the sum of the second components.
Exercise 3.4.7: Write a function that takes a list of integers as argument and
returns a pair consisting of the sum of the even positions and the sum of the
odd positions of the list. You should not use any auxiliary functions.
3.5 Case Study: Linear-Time Reverse
We have seen two versions of a function to reverse lists: first in Fig. 3.3 and
then in Example 3.15. These functions each seem simple enough, but they
suffer from a common flaw: they take time proportional to $n^2$ to reverse
lists of length n. In comparison, a well-designed reverse function, such as the
function rev in the ML top-level environment, can reverse lists of length n in
time proportional to n. In this section, we shall see how to write a list-reverse
function that is efficient and learn a general technique for programming with
lists as we do.
3.5.1 Analysis of Simple Reverse
Let us begin by understanding why a function like that of Example 3.15 takes
time proportional to $n^2$. The function is reproduced here for reference:
fun reverse(nil) = nil
| reverse(x::xs) = reverse(xs) @ [x];
Suppose T(n) is the time it takes reverse to work on a list of length n. We
can develop a recurrence relation, where T(n) is defined in terms of T(n − 1),
and then “solve” the equation for T(n), to get an expression for T(n) in terms
of n alone (not T).
BASIS: The basis case is when n = 0; that is, the list is empty. In this case the
first pattern, nil, matches, and nil is returned. The whole process takes only
some constant amount of time, so we shall say T(0) = a for some constant a.
INDUCTION: Suppose n ≥ 1. There are a number of steps that the program
will go through to process a list of length n ≥ 1:
1. The first pattern doesn’t match, and it takes some constant amount of
time for the ML system to determine that nil doesn’t match the argu-
ment.
2. It takes another constant amount of time to match the pattern x::xs and
assign the head of the list to x and the tail to xs.
3. It takes time T(n − 1) to compute the value of reverse(xs). The reason
is that xs is surely a list of length n − 1 if x::xs is of length n.
4. To compute the return value reverse(xs) @ [x] requires that we copy
the list reverse(xs) and append the final element x as we do. This
process takes time proportional to n, the length of the resulting list.
The constant time taken by the first two steps is dominated by the linear
time taken by the last step. Thus, T(n) is approximately T(n − 1) + bn for
some constant b; the T(n − 1) represents the time for the recursive call and bn
represents the time for the other steps. The recurrence equation is thus:
$$T(0) = a$$
$$T(n) = T(n-1) + bn \quad \text{for } n = 1, 2, \ldots$$
There are several ways to solve this equation. Perhaps the simplest is to
check that $T(n) = a + bn(n+1)/2$ satisfies the equations and is therefore the
solution. Since $a + bn(n+1)/2$ is proportional to $n^2$ as n gets large, we see the
justification for our claim that reverse takes time proportional to $n^2$ on lists
of length n.
A more intuitive argument is to observe that on a list of length n, reverse
gets called recursively on lists of length n − 1, n − 2, and so on, down to 0. Each
call on a list of length i results in work bi for some constant b, except the call
on a list of length 0, which results in work a. The total work is thus
$$a + \sum_{i=1}^{n} bi = a + bn(n+1)/2$$
which, as we observed, is proportional to $n^2$.
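The check by substitution mentioned above can be written out as follows; it uses only the recurrence for T given in this section:

```latex
\begin{align*}
T(0) &= a + \frac{b \cdot 0 \cdot 1}{2} = a, \\
T(n-1) + bn &= a + \frac{b(n-1)n}{2} + bn
             = a + \frac{bn\,[(n-1) + 2]}{2}
             = a + \frac{bn(n+1)}{2} = T(n).
\end{align*}
```

Both equations of the recurrence are satisfied, so $a + bn(n+1)/2$ is indeed the solution.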
3.5.2 ML’s Representation of Lists
A better function for reversing lists can be designed if we use the cons operator,
::, instead of the concatenation operator @. It may not be obvious, but while
it takes time proportional to the length of the first list to concatenate lists, we
can cons a head and a tail in constant time. Thus, before giving the proper
design for reverse, we must understand something of how ML represents lists
internally.
Lists are represented in a conventional, linked list fashion, as suggested by
Fig. 3.24. Cells consist of a pair of pointers, the first to an element of the list
and the second to the next cell. If a list is bound to a variable L, then in the
ML environment there is an entry in which the identifier L is associated with a
pointer to the first cell of the list.
Figure 3.24: Representing a linked list
Suppose we wish to construct the list x: :xs given values of head x and tail
xs. We have only to create a new cell C. The first pointer of C points to the
head of the list, that is, to the value of x, and the second pointer in C points
to the value of xs. In this way, C becomes the first cell on the linked list that
represents the value of x::xs. The process is suggested by Fig. 3.25. Notice
that creating cell C and setting its pointers to refer to the values of x and xs
takes a constant amount of time, independent of how big the value of x or xs
is. There is no need for ML to “look inside” the values of the head or tail.
Similarly, we can invert this process. In constant time we can find the head and
tail of a list. No copying is necessary.
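As a small illustration, consing and its inverse can be tried at the top level. The sketch uses only :: and the built-in functions hd and tl; the sharing of cells described above is not observable from within ML, so the comments about cells restate the representation rather than test it.

```sml
val xs = [2,3,4];
val ys = 1::xs;       (* one new cell is built; its second pointer refers to xs *)
val h = hd(ys);       (* the head, 1, found from the first cell alone *)
val t = tl(ys);       (* the tail, [2,3,4], again without copying *)
```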
Figure 3.25: Applying the cons operator
3.5.3 A Reversal Function Using Difference Lists
There is a trick known to LISP programmers as difference lists, in which one
manipulates lists more efficiently by keeping, as an extra parameter of your
function, a list that represents in some way what you have already accomplished.
The idea comes up in a number of different applications; we hope that seeing
it used to reverse lists will illustrate the technique sufficiently that its use will
be apparent when you need it.
We design an auxiliary function rev1(L,M) whose job is to return $L^R M$,
that is, the reverse of list L followed by the list M (not reversed). Note that
we use the superscript R as a convenient way to indicate the reverse of a list.
If we wish to reverse list L, we have only to call rev1(L,nil). The result is
$L^R$ concatenated with the empty list, which is just $L^R$.
(1) fun rev1(nil, M) = M
(2) | rev1(x::xs, ys) = rev1(xs, x::ys);
val rev1 = fn : 'a list * 'a list -> 'a list
fun reverse(L) = rev1(L,nil);
val reverse = fn : 'a list -> 'a list
Figure 3.26: List reversal using difference lists
Figure 3.26 shows the function rev1 and its use to define a linear-time list-
reversal function. Line (1) of rev1 handles the basis case, when there is nothing
left to reverse. Then, the result is just a copy of the second argument.
Line (2) handles the inductive case, where we need to reverse a list of one or
more elements. We move the head of the list we need to reverse to the beginning
of the list that is not to be reversed. We then call rev1 recursively on the new
pair of lists. Eventually, all the elements of the first list are moved to the front
of the second list, in reverse order. At that point, the basis case applies and
the recursion ends.
To see why the technique works, suppose we call
rev1($[a_1, a_2, \ldots, a_n]$, $[b_1, b_2, \ldots, b_m]$)
Then the desired output is the list $[a_n, a_{n-1}, \ldots, a_1, b_1, b_2, \ldots, b_m]$. When we
move the head of the first list to the second, we call
rev1($[a_2, a_3, \ldots, a_n]$, $[a_1, b_1, b_2, \ldots, b_m]$)
The result of this call is $[a_n, a_{n-1}, \ldots, a_2]$ followed by the element $a_1$, followed
by $[b_1, b_2, \ldots, b_m]$, which is the result desired for the original call to rev1.
Example 3.29: Here is the sequence of calls that results when we try to reverse
the list [1,2,3]:
reverse([1,2,3])
rev1([1,2,3], nil)
rev1([2,3], [1])
rev1([3], [2,1])
rev1(nil, [3,2,1])
At this point, the basis applies, and the result [3,2,1] is produced. □
3.5.4 Analysis of Fast Reverse
We can argue that the reversal program of Fig. 3.26 takes time proportional to
the length of the list, as follows.
1. Function rev1 calls itself with a first argument that is shorter by 1 than
its parameter, so with a first argument of length n, rev1 makes n recursive
calls.
2. Each recursive call to rev1 takes a constant amount of time to break apart
a head and tail and then cons a head and tail, until we get to the basis
case at line (1) of Fig. 3.26. The basis case also takes constant time.
3. Thus, rev1 takes time proportional to the length of its first argument.
4. The time taken by reverse on a list of length n is essentially the time
taken by rev1 when given a first argument of length n. Thus, reverse
takes time proportional to the length of the list it is to reverse.
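The difference between the two reversal functions can be observed by counting list cells instead of measuring time. The sketch below is an instrumented variant, not the book's code: app, slowCount, rev1Count, and upto are hypothetical names, and each function returns its result paired with the number of cons cells it built.

```sml
(* app(L,M) returns L@M together with the number of cells built *)
fun app(nil, M) = (M, 0)
  | app(x::xs, M) =
      let
          val (R, c) = app(xs, M)
      in
          (x::R, c+1)
      end;

(* the simple reverse of Section 3.5.1, instrumented *)
fun slowCount(nil) = (nil, 0)
  | slowCount(x::xs) =
      let
          val (R, c1) = slowCount(xs);
          val (S, c2) = app(R, [x])
      in
          (S, c1 + c2 + 1)          (* the extra 1 is the cell for [x] *)
      end;

(* rev1 of Fig. 3.26, instrumented with a counter argument *)
fun rev1Count(nil, M, c) = (M, c)
  | rev1Count(x::xs, ys, c) = rev1Count(xs, x::ys, c+1);

(* upto(n) is the list [n, n-1, ..., 1] *)
fun upto(0) = nil
  | upto(n) = n::upto(n-1);

val (r1, slow) = slowCount(upto(100));          (* slow = 100*101/2 = 5050 *)
val (r2, fast) = rev1Count(upto(100), nil, 0);  (* fast = 100 *)
```

The counts follow the two analyses: the simple reverse builds n(n+1)/2 cells on a list of length n, while the version with the extra parameter builds n.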
3.5.5 Exercises for Section 3.5
Exercise 3.5.1: Write a function cat(L,M) that produces the concatenation
L@M of the lists L and M. However, your function should not use the @ operator;
only the cons operator :: should be used. Your function must run in time
proportional to the length of L, independent of M.
Exercise 3.5.2: Write a function cycle(L, i) that cycles list L by i positions,
as in Exercise 3.2.1(b). However, your function must take time proportional to
the length of L (which we assume is at least i). Hint: You need to break this
function into a sequence of steps performed by auxiliary functions.
3.6 Case Study: Polynomial Multiplication
In this section we shall show one useful way to represent polynomials in a single
variable. We shall consider ways to perform polynomial multiplication, which is
also the important signal-processing operation known as convolution. We begin
with some simple functions that get the job done, but take time proportional to
$n^2$ to multiply polynomials of length n. Then, we exhibit a more complicated
algorithm that multiplies polynomials in time proportional to $n^{1.59}$. We do not
show the algorithm that is asymptotically most efficient, the “Fast Fourier
Transform” approach. That algorithm takes time proportional to n log n.¹¹
¹¹ See Aho, Hopcroft, and Ullman, Design and Analysis of Computer Algorithms, Addison-Wesley,
1974, for a discussion of efficient polynomial multiplication, including both the FFT
and the Karatsuba-Ofman approach discussed in Section 3.6.5.
3.6.1 Representing Polynomials by Lists
We shall use lists of reals to represent polynomials by their coefficients, lowest
degree first. For instance, the polynomial $x^3 + 4x - 5$ is represented by the list
[~5.0, 4.0, 0.0, 1.0]. In general, the polynomial $\sum_{i=0}^{n} a_i x^i$ is represented
by the list of n + 1 elements $[a_0, a_1, \ldots, a_n]$. Conventionally, we shall take the
empty list to represent the polynomial 0, but this polynomial also has other
representations such as [0.0] and [0.0, 0.0].
An important observation is that if L is a list representing polynomial P,
and L is of the form a :: M (that is, L has head a and tail M), and the tail
represents polynomial Q, then $P = a + Qx$. That is, multiplication by x in effect
shifts the elements of the corresponding list one position right. For instance, if
$$P = x^3 + 4x - 5$$
then we observed that the representing list is [~5.0, 4.0, 0.0, 1.0]. Thus,
a is ~5.0 and M is [4.0, 0.0, 1.0]. M represents the polynomial $Q = x^2 + 4$.
Note that $P = a + Qx$, that is, $P = -5 + (x^2 + 4)x$.
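The observation can be tried directly on the example list. This is a minimal sketch; the value names P, a, M, xTimesQ, and rebuilt are chosen here for illustration.

```sml
val P = [~5.0, 4.0, 0.0, 1.0];   (* x^3 + 4x - 5, lowest degree first *)
val a = hd(P);                   (* the constant term, ~5.0 *)
val M = tl(P);                   (* [4.0, 0.0, 1.0], that is, Q = x^2 + 4 *)
val xTimesQ = 0.0::M;            (* Qx: every coefficient moves one place right *)
val rebuilt = a::M;              (* a + Qx, the same list as P *)
```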
3.6.2 A Simple Polynomial-Multiplication Algorithm
In Fig. 3.27 we see three functions that perform common operations on polyno-
mials in this representation. The first, padd(P,Q), adds polynomials P and Q.
We recursively define the sum of two lists P and Q that represent polynomials
by:
BASIS: If either P or Q is the empty list, then the sum is the other. Note that
if both are empty, the result is the polynomial 0 represented by the empty list.
INDUCTION: For the induction, assume that neither list is empty. Suppose
P has head p and a tail representing polynomial R, while Q has head q and
a tail representing polynomial S. Then the sum P + Q is the list with head
element p+q and tail equal to the result of applying padd to the two tails. The
correctness of this rule is seen as follows. If $P = p + Rx$ and $Q = q + Sx$, then
$$P + Q = (p + q) + (R + S)x$$
In line (1) of Fig. 3.27 we see one part of the basis. Whenever the second
polynomial is the empty list, the result is the first polynomial. Line (2) handles
the other part of the basis. If the first polynomial is empty, the result is the
second.
If neither of the first two patterns matches the arguments, then it must be that
both polynomials are nonempty lists. Thus, in line (3) the pattern p::ps is sure
to match the first argument, and q::qs will surely match the second argument.
Notice we have attached type real to the variable p of this pattern. That is
enough for ML to figure out the type of all variables and to disambiguate the
use of + in line (3).
As a result of the match, p acquires the value of the first element of the
first polynomial, and q acquires the value of the first element of the second
polynomial. Their sum becomes the first element of the result, and padd is
applied to the tails to get the tail of the result.
(* padd(P,Q) produces the polynomial sum P+Q *)
(1) fun padd(P,nil) = P
(2) | padd(nil,Q) = Q
(3) | padd((p:real)::ps, q::qs) = (p+q)::padd(ps,qs);
(* smult(P,q) multiplies polynomial P by scalar q *)
(4) fun smult(nil,q) = nil
(5) | smult((p:real)::ps,q) = (p*q)::smult(ps,q);
(* pmult(P,Q) produces PQ *)
(6) fun pmult(P,nil) = nil
(7) | pmult(P,q::qs) = padd(smult(P,q), 0.0::pmult(P,qs));
Figure 3.27: Polynomial addition and multiplication
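A few sample calls may make the three functions of Fig. 3.27 more concrete. The definitions are repeated without the line labels so the sketch can be run as it stands; the particular polynomials are chosen here only for illustration.

```sml
fun padd(P,nil) = P
  | padd(nil,Q) = Q
  | padd((p:real)::ps, q::qs) = (p+q)::padd(ps,qs);

fun smult(nil,q) = nil
  | smult((p:real)::ps,q) = (p*q)::smult(ps,q);

fun pmult(P,nil) = nil
  | pmult(P,q::qs) = padd(smult(P,q), 0.0::pmult(P,qs));

(* (1 + 2x) + (3 + 4x + 5x^2) = 4 + 6x + 5x^2 *)
val sum = padd([1.0,2.0], [3.0,4.0,5.0]);      (* [4.0,6.0,5.0] *)

(* 2(1 + 2x + 3x^2) = 2 + 4x + 6x^2 *)
val scaled = smult([1.0,2.0,3.0], 2.0);        (* [2.0,4.0,6.0] *)

(* (1 + x)(1 + x) = 1 + 2x + x^2 *)
val square = pmult([1.0,1.0], [1.0,1.0]);      (* [1.0,2.0,1.0] *)
```

In the last call, the 0.0 consed onto pmult(P,qs) in line (7) is what shifts the recursive product one position right, that is, multiplies it by x.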
In lines (4) and (5) of Fig. 3.27 we see the function smult that multiplies a
polynomial P by a scalar q. That is, each term in the polynomial is multiplied
by q. The recursive definition of this operation is:
BASIS: If P is empty, then the product is the empty list representing 0.
INDUCTION: If P has head p, then the head of the result is pq. The tail of the
result is found by recursively applying smult to the tail of P and the scalar q.
Line (4) handles the basis and line (5) handles the inductive step. The
justification for this algorithm is that if $P = p + Rx$, then $Pq = pq + Rqx$.
Now let us consider the function pmult of lines (6) and (7) of Fig. 3.27. This
function multiplies polynomials P and Q using a recursion on the length of the
second polynomial.
BASIS: If the second polynomial is empty, then the result is empty.
INDUCTION: If the second polynomial Q can be written as $q + Sx$, then
$$PQ = Pq + PSx$$
The product Pq is a scalar multiplication. PS is a recursive application of the
polynomial multiplication with a smaller second argument.
The basis is implemented by line (6). In line (7) we see the inductive step.
smult(P,q) produces Pq, while pmult(P,qs) produces the polynomial product
we called PS in the inductive formula above. To multiply this product by ..
(p :: ps,n) => ...
The compiler has correctly pointed out that we have made an assumption
about the relationship between the arguments P and n of carve(P,n): n
will never be greater than the length of P. Thus, the first pattern looks
for n = 0, and the second pattern assumes that if n > 0, then P must
not be nil. If we were to call carve(nil,n), where n > 0, then neither
pattern would match and the function would fail. Fortunately, when we
use carve in the Karatsuba-Ofman algorithm, our assumptions are certain
to be met. However:
It is generally a bad practice to write functions whose patterns do
not cover all possible cases, even cases for which the function was
not intended.
of Section 3.6.3, the shifts and additions take time that is linear in the size of
the polynomials. That is, we can write a recurrence equation for T(n) the time
it takes to multiply polynomials of length n using Formula 3.1 directly:
T(n) = 4T(n/2) + bn
The solution to this equation is
$$T(n) = (a + b)n^2 - bn$$
Figure 3.29: Breaking polynomials into half-sized pieces
That is, T(n) is proportional to $n^2$, exactly as for the straightforward polynomial
multiplication method.
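The stated solution can be checked by substitution. The check below assumes the basis $T(1) = a$, which is the basis used for the analogous recurrence in Section 3.6.6; it is not restated next to the recurrence above.

```latex
\begin{align*}
T(1) &= (a + b) \cdot 1^2 - b \cdot 1 = a, \\
4\,T(n/2) + bn &= 4\left[(a + b)\frac{n^2}{4} - \frac{bn}{2}\right] + bn
                = (a + b)n^2 - 2bn + bn
                = (a + b)n^2 - bn = T(n).
\end{align*}
```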
To design a faster algorithm, we need to reduce the number of times we mul-
tiply half-sized polynomials. We can do so even at the expense of an increased
number of operations that take time linear in the size of the polynomials, such
as adding or subtracting polynomials, “shifting” (multiplying by a power of x),
or “carving” polynomials into two.
We can reduce the number of half-sized multiplications to three if we com-
pute TV and UW as in Formula 3.1, but write the middle term as:
TW +UV =(T+U)(V+W)-TV-UW (3.2)
Since TV and UW are already computed, Formula 3.2 uses only one additional
half-sized multiplication, (T+U) times (V+W), rather than the two additional
multiplications needed if we computed TW+UV directly. Notice that the fact that
Formula 3.2 uses two additions and two subtractions in place of a single addition
is not a real problem. Intuitively, multiplication takes time that grows faster
than linear in n, so the cost of the multiplications swamps out the cost of the
additions for large n.
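Formula 3.2 follows from expanding the product on its right side. The numeric instance below uses constant polynomials chosen here for illustration, namely T = 1, U = 2, V = 3, and W = 4:

```latex
\begin{align*}
(T + U)(V + W) - TV - UW &= (TV + TW + UV + UW) - TV - UW = TW + UV, \\
(1 + 2)(3 + 4) - 1 \cdot 3 - 2 \cdot 4 &= 21 - 3 - 8 = 10 = 1 \cdot 4 + 2 \cdot 3 .
\end{align*}
```

Only the single product (T+U)(V+W) is new; TV and UW are needed anyway.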
(* komult(P,Q) computes the product of polynomials PQ using
the Karatsuba-Ofman method that only calls itself three
times rather than four on half-sized polynomials. *)
(1) fun komult(P,nil) = nil
(2) | komult(nil,Q) = nil
(3) | komult(P,[q]) = smult(P,q)
(4) | komult([p],Q) = smult(Q,p)
(5) | komult(P,Q) =
        let
(6)         val n = length(P);
(7)         val m = length(Q);
(8)         val s = bestSplit(n,m);
(9)         val (T,U) = carve(P,s);
(10)        val (V,W) = carve(Q,s);
(11)        val TV = komult(T,V);
(12)        val UW = komult(U,W);
(13)        val TUVW = komult(padd(T,U), padd(V,W));
(14)        val middle = psub(psub(TUVW,TV), UW);
        in
(15)        padd(padd(TV,shift(middle,s)), shift(UW,2*s))
        end;
Figure 3.30: The Karatsuba-Ofman multiplication algorithm
Figure 3.30 implements this idea in a recursive ML function. Lines (1) and
(2) handle the basis cases where one of the polynomials is the empty list. In
these cases, the empty list is returned. Lines (3) and (4) handle additional
basis cases where one of the polynomials is of length 1. Such a polynomial
is a constant, so we can use the linear-time scalar multiplication algorithm to
handle these cases.
Line (5) begins the inductive case. We use each of the auxiliary functions
from Fig. 3.28 at least once in a sequence of val-declarations. Lines (6) and (7)
compute the lengths of the two polynomials, and line (8) picks the value of s
using the bestSplit function. The role of s, the length of the low-order pieces
T and V, was illustrated in Fig. 3.29.
Then, lines (9) and (10) divide the two polynomials into low-order and
high-order pieces, as suggested by Fig. 3.29. Line (11) computes the first half-sized
product, TV, and line (12) computes the second, UW. Lines (13) and
(14) implement the expression of Formula 3.2. That is, line (13) computes
(T+U)(V+W), and line (14) subtracts from this expression the terms TV
and UW.
Finally, the result of the function is computed in line (15). This expression
implements Formula 3.1. However, the middle term, TW + UV, has been
computed by Formula 3.2, rather than directly.
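To try komult, its auxiliaries are needed. Fig. 3.28 is not reproduced in this section, so the versions of psub, shift, carve, and bestSplit below are assumed stand-ins written for this sketch and may differ from the book's definitions. In particular, this carve covers the case of an exhausted list, and this bestSplit simply halves the shorter length, which keeps both pieces of each polynomial nonempty when both lengths are at least 2.

```sml
(* from Fig. 3.27 *)
fun padd(P,nil) = P
  | padd(nil,Q) = Q
  | padd((p:real)::ps, q::qs) = (p+q)::padd(ps,qs);
fun smult(nil,q) = nil
  | smult((p:real)::ps,q) = (p*q)::smult(ps,q);

(* assumed stand-ins for the auxiliaries of Fig. 3.28 *)
fun psub(P,Q) = padd(P, smult(Q, ~1.0));       (* P - Q *)
fun shift(P,0) = P                              (* multiply P by x^n *)
  | shift(P,n) = 0.0::shift(P,n-1);
fun carve(P,0) = (nil,P)                        (* first n elements, and the rest *)
  | carve(nil,n) = (nil,nil)
  | carve(p::ps,n) =
      let
          val (front,back) = carve(ps,n-1)
      in
          (p::front,back)
      end;
fun bestSplit(n,m) = if n < m then n div 2 else m div 2;

(* Fig. 3.30 *)
fun komult(P,nil) = nil
  | komult(nil,Q) = nil
  | komult(P,[q]) = smult(P,q)
  | komult([p],Q) = smult(Q,p)
  | komult(P,Q) =
      let
          val n = length(P);
          val m = length(Q);
          val s = bestSplit(n,m);
          val (T,U) = carve(P,s);
          val (V,W) = carve(Q,s);
          val TV = komult(T,V);
          val UW = komult(U,W);
          val TUVW = komult(padd(T,U), padd(V,W));
          val middle = psub(psub(TUVW,TV), UW)
      in
          padd(padd(TV,shift(middle,s)), shift(UW,2*s))
      end;

(* (1 + 2x + 3x^2)(1 + x + x^2) = 1 + 3x + 6x^2 + 5x^3 + 3x^4 *)
val product = komult([1.0,2.0,3.0], [1.0,1.0,1.0]);
```

With these stand-ins, every recursive call receives polynomials strictly shorter than P and Q, so the recursion reaches one of the basis cases of lines (1) through (4).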
3.6.6 Analysis of the Karatsuba-Ofman Algorithm
We can show that the dominant cost of the algorithm of Fig. 3.30 is the three
half-sized multiplications. Let T(n) be the running time of this function on two
polynomials of length n. For the basis, where n = 1, one of the basis cases of
lines (3) or (4) applies. The running time is thus some constant, say T(1) = a.
For the induction, let n > 1. Then the inductive case starting at line (5)
applies. The following is a list of the running times for each of steps (6) through
(15):
6: Proportional to n.
7: Proportional to n.
8: Constant.
9: Proportional to n.
10: Proportional to n.
11: T(n/2).
12: T(n/2).
13: A term proportional to n for the calls to padd plus T(n/2) for the call to
komult.
14: Proportional to n.
15: Proportional to n.
The sum of the times is thus 3T(n/2) plus a term that is proportional to n. We
may write the recurrence equation as:
$$T(1) = a$$
$$T(n) = 3T(n/2) + bn$$
The solution to this equation is
$$T(n) = (a + 2b)n^{\log_2 3} - 2bn$$
as you may check by substitution in both equations. Thus, the running time of
the Karatsuba-Ofman algorithm is proportional to $n^{\log_2 3}$, or $n^{1.59}$, significantly
less than the $n^2$ of more straightforward algorithms.
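The substitution can be carried out as follows. The only fact needed beyond the recurrence is that $2^{\log_2 3} = 3$, so that $3\,(n/2)^{\log_2 3} = n^{\log_2 3}$.

```latex
\begin{align*}
T(1) &= (a + 2b) \cdot 1 - 2b = a, \\
3\,T(n/2) + bn &= 3\left[(a + 2b)\left(\frac{n}{2}\right)^{\log_2 3} - 2b \cdot \frac{n}{2}\right] + bn \\
               &= (a + 2b)\,n^{\log_2 3} - 3bn + bn
                = (a + 2b)\,n^{\log_2 3} - 2bn = T(n).
\end{align*}
```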
3.6.7 Exercises for Section 3.6
Exercise 3.6.1: Write a function genPoly(n) that generates a polynomial of
length n (degree n — 1), all of whose coefficients are 1.0. Measure the running
time of the straightforward algorithm pmult of Fig. 3.27 and the algorithm of
Fig. 3.30 with its attendant auxiliaries from Figs. 3.27 and 3.28. The code can
be downloaded from the book's web site. Consider polynomials of length n
ranging from 1 to about 1000, generated by genPoly. For what value of n does
the running time of komult drop below the running time of pmult?
Exercise 3.6.2: One problem with komult is that for small n it wastes time,
compared with the straightforward approach to polynomial multiplication. Rewrite
komult so it calls pmult to multiply polynomials whose length is below
some limit. Experiment with running times as in Exercise 3.6.1 to find the limit
below which it makes sense to use pmult, and adjust your function accordingly.
! Exercise 3.6.3: Write a function to evaluate a polynomial at a given real
value a. That is, define a function eval (P,a) that takes a list (polynomial) P
and a real number a, and computes P(a).
*! Exercise 3.6.4: Given a list of reals $[a_1, a_2, \ldots, a_n]$, find the polynomial whose
roots are $a_1, a_2, \ldots, a_n$. Hint: Note that this polynomial is the product of
$(x - a_i)$ for $i = 1, 2, \ldots, n$.
!! Exercise 3.6.5: We can represent polynomials in two variables, x and y, by
a list of lists. Think of such a polynomial as a polynomial in x, whose coefficients,
instead of being real numbers, are polynomials in y. Represent these
polynomials in y by lists as we did in Section 3.6.1. Then use the lists representing
these polynomials as the elements of a list representing the polynomial
in x. For example, the polynomial $1 + 2xy + 3xy^2 + 4x^3y$ can be written as
$1 + (2y + 3y^2)x + (4y)x^3$. The polynomial $2y + 3y^2$ is represented by the list
[0.0, 2.0, 3.0] and the polynomial 4y is represented by [0.0, 4.0]. Thus,
the polynomial in x can be written
[[1.0], [0.0,2.0,3.0], [], [0.0,4.0]]
Write functions to add polynomials in two variables, scalar-multiply such poly-
nomials, and polynomial-multiply these polynomials. You need not use a
“Karatsuba-Ofman” type trick to improve efficiency.
Chapter 4
Input and Output
In this chapter we shall learn how to read and write information from files. ML
offers us a number of tools, ranging from a simple function that prints strings
to the standard output to more complex functions that perform UNIX-style
input/output and more.
Our study of input and output forces us to learn a number of additional
features of ML. In this chapter we shall find discussions of the following topics
in addition to input/output:
1. The unit type, which is similar to “void” in C.
2. The type constructor option, which allows us to express values that are
either present or absent.
3. Lists of statements.
4. A way to access functions that are in the standard basis of ML but not
in the top-level environment.
4.1 Simple Output
ML provides a print operator that writes a character string to the standard
output. This function is simple to use and can do most of what we need for
typical output operations. Thus, it is a good point to begin our study of I/O.
4.1.1 The Print Function
The expression print (x) causes the value of a character string x to be printed
on the “standard output,” which would be the terminal unless you have called
SML/NJ with another standard output designated (via the UNIX > operator).
The value returned by the print function is the unit (). This symbol, which
we have not seen before, is the lone value of the type unit, which we have also
not encountered previously.
One purpose of the unit is to serve as the value returned by a function, such
as print, that does its work by a side-effect. Notice that unlike all expressions of
ML encountered so far, print has an effect on more than the ML environment;
it changes the external world: either what appears on the user’s terminal or the
contents of the file that is the current standard output.
• Note that print does not return the value printed as its own value.
Example 4.1: In Fig. 4.1 is a function called testZero, which tests whether or
not its integer argument is 0 and prints one of the strings "zero" or "not zero"
as appropriate. Notice that ML responds by saying that testZero is a function
from the type integer to the type unit, because the unit is the “value” produced
by the print function. The fact that a string is produced as a side-effect is not
reflected in the type of the function.
fun testZero(0) = print("zero\n")
| testZero(_) = print("not zero\n");
val testZero = fn : int -> unit
testZero(2);
not zero
val it = () : unit
Figure 4.1: A function that uses the print function
We also see in Fig. 4.1 a use of testZero(2) and ML’s response. We first see
the printed response not zero on the standard output. Following immediately
is the normal response of ML after evaluating a function:
val it = () : unit
Notice that the value of the expression testZero(2) is the unit (). That is
what print returns, and therefore that is what testZero returns.
Remember from Section 2.1.1 that in strings we can use the sequence \n
to represent a newline. Had we omitted printing this character in the
print statements of Fig. 4.1, the output would have run together, as
not zeroval it = () : unit
The Type unit
The unit type is another of the basic types of the ML system, like int. In
a sense it is like the C type void. However, while there is no value for a
“void” in C, the ML unit type has exactly one value, ().
The unit appears in a surprising place in ML: as the argument of
a seemingly zero-argument function. Thus, if we were to write a zero-argument
function that when called returns the string hello world, it
would appear as follows:
fun hello() = "hello world";
val hello = fn : unit -> string
That is, function hello has an argument after all: the unit. It would be
called by applying it to the unit, as:
hello();
val it = "hello world" : string
4.1.2 Printing Nonstring Values
It is possible to print values other than strings if we first convert the value to
a string. For example, we learned in Section 2.2.4 that the function str will
change a character into a string of length 1. Thus, we could write
val c = #"a";
print(str(c));
to print an a on the standard output.
However, printing characters as strings is not very interesting. More often,
we would like to print integers or real numbers, or perhaps values of other
types. There is a function toString associated with integers, reals, and some
other types that converts values of those types to appropriate character strings.
The identifier toString denotes one of several rather different functions, and in
order to tell which one is meant, it is necessary to prefix the identifier toString
by the name of the “structure” to which it belongs, and a dot. We shall take
up structures, both user-defined structures and structures provided by ML,
in Section 8.2. However, roughly, for each type there is a structure with the
same name but with the first letter capitalized. For example, the structures Int,
Real, and Bool are associated with the types int, real, and bool, respectively.
Example 4.2: Here is an example of printing the value of a real number as a
string:
val x = 1.0E50;
val x = 1e50 : real
print(Real.toString(x));
1e50val it = () : unit
Notice that ML selects a representation for our chosen real number that is
equivalent to, but not exactly the way we typed it. Also observe that there is
no newline character printed, so the standard “val =” response of ML follows
the printed value on the same line, with no intervening space.
Similarly, we can print integers or booleans by
print (Int.toString(123)) ;
123val it = () : unit
print (Bool.toString(true));
trueval it = () : unit
In each of these cases, we have used the integer or boolean value as an argument
directly, rather than “assigning” the value to a variable. In practice, there would
be little point in writing an integer or boolean and converting it to a string,
rather than printing the resulting string directly, provided we knew the value
to be printed in advance. We would only use these conversion functions if we
wished to print some value that was calculated at run time. □
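As a small illustration of printing a value that is calculated at run time, the following sketch builds one line of output from an integer and its square. The names squareLine and printSquare are hypothetical, chosen only for this illustration; ^ is the string concatenation operator.

```sml
(* Build the text first, so the conversion step is separate from printing. *)
fun squareLine(n) =
    Int.toString(n) ^ " squared is " ^ Int.toString(n * n) ^ "\n";

(* Print the line; like print itself, this function returns the unit. *)
fun printSquare(n) = print(squareLine(n));
```

A call such as printSquare(12) prints the line 12 squared is 144 before ML reports the unit as the value of the call.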
4.1.3 “Statement” Lists
It is often useful to execute a sequence of two or more “statements” with
side-effects, such as print expressions.¹ The syntax for doing so in ML is
(<first expression> ; ... ; <last expression>)
That is, a list of expressions is separated by semicolons and surrounded by
parentheses. The construct is like begin ... end in Pascal or { ... } in C.
Each expression is evaluated in turn. However, the list of expressions is itself
an expression and produces a value. The value produced by a list of expressions
is the value produced by the last of the expressions.
¹Technically, there is no such thing as a “statement” in ML, only expressions. However,
expressions that cause side-effects behave much like statements of ordinary languages. We
shall informally refer to them as statements.
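To see that a list of expressions is itself an expression with a value, consider the following sketch. The function name announce is hypothetical. The first two expressions are evaluated only for their printing side-effects, and the value of the whole parenthesized list is the value of its last expression.

```sml
(* The two print expressions run in order; the value of the
   parenthesized list is the value of n + 1. *)
fun announce(n) = (
    print("argument: ");
    print(Int.toString(n) ^ "\n");
    n + 1
);
```

Because the last expression is an integer, ML should report the type of announce as a function from int to int, even though the function also prints.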
Example 4.3: The function printList in Fig. 4.2 prints each element of an
integer list in order and in a vertical column. Line (1) handles the case where
the list is empty. Nothing is printed, and the unit is returned. Note that we do
not care what printList returns since, unlike most ML functions, printList
does its job by its side-effects, not by its returned value. However, like all
functions, printList must return one type of value, unit in this case.
(1) fun printList(nil) = ()
(2) | printList(x::xs) = (
(3) print(Int.toString(x));
(4) print("\n");
(5) printList(xs)
);
val printList = fn : int list -> unit
printList([1,2,3]);
1
2
3
val it = () : unit
Figure 4.2: Printing a list as a side-effect
Lines (2) through (5) handle the case where the list is not empty. We see,
beginning at line (3), a sequence of three expressions, each of which causes a
printing side-effect. Line (3) prints the head element of the list. Since the
function Int.toString is applied to the head element x, this element must be
an integer. Line (4) prints the newline character, thus skipping to the next line
of output. Line (5) is a recursive call to printList on the tail of the list. That
expression causes the rest of the list to be printed and returns the unit. The
unit thus becomes the value of the list of statements and the value returned
by the function. The ML response confirms that printList is a function that
takes an integer list as argument and returns a unit.
In Fig. 4.2, this function is used to print the list [1,2,3]. The initial call
to printList([1,2,3]) prints 1, then prints a newline (i.e., it skips to the
next output line), and last calls printList recursively on the tail [2,3]. That
call results in the printing of the elements 2 and 3 on separate lines. Finally,
since printList ([1,2,3]) is an expression, ML responds with the value of this
expression, which is the unit.
Opening Structures
It is possible to give an identifier such as toString that is defined within
one or more structures its meaning within that structure, without requiring
that the structure name be prefixed. To do so, we open the structure, for
example,
open Int;
to open the structure for integers. ML responds with a long list of functions
that are available in this structure. The names of these functions are now
part of the ML environment and can be referred to without prefixing Int.
to those names. For example, print (toString (123)) is now legal. How-
ever, the toString functions from other structures still must be referred
to with their structure name as a prefix; e.g., print (toString (12.34))
is illegal.
Figure 1.1, which we repeat here as Fig. 4.3, suggests the situation
regarding structures. The entire content of all these structures is called
the standard basis. The top-level environment of ML, which is available
when the ML compiler is invoked, gives the user certain functions from
various structures, such as the operators discussed in Sections 2.1 and 2.2.
Other functions of the standard basis are available “below the surface,” if
we refer to these functions by a long identifier that includes the structure
name. However, an entire structure can be “raised” to the surface by the
open command.
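The following sketch shows the alternative to opening a structure: each toString is reached through its long identifier, so the three conversion functions can be used side by side. The function name describe is hypothetical.

```sml
(* Each toString is selected by its structure name, so all three
   can be used together without opening any structure. *)
fun describe(i, r, b) =
    Int.toString(i) ^ " " ^ Real.toString(r) ^ " " ^ Bool.toString(b);
```

Here the structure prefixes also tell ML the types of the three arguments: i must be an integer, r a real, and b a boolean.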
4.1.4 Statement Lists Versus Let-Expressions
You may have noticed a similarity between the list of statements mentioned
above and the let-expression from Section 3.4. Each involves a sequence of steps
that are evaluated or executed in turn, and the result is the value returned by
the last expression. However, different kinds of expressions are allowed between
the let and in keywords than are allowed in statement lists or are allowed
between the in and end.
Between let and in we must find declarations such as val-declarations,
function definitions, and a few more kinds of declarations that we shall learn
later (see Fig. 9.19 for a summary of declarations). Intuitively declarations are
the kinds of expressions that evoke a response other than val it = ... when you
type them in the top-level environment. For example, they may result in ML
telling the value of some identifier other than it.
On the other hand, the “ordinary” expressions that can appear in an expression
list (or after the in of a let-expression) are characterized by an ML
response in which the identifier it has its value told. See Fig. 9.13 for the complete
structure of expressions. Another way to look at the distinction is that
let-expressions make significant alterations to the environment through their
declarations, while expressions leave the environment unchanged. Note that
side-effects such as printing or reading input do not change the environment as
far as ML is concerned, although they do change the state of the surrounding
file system.
[Figure: the top-level environment drawn above the structures Int, Real, String, ...]
Figure 4.3: Parts of structures of the standard basis appear in the top-level
environment; the remainders are also available
• Although we have not yet seen an example, an expression list can appear
between the in and end in a let-expression, in place of a single expression.
The surrounding parentheses are unnecessary in such a list.
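A minimal sketch of this arrangement follows; the function name printSum is hypothetical. The val-declaration sits between let and in, while the expression list between in and end is written without surrounding parentheses.

```sml
fun printSum(x, y) =
    let
        val sum = x + y      (* declarations go between let and in *)
    in
        print(Int.toString(x) ^ " + " ^ Int.toString(y) ^ " = ");
        print(Int.toString(sum) ^ "\n");
        sum                  (* the value of the whole let-expression *)
    end;
```

A call such as printSum(2,3) prints 2 + 3 = 5 and then returns 5 as its value.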
4.1.5 Exercises for Section 4.1
Exercise 4.1.1: If we were to change line (1) of Fig. 4.2 to
fun printList(nil) = 0
and leave the rest of printList as it is, would there be a type error in the
function printList?
Exercise 4.1.2: Write the function comb, computing the binomial coefficient
(n choose m), in such a way that
when we call comb(n,m) it prints n and m before printing the result. Print out
suitable words so n, m, and (n choose m) are clearly distinguishable from one another.
Exercise 4.1.3: Write a function that, given integer n, prints 2^n X’s, using n
recursive calls. Hint: Compute the desired string recursively before printing it.
Exercise 4.1.4: Write a function that, given n, prints 2^n X’s using only log2 n
recursive calls.
Should We Open Structures?
There is a school of thought that one should never open a structure. One
reason is that it is hard to remember exactly which functions are brought
into the current environment when a given structure is opened. Should
we open a structure, we might absent-mindedly replace some function we
needed by a function of the same name in the structure, with unpredictable
results. However, for succinctness, in the balance of this chapter we shall
assume that the TextIO structure is open.
4.2 Reading Input From a File
ML's approach to reading and writing files will be familiar if you have used
UNIX file reading and writing commands from C or another language. In this
section we shall cover file reading, while writing files is treated in Section 4.3.
4.2.1 Instreams
The ML type instream is a source of characters that may be read by certain
functions that we are about to learn. If you are familiar with file input/output
in UNIX, you will see an analogy between an instream and the file-ID of a file
that is opened for reading. The necessary functions are found in a structure
called TextIO. If we wish, we may issue the command:
open TextIO;
ML responds with a list of the functions available in the TextIO structure. We
may now refer to these functions without prefixing them with TextIO and a
dot. Had we not opened the TextIO structure, we could still use its functions
if we prefixed them by TextIO and the dot.
Our first step in reading a file is to obtain a value of type instream that is
associated with the file, opened for reading. We issue the command:
openIn("<file name>")
This expression causes the file named by the quoted string to be opened for
reading. It returns a “token,” or internal value, of type instream. This token
must be used to read from the file in the future.
Example 4.4:
val infile = openIn("foo");
val infile = - : instream
opens the file named foo in the directory in which the ML program is running.
To read from the file subsequently, we refer to it by the identifier infile, not
foo.
The argument to openIn may be any legal path name. For instance, the
expression
val infile2 = openIn("/usr/spool/mail/ullman");
val infile2 = - : instream
opens the author’s mail file for reading. In the future, this file must be referred
to by the identifier infile2. □
4.2.2 Reading Characters From a File
Once we have opened the file, we can read characters from it. There are several
different functions that can read different amounts of the opened file. Let us
begin by using a simple approach involving the following two functions:
1. endOfStream(<file>) is a predicate (function that returns a boolean)
that tells whether we are at the end of the file. The value of <file>
must be an instream, that is, a token returned by openIn. Function
endOfStream returns true if this instream currently has no input waiting.
If the instream refers to a standard file, then the end-of-file has been
reached. However, if the instream is a special file, such as a terminal,
then endOfStream might return true at one time and later return false
when more input had been typed at the terminal.
2. inputN(<file>,n) reads the next n characters from the file named <file>.
What is read is returned as a string. Again, <file> is an instream or
internal token used to designate an opened file. If there are fewer than
n characters remaining in the file, then only what remains is read, and
fewer than n characters are returned. However, if the “file” is actually
a source like an input terminal, then inputN will wait until n characters
appear or an explicit end of file is seen. There will be an indefinite wait
if these characters never arrive.
Example 4.5: Figure 4.4 shows a function readList that opens a file, reads it
character-by-character, and returns the list whose elements are the characters
of the file, in order. Each element is a string of length 1.
Line (2) says that if we have reached the end of the file, then we return the
empty list. Line (3) handles the case where the file is not empty. We use the
inputN function to read a string of length one from the file, and this string
becomes the head of the list being formed. The tail of the list is constructed
by a recursive call to the function readList. ML deduces that infile must
be of type instream, because function inputN requires that type for its first
argument.
(1) fun readList(infile) =
(2) if endOfStream(infile) then nil
(3) else inputN(infile,1) :: readList(infile);
val readList = fn : instream -> string list
(4) readList(openIn("test"));
val it = ["1","2","\n","a","b","\n"] : string list
Figure 4.4: Reading a file and turning it into a list of characters
We also see in line (4) of Fig. 4.4 the application of readList to a file
named test that holds the six characters shown in Fig. 4.5. Note that two of
the characters in the file are newline characters, one after each line.
12
ab
Figure 4.5: The file test
In the expression readList (openIn("test")) of line (4), openIn opens
the file test and produces a token of type instream representing this file. We
never see the value of this token, but it immediately becomes the argument of
readList; that is, the parameter infile of readList gets this instream as its
value for the call.
Finally, we see the response of ML to this expression; it ascribes to identifier
it the list of characters of the file test. Note that we have, in effect, “dropped
the list on the floor.” In a more realistic example, we would pass the list as an
argument to another function, which would perform some useful computation
on the list. □
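The same pattern works for blocks longer than one character. The sketch below, with the hypothetical name readBlocks, reads n characters at a time and so illustrates the case where fewer than n characters remain. It uses the long identifiers TextIO.endOfStream and TextIO.inputN so that it does not depend on TextIO being open, and it assumes n is positive.

```sml
(* Read the rest of the instream as a list of strings of length n;
   only the last string may be shorter than n. Assumes n > 0. *)
fun readBlocks(infile, n) =
    if TextIO.endOfStream(infile) then nil
    else TextIO.inputN(infile, n) :: readBlocks(infile, n);
```

Applied to the file test of Fig. 4.5 with n = 4, this function should return a first string holding the four characters 12, newline, and a, followed by a second string holding only the two characters that remain.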
4.2.3. Reading Lines of a File
The function inputLine(<file>) is applied to an instream. It returns a string
consisting of the characters on the first unread line of the instream, that is, all
characters up to and including the next newline character. If there are one or
more characters remaining on the instream, but there is no newline character,
then the characters up to the end of file are returned with a newline character
appended.
Example 4.6: Suppose we open the file of Fig. 4.5 with:
val infile = openIn("test");
In Fig. 4.6 are three consecutive uses of the inputLine function; in each case
the string returned is “thrown on the floor.”
inputLine(infile) ;
val it = "12\n" : string
inputLine(infile) ;
val it = "ab\n" : string
inputLine(infile) ;
val it = "" : string
Figure 4.6: Uses of inputLine
Note that the two lines of the file are returned by the first two calls, and
since the end of file has then been reached, the third call returns the empty
string, not a newline. However, had the final newline been omitted, and the file
test composed of the five characters 12\nab, the responses would have been
exactly the same as Fig. 4.6, including the newline character following b in the
second use of inputLine. □
4.2.4 Reading Complete Files
We can read an entire text file with the function input(<file>). Again, the
file is an instream. The result of this call is a character string consisting of all
the remaining characters of the instream.
Example 4.7: If we open the file from Fig. 4.5 as we did in Example 4.6, then
val s = input(infile);
binds s to the string "12\nab\n". Among other possibilities, we could apply
the explode function to s, producing a list of characters. We could then process
this list character by character in a variety of ways. □
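As one way of processing the exploded list, the sketch below counts the digits in a file. The names countDigits and digitsInFile are hypothetical. The sketch calls TextIO.inputAll, which the current Standard ML Basis Library provides for reading all remaining characters of an instream; that choice is an assumption about the system in use, and whether input behaves the same way is something to check there.

```sml
(* Count the digit characters in a list of characters. *)
fun countDigits(nil) = 0
  | countDigits(c::cs) =
        if Char.isDigit(c) then 1 + countDigits(cs)
        else countDigits(cs);

(* Read all remaining characters, then process them as a list.
   TextIO.inputAll is an assumption about the system in use. *)
fun digitsInFile(infile) = countDigits(explode(TextIO.inputAll(infile)));
```

For the file test of Fig. 4.5, digitsInFile should return 2, since only the characters 1 and 2 are digits.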
4.2.5 Reading a Single Character
Yet another function to read text files is input1(<file>). This function reads
from the instream <file> a single character. However, the value is returned in
a form we have not seen before, involving the type constructor option. That
is, the type of the value returned by input1 is char option, whose values are
either of the form
1. SOME c, where c is a character, or
2. NONE.
If SOME c is returned by input1, then c is the character read. If NONE is returned,
then no character remained on the instream.
Example 4.8: Suppose infile is an instream opened for reading the file test
of Fig. 4.5. Here is the first call to input1 on this instream.
input1(infile);
val it = SOME #"1" : char option
Notice that the first character of the file is returned as an option; its type is
char option. The next five calls return SOME #"2", SOME #"\n", SOME #"a",
SOME #"b", and SOME #"\n". The seventh and subsequent calls return NONE.
□
The fact that the character is returned as an option is actually convenient,
since it lets us gracefully handle the situation where there are no more characters
on the instream. We use SOME and NONE in patterns, thereby distinguishing the
end of file without having to use endOfStream to test the end explicitly.
Example 4.9: In Fig. 4.7 we see another way to read a file and turn it into a
list of characters. The work is done by the auxiliary function makeList1, which
takes an instream called infile and a character option and returns the list
consisting of the character in the option (if there is one) and all the characters
remaining on the instream.
fun makeList1(infile, NONE) = nil
| makeList1(infile, SOME c) =
c :: makeList1(infile, input1(infile));
val makeList1 = fn : instream * char option -> char list
fun makeList(infile) = makeList1(infile, input1(infile));
val makeList = fn : instream -> char list
Figure 4.7: Converting a file into a list of characters
The first line of makeList1 handles the case where the option is NONE. This
case occurs when there were no more characters on the instream when we tried
previously to read the instream using input1. Since we have no more characters
available, we return the empty list.
Some Operators for Options
There are two useful operators on options. Given a value x whose type is
T option for some type T, we can say:
1. isSome(x) to get a boolean value that is false if the value of x is
NONE and true if the value of x is SOME y for some value y.
2. valOf(x) returns y if x is of the form SOME y. If x is NONE, then a
runtime error occurs.
For instance, we could write the function makeList1 of Fig. 4.7 as:
fun makeList1(infile, c) =
if isSome(c) then
valOf(c) :: makeList1(infile, input1(infile))
else nil;
The remainder of makeList1 handles the case where we have successfully
read a character c from the instream, and SOME c is the second argument of
makeList1. We extract the character c itself from the pattern SOME c and
make this character the head of the list being formed. The tail of the list is
what we get by applying input1 to get a character option from the instream
and recursively calling makeList1 with the instream and this character option
as arguments.
The function makeList, also defined in Fig. 4.7, simply applies input1 to its
instream and calls makeList1 with the instream and this first character option
as arguments. Function makeList1 will read character after character, until
NONE is returned by input1. □
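The same two-function arrangement can compute a summary of the file instead of a list. The sketch below counts the newline characters on an instream; the names countFrom and countLines are hypothetical, and TextIO.input1 is written with its structure name so the sketch does not depend on TextIO being open.

```sml
(* countFrom takes the instream and the option most recently read. *)
fun countFrom(infile, NONE) = 0
  | countFrom(infile, SOME #"\n") =
        1 + countFrom(infile, TextIO.input1(infile))
  | countFrom(infile, SOME _) =
        countFrom(infile, TextIO.input1(infile));

(* Read the first character option and hand it to countFrom. *)
fun countLines(infile) = countFrom(infile, TextIO.input1(infile));
```

The three patterns distinguish the end of file, a newline, and any other character, so no call to endOfStream is needed. For the file test of Fig. 4.5 the result should be 2.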
4.2.6 Lookahead on the Input
The function lookahead(<file>) is similar to input1, but lookahead does not
consume the character read from the input. It returns a character option, just
as input1 does.
Example 4.10: Let us again assume that infile is an instream opened for
reading the file of Fig. 4.5. Here are the responses to a use of lookahead followed
by a use of input1:
lookahead(infile);
val it = SOME #"1" : char option
A Comparison Between Options and Lists
The type constructors option and list have certain similarities. In each
case, there are two identifiers, called value constructors, that are used to
build values of these types. The value constructor NONE is a value by itself,
just as nil is a value by itself. And just as the type of nil is 'a list (a
list of any type), the type of NONE is 'a option (an option of any type).
Each has a second value constructor: :: in the case of lists and SOME
in the case of options. However, :: is a binary, infix operator and can
be used recursively to create lists of arbitrary length. SOME is a unary,
prefix operator and may not be used recursively. That is, while a value
like 1::nil is of type int list, value SOME (SOME 1) is not of type
int option. Rather, it is a value of type int option option.
We shall learn more about type constructors like these in Section 6.2,
when we take up the matter of datatypes. Both list and option are
actually examples of datatypes that are provided by the ML system, rather
than being defined by the user, as most datatypes are.
input1(infile) ;
val it = SOME #"1" : char option
Notice that while lookahead reports the first character on the instream, that
character is not read from the instream. Rather, it remains to be consumed by
the later call to input1. A following call to either lookahead or input1 would
return the second character, SOME #"2". Of course lookahead would leave 2 on
the instream; input1 would not.
The function canInput(f, i) is a predicate that returns true if there are
at least i characters currently available on the instream f. For instance,
canInput(infile,1)
returns false exactly when lookahead (infile) returns NONE.
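A typical use of lookahead is to decide whether to consume a character before committing to reading it. The following sketch skips blanks at the front of an instream and stops at the first character that is not a blank, leaving that character unread. The names skipIf and skipBlanks are hypothetical, and the two functions are mutually recursive, so they are joined by the keyword and.

```sml
(* skipIf inspects the lookahead result; a blank is consumed with
   input1 and skipping continues, anything else is left in place. *)
fun skipIf(infile, SOME #" ") =
        (TextIO.input1(infile); skipBlanks(infile))
  | skipIf(infile, _) = ()
and skipBlanks(infile) = skipIf(infile, TextIO.lookahead(infile));
```

After skipBlanks returns, the next call to input1 on the same instream yields the first nonblank character, or NONE if the instream held only blanks.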
4.2.7 Closing Instreams
We may close a file that has been opened for reading with the command
closeIn(<file>)
Here, the file argument is an instream that was returned by openIn. It is
generally not essential that files be closed. The ML system will handle files
correctly when the program terminates.
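Even though closing is seldom essential, a function that opens a file for its own use can close it before returning. The sketch below, with the hypothetical name firstChars, opens a file, reads at most n characters, closes the instream, and returns what was read. It uses a let-expression with an expression list between in and end.

```sml
(* Return up to the first n characters of the named file. *)
fun firstChars(name, n) =
    let
        val infile = TextIO.openIn(name)
        val s = TextIO.inputN(infile, n)
    in
        TextIO.closeIn(infile);   (* close before returning the string *)
        s
    end;
```

Because the string is bound to s before the instream is closed, the value returned does not depend on the instream still being open.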
4.2.8 Exercises for Section 4.2
Exercise 4.2.1: Write expressions that will do the following.
* a) Open file zap for reading.
* b) Close the file whose instream is in1.
c) Read 5 characters from the instream in2.
* d) Read a line of text from the instream in3.
e) Read the entire file from instream in4.
f) Find the first character waiting on the standard input without consuming
it.
*! g) Find how many characters are presently waiting on the standard input.
Exercise 4.2.2: Suppose we have a file with the characters
abc
de
f
Suppose the file has been opened for reading and infile is a variable whose
value is the instream for this file. Tell what happens if the following commands
are executed repeatedly.
a) val x = input(infile);
b) val x = input1(infile);
*c) val x = inputN(infile,2);
d) val x = inputN(infile,5);
*e) val x = inputLine(infile);
f) val x = lookahead(infile);
Exercise 4.2.3: Give the types of the following expressions:
* a) SOME ()
b) SOME 123
* c) SOME NONE
d) fun f() = SOME true;
*e) fun f(NONE) = 0 | f(SOME i) = i;
Exercise 4.2.4: Read a file of characters, treating it as a sequence of words,
which are sequences of consecutive non-white-space characters. Each word is
followed by either a single white space character or the end-of-file, so two or
more consecutive white spaces indicate there is an empty word between them.
Return a list of the words in the file.
Exercise 4.2.5: Design the following calendar-printing function. Take as in-
put a month, the day of the first of that month, and the number of days in
the month. Months and days are abbreviated by their first three letters. The
month, day, and number of days are each separated by a single white-space
character. For example, a request to print the calendar for a September in
which the first of the month is on a Thursday would be
Sep Thu 30
Print the calendar as:
1. A row with the month (full name) indented by three tabs.
2. A blank row.
3. A row with the names of the days (three-letter abbreviations) separated
by tabs.
4. As many rows as necessary, with the days printed in the proper columns.
For example, Fig. 4.8 shows the calendar desired for September when Sept. 1
falls on a Thursday.
                        September

Sun     Mon     Tue     Wed     Thu     Fri     Sat
                                1       2       3
4       5       6       7       8       9       10
11      12      13      14      15      16      17
18      19      20      21      22      23      24
25      26      27      28      29      30
Figure 4.8: Example calendar page
4.3 Output to Files
We saw in Section 4.1 how to print output on the standard output file. In this
section we shall learn more about output, including the type outstream, which
is the analog of an instream. We shall also study commands to write to any of
several outstreams.
4.3.1 Outstreams
We met the type instream in Section 4.2.1. Normally, an instream represents a
file of characters that has been opened for input. We may imagine an instream
to be represented by the token (normally an integer) used by the operating
system to refer to that file. However, we are not allowed by ML to see this
token or even to compare it to another instream value.
Similarly, there is a type outstream that normally represents a file of char-
acters that has been opened for output. We may also think of an outstream as
the internal token representing this file. We create an outstream when we open
a file for writing. While there is only one reading mode — read a file from the
beginning — there are two output modes:
1. Function openOut(<file name>) opens a file for writing. The file is first made
empty, and a token of type outstream is returned.
2. Function openAppend(<file name>) opens a file for appending. This function
does not empty the file; it leaves the file as it is. An outstream is returned,
and future output operations on this outstream are added to the end of
the file.
These functions, like all the functions introduced in this section, are found
in the structure TextIO. Thus, we must either open TextIO or prepend TextIO
and a dot to the names of these functions. We continue to assume that TextIO
is opened.
Example 4.11: Here is an example of a file-opening statement.
val outfile = openOut("/u/ullman/foo");
val outfile = - : outstream
Notice that the value of outfile is not shown; it is represented by a dash,
because it is an internal token whose value we are not allowed to know. Its
type is identified as outstream.
The effect of this openOut statement is to empty the file /u/ullman/foo.
An outstream is returned and bound to the identifier outfile. In the future,
we can write characters onto the end of file /u/ullman/foo by referring to
outfile. □
4.3.2 Closing Outstreams
After writing to an outstream f, we can close the file by closeOut (f). Here
f is the outstream token that we received from openOut or openAppend when
we opened the file. Any characters waiting in the buffer for the file are written,
and the file is closed to further writing. Should we later try to write to an
outstream that has been closed (or try to read from a closed instream) we get
an error condition called an “exception” (see Section 5.2).
Flushing the Output
The function flushOut(<file>) “flushes” the outstream to which it is
applied. We would normally use flushOut if the outstream were a terminal
or other device whose output is buffered. This command assures that
any characters waiting in the buffer are written at the time flushOut is
executed, even if we do not immediately close the outstream.
Example 4.12: We close the file /u/ullman/foo mentioned in Example 4.11
with closeOut(outfile). We refer to the file by outfile because that identifier
was bound to the outstream for the file /u/ullman/foo in the call to
openOut that created outstream outfile. □
4.3.3 The output Command
Function output(f,s) appends the string s to the end of the outstream f.
Unlike input, which has a variety of functions that obtain various prefixes of an
instream, essentially all output is done either with the output function or the
print function described in Section 4.1.
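As a sketch of output applied to an arbitrary outstream, the function below is the analog of printList from Fig. 4.2, except that the integers go to an outstream supplied by the caller rather than to the standard output. The name writeInts is hypothetical.

```sml
(* Write each integer of the list on its own line of the outstream. *)
fun writeInts(outfile, nil) = ()
  | writeInts(outfile, x::xs) = (
        TextIO.output(outfile, Int.toString(x) ^ "\n");
        writeInts(outfile, xs)
    );
```

The caller is responsible for opening the outstream beforehand and for closing it afterward, so the same function can be used with a file opened for writing or for appending.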
Example 4.13: Let us consider an example using function output to the
standard output stdOut (see the box on “Standard Input and Output”). In
Fig. 4.9 we see a version of the comb function that with each call prints a line
of X’s of length equal to the value of n in that call. The output helps us picture
the sequence of calls made by an initial call to comb(n,m).
Lines (1) and (2) are the function put(n), which prints n X’s and then a
newline on the standard output. We see in line (1) the basis case, where n = 0
and we print just the newline. In line (2) is the case where n > 0. We print
one X, then recursively print n — 1 more X’s and the newline.
Lines (3) through (6) are the modified function comb. On line (4) n X’s are
printed. Then, lines (5) and (6) do the normal recursion for comb. Finally,
line (7) is an example call to comb(5,2). The original call of line (7) and each
recursive call it makes is reflected by a line of X’s, and at the end is ML's
response to the original call. Note that because put prints to the standard
output, the X’s appear on the terminal mixed with ML’s own responses, which
also go to the terminal. □
• Note that Example 4.13 uses only single characters as output strings in
lines (1) and (2) of Fig. 4.9. In general, strings of any length may be used.
4.3.4 Exercises for Section 4.3
Exercise 4.3.1: Write expressions that will do the following.
(1)  fun put(0) = output(stdOut,"\n")
(2)  |   put(n) = (output(stdOut,"X"); put(n-1));
     val put = fn : int -> unit
(3)  fun comb(n,m) = (
(4)      put(n);
(5)      if m=0 orelse m=n then 1
(6)      else comb(n-1,m) + comb(n-1,m-1)
     );
     val comb = fn : int * int -> int
(7)  comb(5,2);
     XXXXX
     XXXX
     XXX
     XX
     XX
     X
     X
     XXX
     XX
     X
     X
     XX
     XXXX
     XXX
     XX
     X
     X
     XX
     XXX
     val it = 10 : int
Figure 4.9: Printing a profile of calls to comb
Standard Input and Output
UNIX has a notion of a “standard” input, output, and error file for a
process, normally the terminal or window in which the process originates.
Structure TextIO provides names for these three standard files:
1. stdIn is an instream, the standard input.
2. stdOut is an outstream, the standard output.
3. stdErr is an outstream, the standard error file.
None of these instreams and outstreams need to be opened. You may refer
to them, knowing that the surrounding operating system will direct the
characters to or from the appropriate place, such as your terminal.
Standard Input Can Interact With the sml Command
When using standard input, we must be careful how we call ML. If we use
the UNIX command sml … 10, then represent
digits ten and above by their decimal representation surrounded by parentheses.
For example, 570 in base 12 is 3(11)6; that is, 570 = 144 × 3 + 12 × 11 + 6. You
should read from an instream infile and write to outstream outfile.
Exercise 4.3.3: Write a function that, given i, prints 2^i X’s, using i recursive
calls. Hint: Use an auxiliary function that computes the desired string before
the string is printed.
4.4 Case Study: Summing Integers
In this extended example, the problem we shall address is how to read a list of
integers from a designated file and compute their sum. The following restric-
tions are assumed.
1. Integers are positive only.
2. Integers are separated by one or more characters that are not digits.
3. The last integer may or may not be followed by one or more nondigits
before the end of the file is reached.
The entire program appears in Fig. 4.10. In line (1), we open the TextIO
structure so that the file-reading functions will be available to us. Line (2)
defines a symbolic constant END, whose value is ~1. This constant is a convenient
way for the various functions in Fig. 4.10 to signal that they have failed to find
any more integers on the instream and are ready to complete the sum.
Function digit in line (3) checks whether its argument lies between the
characters #"0" and #"9" in lexicographic order. Note that in the ASCII code,
the digits have consecutive codes.
Now, let us divide the task into some components. Our initial sketch of the
program, in a “Pidgin-Pascal” notation, is shown in Fig. 4.11.
Getting the next integer from the file is itself a complex task. First, there
may not be any more integers, since no digits may remain on the file. We shall
handle this situation by producing the integer -1. Note we can do so here
because of the assumption that there are no negative integers. Thus there can
be no ambiguity about whether -1 is a legitimate integer; it cannot be. However,
to make the role of -1 more transparent, we use END, the variable defined on
line (2), in place of -1.
If there is an integer to be found, we can divide the process into two steps:
1. Skipping over characters to find the first digit, and
2. Reading subsequent digits until either a nondigit or the end of file is
encountered, computing the value of the integer as we go.
4.4.1 The Function startInt
The first of these operations is performed by the function startInt shown in
lines (4) through (8) of Fig. 4.10. Function startInt is mutually recursive
with an auxiliary function startInt1 that does most of the work. In line (4),
startInt gets the first character (as a character option) from the input file
and passes that character and the file to startInt1. The latter function first
checks on line (5) if the optional character is not present (i.e., the char option
is NONE). In that case the end of file has been reached and the special constant
END is returned to signal that there are no more integers.
(1)  open TextIO;
(2)  val END = ~1;
(3)  fun digit(c) = c >= #"0" andalso c <= #"9";
(4)  fun startInt(file) = startInt1(file, input1(file))
(5)  and startInt1(file, NONE) = END
(6)  |   startInt1(file, SOME c) =
(7)          if digit(c) then ord(c)-ord(#"0")
(8)          else startInt(file);
(9)  fun finishInt(i,file) =
(10)     if i = END then END
(11)     else finishInt1(i, file, input1(file))
(12) and finishInt1(i, file, NONE) = i
     |   finishInt1(i, file, SOME c) =
(13)         if digit(c) then
(14)             finishInt(10*i+ord(c)-ord(#"0"), file)
(15)         else i;
(16) fun getInt(file) = finishInt(startInt(file), file);
(17) fun sumInts1(file) =
         let
(18)         val i = getInt(file)
         in
(19)         if i = END then 0
(20)         else i + sumInts1(file)
         end;
(21) fun sumInts(filename) = sumInts1(openIn(filename));
Figure 4.10: ML code to read a file and sum all the integers found on that file
Using Boolean Expressions Succinctly
If you are like the author, you are tempted to write a predicate like digit
as
fun digit(c) =
    if c >= #"0" andalso c <= #"9" then true
    else false;
This code is correct but “illiterate.” When we write a function that returns
a boolean, we can use the test as the return value itself.
if there are no more integers on the file then
return 0.
else begin
get an integer from the file;
recursively sum the rest of the file;
return the sum of the first integer and
the rest of the file
end
Figure 4.11: Sketch of integer-summing program
If the character is present, then input1 returns SOME c for some character
c. In that case, at lines (6) through (8) startInt1 first checks if c is a digit. If
so, then the value of that digit is returned at line (7) by subtracting the ASCII
code for 0 from the ASCII code for the digit. The difference is the integer value
of the digit. But if the character c is not a digit, then startInt1 instead ignores
this character and calls startInt to get the next character from the file.
4.4.2 The Function finishInt
Now we have a way to find the first digit of an integer and return the value of
that digit. Next, we need a function finishInt (i, file) that takes an integer
i, which represents the value of digits read so far, and reads as many more
consecutive digits as there are in the instream named file. Eventually the
value of the integer represented by this entire sequence of digits is produced
by finishInt. The key arithmetic point to remember is that if i is the value
of digits read so far, and we read one more digit, say d, then the value of the
integer up to the newly read digit is 10i + d.
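As a small worked illustration of this arithmetic, the sketch below applies the rule "new value = 10 times old value plus digit" across a whole sequence of digits. It is not part of Fig. 4.10; the names step, digitValue, n, and m are hypothetical, and foldl is used only as a compact way to repeat the step.

```sml
(* One step of the rule: i is the value so far, d is the next digit. *)
fun step (d, i) = 10 * i + d;

(* Digits 5, 7, 0 read left to right: 0 -> 5 -> 57 -> 570. *)
val n = foldl step 0 [5, 7, 0];

(* The same rule applied to characters, converting each with ord. *)
fun digitValue c = ord c - ord #"0";
val m = foldl (fn (c, i) => 10 * i + digitValue c) 0 (explode "1998");
```

Here n is 570 and m is 1998; each intermediate value is the integer denoted by the digits consumed so far.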
Completing the integer that was begun by startInt is the task of function
finishInt and its mutually recursive auxiliary function finishInt1. At
Signaling the End of the File
Admittedly, the style we have adopted for signaling that no integers remain
on the file (returning -1 when no integer is found) is fraught with
danger and should be avoided. We shall learn two safer ways to represent
the end of file. One is Exercise 6.2.4, where we consider the use of
datatypes, and another is Exercise 5.2.3, where we consider exceptions and
their handling. Still another way (discussed in Exercise 4.4.1) would be to
return an int option, which would either be NONE if there were no integer
remaining on the input or SOME i, if integer i were found.
line (10) of Fig. 4.10, finishInt tests if we are already at the end of file, be-
cause startInt has returned the special integer END. If so, then finishInt
also returns END to signal that there is no integer and the end of file has been
reached.
If the integer was successfully started by startInt, then finishInt reads
the next character from the instream and calls finishInt1 with this character
and the integer i read so far. If the next character is not there, because the end
of file has now been reached, then at line (12) finishInt1 returns i, the integer
found so far. But if there is another character c on the input, then finishInt1
tests if c is a digit at line (13). If so then on line (14) finishInt1 multiplies
the integer i by 10, and adds the value of the digit read, to get a new integer,
which we may call j. Also on line (14), finishInt1 calls finishInt recursively
with integer j and the same file.
If, on the other hand, the character passed to finishInt1 is not a digit,
then the integer being read is finished. Function finishInt1 returns the integer
i on line (15), and the recursion ends.
4.4.3 The Function getInt
The functions startInt and finishInt are combined as one function getInt
on line (16) of Fig. 4.10 that either reads the next integer from a file or returns
END if the end-of-file is encountered before any digits. If there are one or more
digits in the file, startInt will get the value of the first, and finishInt will
repeatedly multiply its integer argument by 10 and add in the value of the next
digit, eventually returning the value of the entire string of consecutive digits.
This value is returned by getInt.
Should getInt be called when there is no integer left on the file, then
startInt will return END; so will finishInt, and therefore getInt returns
END.
Using Auxiliary Functions
Contrast the approach used in Example 4.9, where we designed a recur-
sive auxiliary function makeList1 that was called once by its primary
function makeList with the approach used in Fig. 4.10, where the auxil-
iary functions startInt1 and finishInt1 were mutually recursive with
their corresponding primary functions. Either style could be used in either
example. For instance, on line (8) of Fig. 4.10 we could have startInt1
call itself with
else startInt1(file, input1(file));
The code would then look more like that of Example 4.9. We would
follow the definition of startInt1 by the definition of startInt, and the
two functions would no longer be mutually recursive. Which approach is
preferable is a matter of taste.
4.4.4 The Function sumInts
Finally, we see in lines (17) through (21) of Fig. 4.10 the functions that do the
summing of integers. The work is really done by sumInts1, which at line (18)
reads an integer off the instream. If the integer is END (i.e., ~1), then we have
reached the end of the file and found no integer, so the sum of integers found
is 0. We handle this case in line (19). If a nonnegative integer is found, then
on line (20) we recursively call sumInts1 to sum the rest of the integers on the
file, and add the integer found first, to produce the correct sum.
The final touch is the function sumInts of line (21). This function takes a
string filename, which is the name of the file whose integers we must sum, and
applies openIn to it. The result is an instream representing the file. This in-
stream is passed to sumInts1 and, through it, to the other functions of Fig. 4.10
that read from the file.
4.4.5 Eager Evaluation
One might wonder whether all the complexity of Fig. 4.10 is really necessary.
For example, we could have converted the file into a string in one step, using
function input. We could then use explode to create a list of characters and
process this list. It might well be more convenient to process the list than to
read characters directly from the input file, as we did.
However, there is a disadvantage to doing so, because of ML's method of
parameter passing, which is call-by-value (recall Section 3.1). That is, the first
thing that would happen if we used input is that the entire file would be turned
into a string. But suppose we had agreed to terminate the list of integers not by
the end of the file, but by some special character like #"e", which could appear
in the middle of the file. In that case, we would not need to read the part
of the file beyond the first #"e". However, should we start by using functions
input and explode to convert the file to a list of characters, we would wind
up reading the entire file whether or not we needed it. This style of computing
values is called eager evaluation. Its opposite, where parts of an argument such
as a list or file are evaluated only as needed, is called lazy evaluation.
ML follows the eager evaluation approach because of the call-by-value se-
mantics associated with its function arguments. However, by careful program-
ming, we can have a measure of lazy evaluation. Notice in particular that the
program of Fig. 4.10 reads one integer at a time, and it never creates a list
of all the integers before it sums them. More to the point, if we had used a
marker like #"e" to indicate the end of the group of integers to be summed,
and suitably modified the program of Fig. 4.10 to look for #"e" instead of the
end-of-file, the program would never even read the file beyond the first #"e".
In truth, it should be pointed out that while ML argument evaluation is
“eager,” there are other aspects of the language design that are “lazy.” For
instance, we mentioned that the second argument of andalso or orelse is
evaluated “lazily,” that is, only if needed.
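The contrast between eager argument evaluation and the lazy second operand of andalso can be observed with a small experiment. The sketch below is an illustration only: it counts evaluations with a reference cell, a feature this chapter has not introduced, and the names evaluated, noisy, and myAnd are hypothetical.

```sml
(* Counter of how many times noisy has been evaluated. *)
val evaluated = ref 0;

(* Returns its argument unchanged, but records that it ran. *)
fun noisy (b : bool) = (evaluated := !evaluated + 1; b);

(* An ordinary function: call-by-value evaluates both arguments first. *)
fun myAnd (x, y) = if x then y else false;

val a = myAnd (noisy false, noisy true);
val countEager = !evaluated;          (* both arguments were evaluated *)

val _ = evaluated := 0;
val b = noisy false andalso noisy true;
val countLazy = !evaluated;           (* the second operand was skipped *)
```

Both a and b are false, but countEager is 2 while countLazy is 1, because andalso never looks at its second operand once the first is false.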
4.4.6 Exercises for Section 4.4
Exercise 4.4.1: Generalize the program of Fig. 4.10 to allow negative integers
on the input. We assume that negative integers are preceded by the minus sign
(-) rather than the tilde (~). Note that we not only have to recognize negative
integers, but we can no longer use -1 as a convenient value to indicate that the
end of file has been reached. Hint: Return an integer option that tells whether
the end of file has been reached.
Exercise 4.4.2: Rewrite makeList and makeList1 of Example 4.9 so the two
functions are mutually recursive.
Exercise 4.4.3: Rewrite functions startInt and startInt1 of Fig. 4.10 so
they are not mutually recursive. Do the same for functions finishInt and
finishInt1.
Chapter 5
More About Functions
This chapter continues the study of ML functions that we began in Chapter 3.
First, we introduce the “match,” a special kind of expression similar to the
pattern-matching diction that we studied in connection with function definitions
in Section 3.3. In fact, we shall see that function definitions using fun are really
a shorthand for a match.
Then we introduce the “exception,” a mechanism that lets us deal with those
functions for which certain values of their argument(s) do not allow the function
to return a sensible value. Then, we discuss two of the things that distinguish
ML from other programming languages: the ease with which one can write
1. Polymorphic functions — functions that accept arguments of different
types — and
2. Higher order functions — functions that take functions as arguments
and/or produce functions as return values.
We conclude the chapter with a discussion of ways to construct new functions
from old, especially the technique known as “Currying,” where we bind some
of the arguments of a function to make a new function.
5.1 Matches and Patterns
Patterns and the matching of patterns to expressions play a central role in ML
programming. In this chapter we look at some additional ways that patterns
are used.
1. We use patterns in matches, which resemble the sequence of patterns
and associated expressions that appear in function declarations using the
keyword fun. In turn, matches are essential components of
(a) Function expressions. These allow us to define functions as values,
by using the keyword fn. An important use of these expressions
is to define anonymous functions that can be used as arguments to
higher-order functions without giving them a name; see Section 5.1.3.
(b) Case expressions. These are similar to the case-statements of Pascal
or the switch-statements of C.
(c) Exception-handlers, which we shall cover in Section 5.2.3. These
allow us to handle error conditions gracefully.
2. We use patterns in val-declarations, which we shall find are really much
more general than the bindings of values to single variables that we have
been using almost exclusively.
5.1.1 Matches
A match expression consists of one or more rules, which are pairs of the form
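<pattern> => <expression>
The rules are separated by vertical bars, so the form of a match is:
<pattern 1> => <expression 1> |
<pattern 2> => <expression 2> |
...
<pattern n> => <expression n>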
Each of the expressions following the =>’s must be of the same type, since any
one of them could become the value of the match.
The match is applied to a value v. We compare each pattern of the match
with v in order, until we find a pattern that matches v, say the ith pattern. This
match of a pattern with v binds values to each of the identifiers in the pattern,
in the manner discussed with regard to functions in Section 3.3.5. Identifiers in
the ith expression are then replaced by their associated values, and the resulting
value of the ith expression becomes the value of the match.
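To make the evaluation rule concrete, here is a sketch of a three-rule match applied to several values. A match cannot stand alone, so it is wrapped with the keyword fn to give something we can apply; the name describe and the particular patterns are hypothetical choices for illustration.

```sml
(* Three rules, tried in order; every right-hand side is a string. *)
val describe = fn
    nil     => "empty" |
    [x]     => "one element: " ^ Int.toString x |
    x::y::_ => "starts with " ^ Int.toString x ^ " and " ^ Int.toString y;

val d0 = describe [];          (* first pattern matches; nothing is bound *)
val d1 = describe [7];         (* second pattern matches, binding x to 7 *)
val d2 = describe [3, 4, 5];   (* third pattern matches, binding x = 3, y = 4 *)
```

In each application the first pattern that fits determines the bindings, and the associated expression, with those bindings substituted, becomes the result.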
5.1.2 Using Matches to Define Functions
An alternative way to define a function f, without using the keyword fun, is
val rec f = fn <match>
The keyword rec, short for “recursive,” is necessary only if the function f is
recursive, that is, if the identifier f appears in one or more expressions of the
match. This keyword informs ML that any uses of f in the match refer to
the function f being defined recursively and not to an undefined or previously
defined variable.
Matches That Don’t Cover All Possibilities
If there exist values that match none of the patterns, then ML will issue
a warning:
Warning: match nonexhaustive
when the match is compiled. If the match is actually applied to a value
that does not match any of the patterns, then the exception Match is raised
and computation halts, unless the program has been designed to “handle”
the error as discussed in Section 5.2.3.
Example 5.1: Another way to write the definition of the function reverse of
Example 3.15 is
val rec reverse = fn
    nil => nil |
    x::xs => reverse(xs) @ [x];
val reverse = fn : 'a list -> 'a list
Here the keyword rec is necessary because reverse appears in the match itself.
In a function definition like
val rec addOne = fn x => x+1;
the keyword rec is legal. However, it is unnecessary, and possibly confusing,
since the match consisting of a single pattern x and expression x+1 does not
mention the name addOne.
As a general rule, any function definition using fun that has the form
fun f(P1) = E1 | f(P2) = E2 | ... | f(Pn) = En;
is a shorthand for the val-declaration
val rec f = fn P1 => E1 | P2 => E2 | ... | Pn => En;
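The correspondence between the two forms can be checked on a small function. The sketch below defines a list-length function both ways; the names len1 and len2 are hypothetical, and the function itself is chosen only because it needs two rules and a recursive call.

```sml
(* Definition using fun, one clause per pattern. *)
fun len1(nil) = 0
  | len1(_::xs) = 1 + len1(xs);

(* The same function as a val-declaration with an explicit match.
   rec is required because len2 appears on a right-hand side. *)
val rec len2 = fn
    nil   => 0 |
    _::xs => 1 + len2(xs);
```

Both declarations should be given the type 'a list -> int and should agree on every argument.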
5.1.3 Anonymous Functions
We can also use a match, preceded by the keyword fn, as an anonymous function
that is used once and thrown away. That is, if M is a match, then (fn M)
is a function without a name. It may be applied to an argument E by writing
(fn M)(E)
Example 5.2: Consider the expression
(fn x => x+1)(3);
val it = 4 : int
Here, we use the simple one-pattern match of function addOne in Example 5.1,
but without giving it a name. It is applied to the argument 3, and produces
the value 4. □
The example above may be an uninteresting use of anonymous functions,
since its effect can be simulated more simply. However, in Section 5.4 we shall
see many places where anonymous functions are useful as arguments of higher-
order functions.
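As a brief preview of that use, the sketch below passes anonymous functions to the built-in list functions map and List.filter. It is a minimal illustration, not an example from the text, and the names incremented and evens are hypothetical.

```sml
(* Add one to every element without naming the function that does it. *)
val incremented = map (fn x => x + 1) [1, 2, 3];

(* Keep only the even elements, again with a throwaway function. *)
val evens = List.filter (fn x => x mod 2 = 0) [1, 2, 3, 4, 5, 6];
```

Here incremented is [2,3,4] and evens is [2,4,6]; in neither case was it necessary to declare a named helper first.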
5.1.4 Case Expressions
The form of a case expression is
case <expression> of <match>
The value of a case expression is found by matching, in order of appearance, each
pattern in the match against the value of the expression. As soon as a matching
pattern is found, the corresponding expression in the match is evaluated and
becomes the value of the case expression. If there is no pattern matching the
expression, then the exception Match is raised.
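Two short sketches of case expressions follow. They are illustrations under assumptions: the functions sign and valueOr are hypothetical, and Int.compare, which returns one of LESS, EQUAL, or GREATER, is used simply to give a three-way match.

```sml
(* Three-way case on the result of comparing n with 0. *)
fun sign (n : int) =
    case Int.compare (n, 0) of
        LESS    => ~1 |
        EQUAL   => 0 |
        GREATER => 1;

(* Case on an option: extract the value if present, else use a default. *)
fun valueOr (opt, default) =
    case opt of
        NONE   => default |
        SOME v => v;
```

For instance, sign(~5) is ~1 and valueOr(SOME 3, 0) is 3, while valueOr(NONE, 0) falls through to the default 0.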
Example 5.3: One use of a case expression is to replace an auxiliary function
that has no use besides supporting some “primary” function. Since the case
expression has all the power of a match, it can easily replace one use of a
function. In Fig. 5.1 we see a rewrite of the pair of mutually recursive functions
startInt and startInt1 from Fig. 4.10. We have used the case statement to
replace startInt1 entirely.
(1)  fun startInt(infile) =
(2)      case input1(infile) of
(3)          NONE => END |
(4)          SOME c =>
(5)              if digit(c) then ord(c) - ord(#"0")
(6)              else startInt(infile);
Figure 5.1: Function startInt using a case statement
In line (2), the expression for the case statement is given; it is input1(infile),
which we should recall gets a character option from the instream. There are
two possible cases. In line (3) we handle the case where NONE was returned by
input1; i.e., the end of file has been reached and there is no character. In this
case, startInt must return the special value END to signal the end of file.
Lines (4) through (6) handle the case where a character was returned. The
pattern SOME c allows us to extract the character. If it is a digit, then the
value of that digit is returned on line (5). If the character is not a digit, then
on line (6) we call startInt recursively to search further on the file for the first
digit. □
5.1.5 If-Then-Else Expressions Revisited
The if-then-else expression, which we introduced in Section 2.1.6, is actually a
shorthand for a case expression. That is,
if E1 then E2 else E3
stands for
case E1 of true => E2 | false => E3
And in turn, a case expression case E of M, where M is a match, is equiva-
lent to the function application (fn M)(E).
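The chain of equivalences can be seen by writing one small test three ways. The sketch below is illustrative only; the function names viaIf, viaCase, and viaFn are hypothetical.

```sml
(* The if-then-else form. *)
fun viaIf (x : int) = if x > 0 then "positive" else "not positive";

(* The case expression it abbreviates. *)
fun viaCase (x : int) =
    case x > 0 of true => "positive" | false => "not positive";

(* The function application that the case expression is equivalent to. *)
fun viaFn (x : int) =
    (fn true => "positive" | false => "not positive") (x > 0);
```

All three functions should return the same string for any integer argument.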
Example 5.4: One place where the underlying meaning of the if-then-else
expression needs to be understood is if we make an error and get a diagnostic
from the ML compiler. Without understanding what is going on, the response
of the compiler can seem rather mysterious. Suppose we confuse characters
with strings in the following expression:
if x<… then #"a" else "b"
The two rules of the corresponding case expression, true => #"a" and
false => "b", do not produce values of the same type. The first rule tells ML
to expect character results. Thus, when the second rule is encountered, the
error message
Error: types of rules don't agree [tycon mismatch]
  earlier rule(s): bool -> char
  this rule: bool -> string
  rule:
    false => "b"
is produced. □
5.1.6 Exercises for Section 5.1
Exercise 5.1.1: Rewrite the functions finishInt and finishInt1 from Fig.
4.10 to use a single function with a case statement.
Exercise 5.1.2: Write the following functions as values, using fn and a match.
* a) Function padd of Fig. 3.27.
b) Function smult of Fig. 3.27.
c) Function pmult of Fig. 3.27.
d) Function sumPairs of Example 3.18.
* e) Function printList of Example 4.3.
f) Function merge of Fig. 3.12.
g) Function comb of Fig. 3.13.
Exercise 5.1.3: A year is a leap year if and only if it is divisible by 4, but
not by 100, unless it is also divisible by 400. Write a case expression that tells
whether year y is a leap year.
Exercise 5.1.4: Recall from Exercise 2.1.4 that expressions using orelse and
andalso can be rewritten as if-then-else expressions. We now find that
if-then-else expressions can be rewritten as case expressions. Write:
* a) E orelse F
b) E andalso F
as case expressions.
5.2 Exceptions
Many functions are partial, meaning that they do not produce a value for some
of the possible arguments of the function's domain type. It is essential that we
be able to catch such errors, but the constructs given so far do not let us do
so. The canonical example of an erroneous argument is division by 0. We have
claimed that in ML an expression like a div b must invariably produce a value
of type integer. But what if b has the value 0? Will ML in fact produce an
integer as a result?
The fact is that, as in other languages, division by 0 produces an error. If we
do nothing to handle the error, it will stop the computation with an “uncaught
exception” message.
Example 5.5: Here are some of the operators we have seen and their response
when given operands for which they have no defined value.
Real Arithmetic in ML
Real arithmetic in ML avoids raising exceptions by using two special con-
stants inf (infinity), and nan (“not a number”). For example, 5.0/0.0
produces the value inf, ~5.0/0.0 produces the value ~inf, and 0.0/0.0
and inf-inf both produce the value nan.
5 div 0;
uncaught exception Div
hd(nil: int list);
uncaught exception Empty
tl(nil: real list);
uncaught exception Empty
chr(500);
uncaught exception Chr
The first example is an integer division by zero, and the system raises the
exception Div. Note that the Div exception will be raised and will halt the
entire program whenever integer division by zero occurs. The division might
be explicit, as in these examples, or it may be a division by some expression
that happens to evaluate to 0. However, as we shall see in Section 5.2.3, it is
possible to turn an exception into an appropriate value; this process is called
“handling” exceptions. Handling the exception Div keeps the computation
going by providing a value for the expression with denominator 0.
The next two examples apply the operators hd and tl to get the head and
tail, respectively, of the empty list. Since the empty list has neither a head nor
a tail, both of these expressions raise the built-in exception Empty. Note that an
expression consisting of a function like hd or tl applied to nil is illegal unless
we pick a type for the result, as we have done here.¹ We shall have more to say
about the rationale for this requirement in Section 5.3.1.
The last example applies the built-in operator chr to an integer that is too
big to represent a character; chr requires an integer in the range 0 to 255. Thus,
the exception Chr is raised. □
¹ Older versions of ML allowed such function applications in situations where it was not
necessary to know the type of the result.
5.2.1 User-Defined Exceptions
We may also define our own exceptions and “raise” them in code we write
when an exceptional condition is discovered. The simplest form of an exception
declaration is
exception Foo;
exception Foo
Foo is thus declared to be the name of an exception. In the definition of a
function f, we can use an expression
raise Foo
to raise exception Foo when we find an erroneous input or other condition that
we associate in our minds with “Foo.”
If function f raises exception Foo during the running of a program, then f
returns no value at all. Rather, ML will halt execution and print the message
uncaught exception Foo
Note that the type of Foo is “exception,” written exn. The range type of
function f is generally not exn, but rather whatever type the function f returns
when no exception is raised.
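A minimal sketch of a declared and raised exception follows. The exception name Negative and the function fact are hypothetical, chosen for illustration; the point is that the function's type mentions only int, even though one branch raises an exception instead of returning.

```sml
exception Negative;

(* Factorial, refusing negative arguments by raising Negative.
   The type is int -> int; the raise expression fits any result type. *)
fun fact(n) =
    if n < 0 then raise Negative
    else if n = 0 then 1
    else n * fact(n-1);

val f5 = fact(5);
```

Here f5 is 120. Evaluating fact(~1) at the top level would instead stop with an uncaught exception Negative message.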
Example 5.6: Reconsider the comb(n,m) function of Fig. 3.6 that computes
$\binom{n}{m}$. This function is
fun comb(n,m) = (* assumes 0 <= m <= n *)
    if m=0 orelse m=n then 1
    else comb(n-1,m) + comb(n-1,m-1);
val comb = fn : int * int -> int
We pointed out in Example 3.10 that this function was not designed to work
correctly in situations where either n was negative or m was outside the range
0-to-n. One approach to the problem is to define some exceptions and rewrite
comb to raise them when the input is improper. Figure 5.2 shows this modifi-
cation.
We begin by defining two exceptions, BadN and BadM. It is possible to define
several exceptions at one time as
exception BadN and BadM;
or more generally, any list of exceptions separated by the keyword and.
These exceptions are used in lines (2) and (3) of the function comb to
check for the erroneous input possibilities. The expressions raise BadN and
raise BadM, when executed, cause the function comb to terminate abnormally,
without returning an integer.
     exception BadN;
     exception BadN
     exception BadM;
     exception BadM
(1)  fun comb(n,m) =
(2)      if n<0 then raise BadN
(3)      else if m<0 orelse m>n then raise BadM
(4)      else if m=0 orelse m=n then 1
(5)      else comb(n-1,m) + comb(n-1,m-1);
     val comb = fn : int * int -> int
     comb(5,2);
     val it = 10 : int
     comb(~1,0);
     uncaught exception BadN
     comb(5,6);
     uncaught exception BadM
Figure 5.2: Using exceptions to catch error conditions in comb
* Note that this situation violates the principle that a function invariably
returns a value of its range type. However, exceptions are the only viola-
tion of this principle in ML.
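The rest of the function, in lines (4) and (5), can assume the inputs satisfy
0 <= m <= n.
An exception may also be declared to carry a value of some type, using a
declaration of the form
exception <identifier> of <type>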
The identifier is an exception constructor, essentially the name of an exception.
When we raise the exception, it takes an argument of this type.
Example 5.7: Let us define an exception constructor Foo that takes an argu-
ment of type string. The declaration
exception Foo of string;
exception Foo of string
When we raise exception Foo, it must take a string as argument. For instance,
if a function contains the phrase
raise Foo("bar")
and this portion of the function is executed, we get from the ML runtime system
the message
uncaught exception Foo
On the other hand, if we do not provide the string argument, just saying:
raise Foo
then we get an error message from the ML compiler when we try to compile
the function containing this phrase. The message
Error: argument of raise is not an exception [tycon mismatch]
  raised: string -> exn
  in expression:
    raise Foo
In its response, ML indicates that it does not even regard Foo by itself as an
exception. In the second line of the response it notes that the type of Foo is
string -> exn, that is, a function from strings to exceptions. □
5.2.3 Handling Exceptions
Raising an uncaught exception always stops computation. We may prefer that
when an exception is raised, there is an attempt to produce an appropriate
value and continue the computation. We can use an expression of the form
<expression> handle <match>
to help in this process. Here, the expression E before the handle keyword is
one in which we fear that one or more exceptions may be raised. The match
takes exceptions as patterns and associates them with expressions of the same
type as E.
If E produces a value v and does not raise an exception, then the match is
not applied to v, and v is the result of the handle expression. However, if E
raises an exception, perhaps with arguments, then the match is applied. The
first pattern that matches the exception causes its associated expression to be
evaluated, and this value becomes the value of the handle expression. If none of
the patterns match, then the exception is uncaught at this point. The exception
may be handled by another, surrounding handle expression, or it may remain
uncaught, percolate up to the top level, and stop the computation.
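The behavior just described can be sketched with two small handlers. Everything named here is hypothetical and chosen for illustration: safeDiv handles the built-in exception Div, while the exception Shortage carries an integer that the handler in tryWithdraw extracts by pattern matching.

```sml
(* Handle the built-in Div exception by supplying the value 0. *)
fun safeDiv (a, b) = a div b handle Div => 0;

(* An exception with an integer parameter. *)
exception Shortage of int;

fun withdraw (balance, amount) =
    if amount > balance then raise Shortage (amount - balance)
    else balance - amount;

(* If withdraw raises Shortage n, report n and leave the balance unchanged.
   Both the handled expression and the rule's expression have type int. *)
fun tryWithdraw (balance, amount) =
    withdraw (balance, amount)
    handle Shortage n =>
        (print ("short by " ^ Int.toString n ^ "\n"); balance);
```

With these definitions, safeDiv(7,0) yields 0 rather than halting, tryWithdraw(100,30) yields 70 without consulting the match, and tryWithdraw(100,130) prints a message and yields 100. An exception that matches no pattern, such as Div raised inside tryWithdraw, would pass through the handler unchanged.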
Example 5.8: Let us reconsider the function comb of Fig. 5.2, where we at-
tempted to compute $\binom{n}{m}$ and catch situations where n < 0, m < 0, or m > n.
In Fig. 5.2, our only response when we found an error in the arguments was to
raise one of two exceptions BadN and BadM and cause computation to halt.
A better approach is to declare an exception OutOfRange, which takes a pair
of integers as parameters. When we raise this exception, we let the function
arguments n and m be the arguments of the exception as well. We can then
handle the error as follows. For n = m = 0, we shall treat the value of $\binom{0}{0}$ as 1.
Otherwise, we print an error message telling the user what values n and m had
when the error occurred. However, we let comb return the value 0 in the hope
that it will be possible for computation to proceed.²
Figure 5.3 shows the program. Line (1) declares the exception constructor
OutOfRange and says that its argument type is int*int, that is, a pair of
integers. Lines (2) through (6) define the function comb1, which is like comb in
Fig. 5.2. However, lines (3) and (4) detect possible errors and raise the exception
OutOfRange(n,m) so the arguments that caused the error will be transmitted
with the exception when it is raised.
In line (7) we define the function comb, which calls comb1 and then handles
the exceptions that are raised by comb1. Line (8) shows the first rule of the
match, where both arguments are 0, and the result is 1. Lines (9) through (15)
handle all other cases of the exception. We print the message “out of range”
and the values of n and m as a side-effect.
The last expression, on line (15), is 0. Recall it is the value of the final
expression in a list of expressions that is returned. Thus the value returned by
comb is 0 in all exceptional cases besides m =n = 0.
Line (16) shows a correct use of comb. Here, comb1 returns the value 6 and
raises no exception. There is no exception to handle, so 6 is produced by comb,
and ML tells us that 6 is the value of it.
² We should be very sure that no unexpected errors will be introduced by the chosen value
0, or hard-to-diagnose bugs may result.

(1)  exception OutOfRange of int*int;
     exception OutOfRange of int * int

(2)  fun comb1(n,m) =
(3)      if n <= 0 then raise OutOfRange(n,m)
(4)      else if m < 0 orelse m > n then
             raise OutOfRange(n,m)
(5)      else if m=0 orelse m=n then 1
(6)      else comb1(n-1,m) + comb1(n-1,m-1);
     val comb1 = fn : int * int -> int

(7)  fun comb(n,m) = comb1(n,m) handle
(8)      OutOfRange(0,0) => 1 |
(9)      OutOfRange(n,m) => (
(10)         print("out of range: n=");
(11)         print(Int.toString(n));
(12)         print(" m=");
(13)         print(Int.toString(m));
(14)         print("\n");
(15)         0
         );
     val comb = fn : int * int -> int

(16) comb(4,2);
     val it = 6 : int

(17) comb(3,4);
     out of range: n=3 m=4
     val it = 0 : int

(18) comb(0,0);
     val it = 1 : int

Figure 5.3: Combinatorial function handled by an exception

Line (17) shows an erroneous use of comb, where comb1 raises the exception
OutOfRange(3,4) at line (4). This exception fails to match the pattern on
line (8), but matches the pattern of line (9), which gives n the value 3 and m
the value 4. We see two lines of response. The first is the side-effect resulting
from the sequence of print-expressions in lines (9) through (14). The second is
the value of it, which is 0. This integer is the value returned by comb because
0 is the last expression on line (15).
Finally, line (18) shows an erroneous situation where the pattern of line (8)
is matched. The exception OutOfRange(0,0) is raised on line (3). It matches
the pattern on line (8), and the value 1 is produced. Thus, 1 becomes the value
of it in the ML response. There is no side-effect as there was when the pattern
of line (9) was the correct match.
5.2.4 Exceptions as Elements of an Environment
Let us trace the effect on the environment of the sequence of declarations in
Fig. 5.2 where we declared BadN and BadM to be exceptions. These effects
are shown in Fig. 5.4. Each of the two exception declarations adds to the
environment. We show identifiers BadN and BadM bound to unidentified values.
That is, the associated values are internal symbols that are never seen by the
user; only identifiers declared to be exceptions are printed when an exception
is raised.
m     2                     Added in response to
n     5                     call to comb(5,2)

comb  definition of comb    Added in response to
                            definition of comb

BadM  -                     Added in response to
                            exception BadM

BadN  -                     Added in response to
                            exception BadN

      Prior environment
Figure 5.4: Additions to environments for exceptions
When we define the function comb, its binding is a further addition to the
environment. Since the definitions of the two exceptions sit below it, comb
has access to these exceptions for its own code, as suggested by the arrows
in Fig. 5.4. We then show the further additions that occur when the call to
comb(5,2) is made. It adds boxes for its parameters as usual; these boxes will
disappear when the call returns.
5.2.5 Local Exceptions
It is not necessary to declare exceptions outside a function. We can declare
them inside the function, using a let expression, so we know they will make
sense any time the function is used, regardless of whether or not the exceptions
are declared outside the function.
There is, however, a problem with locally defined exceptions. If we try to
handle them, the function that does the handling will not be in the scope in
which the exceptions were defined, and thus the match used by the handler will
not recognize the local exceptions, even if it uses the same identifier in one of
its patterns. The following example illustrates the problem.
fun comb2(n,m) =
    let
        exception OutOfRange of int*int
    in
        if n <= 0 then raise OutOfRange(n,m)
        else if m<0 orelse m>n then
            raise OutOfRange(n,m)
        else if m=0 orelse m=n then 1
        else comb2(n-1,m) + comb2(n-1,m-1)
    end;

fun comb(n,m) = comb2(n,m) handle
    OutOfRange(0,0) => 1 |
    OutOfRange(n,m) => (
        print("out of range: n=");
        print(Int.toString(n));
        print(" m=");
        print(Int.toString(m));
        print("\n");
        0
    );
Error: non-constructor applied to argument in pattern: OutOfRange
Figure 5.5: Function comb2 raises an exception that cannot be handled by comb
Example 5.9: Suppose we try to write comb1 of Fig. 5.3, but with exception
OutOfRange local to comb1. We would find that we cannot handle this exception
in function comb when it is raised. Figure 5.5 illustrates this erroneous way to
use exceptions. We have omitted the global definition of OutOfRange that
appeared in line (1) of Fig. 5.3. In its place, function comb2, which replaces
comb1 of Fig. 5.3, declares a local exception with name OutOfRange.
OutOfRange  -                      Added in
m           value of m             response to
n           value of n             call to comb2

comb        definition of comb     Available before and
comb2       definition of comb2    after call to comb2
Figure 5.6: Local exceptions cannot be used after their function returns
Figure 5.6 suggests what would be the situation before, during, and after
the call of comb2 by comb. Before the call, the definitions of comb and comb2 are
available. When comb2 is called, it creates new bindings for its parameters n
and m and its local exception OutOfRange. These bindings are available during
the execution of comb2, but when comb2 finishes, then the bindings for n, m, and
OutOfRange go away and are not available to comb. In particular, when comb
of Fig. 5.3 tries to handle exception OutOfRange, the identifier OutOfRange is
not defined.
In fact, the code of Fig. 5.5 is illegal. In response to the definition of comb
we get from the ML compiler the error message:
Error: non-constructor applied to argument in pattern: OutOfRange
as seen in Fig. 5.5. □
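By contrast, a local exception can be handled as long as the handle expression lies inside the scope of the declaration. The following is a minimal sketch, not taken from the figures above: the function name comb3 and the local helper c are hypothetical, and the handler returns 0 silently instead of printing a message.

```sml
fun comb3(n,m) =
    let
        (* The exception is local to comb3. *)
        exception OutOfRange of int*int;

        (* c does the real work and may raise the local exception. *)
        fun c(n,m) =
            if n <= 0 then raise OutOfRange(n,m)
            else if m < 0 orelse m > n then raise OutOfRange(n,m)
            else if m=0 orelse m=n then 1
            else c(n-1,m) + c(n-1,m-1)
    in
        (* This handler sits inside the let, so the identifier
           OutOfRange is still a known exception constructor here. *)
        c(n,m) handle
            OutOfRange(0,0) => 1 |
            OutOfRange(_,_) => 0
    end;

val a = comb3(4,2);
val b = comb3(3,4);
val c = comb3(0,0);
```

The price of this arrangement is that no function outside comb3 can name the exception, which is exactly the situation that made Fig. 5.5 illegal.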
5.2.6 Exercises for Section 5.2
Exercise 5.2.1: Write a function to return the third element of a list. Define
suitable exceptions to tell what is wrong in the cases that the response of the
function is not defined. Raise the appropriate exception in response to erroneous
inputs.
Exercise 5.2.2: Write a factorial function that produces 1 when its argument
is 0, produces 0 for a negative argument while printing an error message, and
produces n! for a positive argument n. Organize your code so a function fact
does the work of computing n! and raises an exception Negative(n) if n is a
negative integer.
*! Exercise 5.2.3: In Fig. 4.10 we wrote a program to read nonnegative integers
and signal the end of file with the integer -1. It is better to handle this
signal with an exception. Modify Fig. 4.10 by declaring an exception Eof and
raising it in function startInt when the end of file is found. Modify function
sumInts1 to handle Eof. Other functions can ignore Eof, passing the problem
to sumInts1.
* Exercise 5.2.4: We can represent a matrix of reals by a list of lists. Each list
on the “main” list represents one row of the matrix. It is possible to compute
the determinant of a matrix by pivotal condensation, a technique where we
recursively eliminate the first row and the first column.³ The method can be
described as follows.
BASIS: If there is one row and column, then return the one element.
INDUCTION: If there are more than one row and column,
i. Normalize the first row by dividing each element by the first element, say
a, in the row.
ii. For each element $M_{ij}$ not in the first row or column, subtract from $M_{ij}$
the product of the first element in row i and the jth element in row 1.
These are the elements furthest above, and furthest to the left of $M_{ij}$.
iii. Recursively compute the determinant of the matrix formed by eliminating
the first row and first column. The result is a times this determinant.
Recall a is the constant from step (i) that was originally in the upper left
corner of the matrix.
Write a collection of functions that implement the pivotal condensation algo-
rithm. Define suitable exceptions to catch errors, including
1. The case where a = 0 in step (i) and division by a would therefore yield
infinity in step (ii), and
2. Cases where the matrix is not originally square. That is, there are not as
many rows as columns, or there are unequal-length rows.
Hint: It helps to take this one in easy stages. Start with a function that
normalizes a row (list) by dividing each element by a given constant. Also,
write a function to subtract a multiple of one row from another. Then, write
a function that takes a list of rows and subtracts from the tail of each row the
product of the head of the row and a given list. The latter is the heart of the
pivotal condensation process. The given row is the normalized tail of the first
row. When we multiply it (as a vector) by the head of a row and then subtract
the result from the tail of the same row (again, thinking of lists as vectors),
we are performing the basic operation required by the pivotal condensation
algorithm.

³ This method is not often preferred for computing determinants, since when followed
blindly it can result in failure even in cases where the determinant is not zero (i.e., where the
matrix is nonsingular). It can be improved by permuting the rows at each recursive step so
the pivot (element in the upper left corner) has as large a magnitude as possible.
Exercise 5.2.5: Consider the following function:
fun myFavoriteException("sally") = Div
|   myFavoriteException("joe") = Match;
* a) What is the type of this function?
! b) Both myFavoriteException("joe") and myFavoriteException("zzz")
seem to produce Match as the answer. What is the difference between the
results of these two calls?
5.3 Polymorphic Functions
We saw in previous chapters that sometimes a function requires arguments of
a particular type. Other times, arguments are not restricted to a type, or they
are partially restricted. For example, an argument might have type ’a list,
meaning a list of any one type of element is required. The ability of a function
to allow arguments of different types is called polymorphism (“poly” = “many”;
“morph” = “form”), and such a function is called polymorphic.
In this section, we study what makes a function polymorphic, or conversely,
what forces an argument to be restricted to a single type. Before proceeding,
it is useful to remember some points about ML types.
• ML is strongly typed, meaning it is possible to determine the type of any
  variable or the value returned by any function by examining the program,
  but without running the program. Put another way, an ML program for
  which it is not possible to determine the types of variables and function
  return-values is an incorrect program.
• The algorithm whereby ML deduces the types of variables is complex and
  beyond the scope of this book. In practice, it is usually easy to see what
  ML is doing to discover types, as we discussed informally in Section 3.2.4.
• Although we must be able to tell the types of all variables in a complete
  program, we can define functions whose types are partially or completely
  flexible; these are the polymorphic functions.
Example 5.10: The extreme example of a polymorphic function is the identity
function, which we can define by
fun identity(x) = x;
val identity = fn : 'a -> 'a
This function simply produces its argument as its own result, and the argument
can be of any type whatsoever. ML observes that the type of the argument and
the result are the same, and so designates the type of the identity function as
'a -> 'a.
We can use the identity function with anything as an argument. For in-
stance:
identity(2);
val it = 2 : int
Here the type of the result is found to be an integer.
We can even give the identity function a function as an argument; it will
produce that function as result.
identity (ord);
val it = fn : char -> int
Although ML does not tell us specifically what function is returned, the fact
that the result is ord is suggested by the description of the type of the result,
a function from characters to integers.
We can even apply the function identity twice in the same expression, using
it on values of different types, as long as no type error is thereby introduced.
identity(2) + floor(identity(3.5));
val it = 5 : int
Here, identity has been applied to a value of type int and another value of
type real in one expression.
Example 5.11: Suppose we have defined the function identity, as above.
Here is a new function f that is polymorphic to an extent:
fun f(x) =
if x<10 then identity
else rev;
val f = fn : int -> 'a list -> 'a list
Note that rev is the built-in list-reversal function provided by ML. It is essen-
tially the function reverse from Fig. 3.26. Because rev requires an argument
that is a list of some type, its type is ’a list -> ’a list. Since f, like all
ML functions, must have a unique range type, the function identity within f
must also be applied only to types of the form ’a list, even though identity
could otherwise be applied to a more general type: 'a. □
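To see how f behaves in use, we can apply its result directly to a list. This is a small sketch that repeats the definitions of identity and f so it stands alone; the variable names same, reversed, and strs are ours.

```sml
fun identity(x) = x;

fun f(x) =
    if x<10 then identity
    else rev;

(* f(3) selects identity, which here is used at type
   int list -> int list. *)
val same = f(3) [1,2,3];

(* f(20) selects rev. *)
val reversed = f(20) [1,2,3];

(* The element type is still free: the same f works on string lists. *)
val strs = f(20) ["a","b"];
```

In each case the result of f is applied to a list at once, so the complete top-level expression has a type with no type variables.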
5.3.1 A Limitation on the Use of Polymorphic Functions
A type variable, such as 'a, actually has two meanings that differ subtly.
1. A type variable 'a can say “for every type T, there is an instance of
this object with type T in place of 'a.” Such a type variable is called
generalizable. The primary example of such a use is in descriptions of
the type of polymorphic functions. For instance, the type ’a->’a used
to describe the type of the function identity in Example 5.10 represents
such a type schema, where the function identity can be used with any
type, even in the same expression, as we saw in that example.
2. A type variable ’a can represent any one type that we choose. However,
once that type is selected, the type cannot change, even if we reuse the
object whose type was described using the type variable ’a. A type
variable of this kind is nongeneralizable. We shall defer an example of a
nongeneralizable type variable to Example 5.16.
Versions of ML prior to ML97 did not always distinguish between the two
meanings for type variables, and often there is little harm in blending the two.
However, because of certain technical problems that prevent compile-time de-
termination of types, which we recall is an essential feature of ML, the ML97
specification requires that expressions at the top level (i.e., expressions that
are not subexpressions of another expression) be such that the generalizable
interpretation is appropriate. Moreover, ML97 is conservative about allowing
the generalizable interpretation for type variables. As a result, only certain
kinds of expressions at the top level can have types involving type variables.
These expressions, called nonexpansive expressions, include function definitions
as a common case. In general, we can build nonexpansive expressions by the
following rules:
1. A constant or a variable is nonexpansive.
2. A function definition is nonexpansive.
3. A tuple (or more generally a record structure as described in Section 7.1)
of nonexpansive expressions is nonexpansive.
4. A nonexpansive expression may be preceded by a “constructor” that
is either an exception constructor or a data constructor belonging to a
datatype. The latter constructors are covered in Section 6.2, but we give a
simple example of this form of nonexpansive expression in Example 5.14.
In addition, we can attach types to nonexpansive expressions with a colon
and a type expression, and we can use the keyword op where appropriate in
nonexpansive expressions.
Expressions that are not of these forms are expansive and not allowed to
have type variables. The error message that we get when, at the top level, we
write an expansive expression with a type variable, accuses that type variable
of being “nongeneralizable,” i.e., of not having the first interpretation given at
the beginning of this section.
The matter of which expressions are permitted to have types with type
variables and which are not is complex. However, a few examples should suffice
to cover the cases that are likely to surface in practice.
Example 5.12: If we apply a function to an argument, and the type of the
result has type variables, then these type variables are nongeneralizable and
the expression is illegal. A simple example is:
identity(identity);
Error: nongeneralizable type variable
it : 'Z -> 'Z
Here, ML has recognized that the result is a function whose domain and range
types are arbitrary but the same (represented by 'Z here). However, it does
not accept the expression, because it is expansive and has a type variable.
We can apply the identity to itself only if we provide a concrete type for the
domain and range. For instance:
identity(identity: int -> int);
val it = fn: int -> int
Now the resulting expression, which is the identity function on integers only,
has a type with no type variables, so the fact that the expression is expansive
becomes irrelevant. □
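Another way to keep the polymorphism, sketched here rather than taken from the examples above, relies on rule (2): a function definition is nonexpansive. If we wrap the application inside a function definition, the top-level declaration is a function definition rather than a function application. The name id2 is hypothetical.

```sml
fun identity(x) = x;

(* The body applies identity to itself and then to x, but the
   declaration as a whole is a function definition, so id2 receives
   the generalizable type 'a -> 'a. *)
fun id2(x) = identity(identity)(x);

(* id2 can therefore be used at two different types. *)
val n = id2(1);
val s = id2("a");
```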
Example 5.13: We can safely build tuples of nonexpansive expressions. For
example:
(identity, identity);
val it = (fn,fn) : ('a -> 'a) * ('b -> 'b)
Here, we have constructed an expression consisting of a pair of identity
functions. Notice that the type of the pair involves two distinct type variables 'a
and 'b, because each of the identity functions could apply to a different type.
In contrast to tuple-formation, list formation is considered to build an
expansive expression. Thus, an expression like [identity, identity] is illegal
in ML. □
Example 5.14: Another way to build nonexpansive expressions is with the
data constructors that are associated with datatypes. We have not yet covered
the subject of datatypes (see Section 6.2), but we have met one example of
a datatype that ML provides for us in the top-level environment: the option.
That is, the names SOME and NONE are actually treated by the ML system as
data constructors of the datatype option. Thus, for example, the following
expression is legal.

SOME identity;
val it = SOME fn : ('a -> 'a) option

Here, we have applied the data constructor SOME to a nonexpansive expression,
namely the variable identity that represents the identity function. The
response tells us that the result is an option of a function of the type of the
identity function, 'a->'a. □

When Does a Type Problem Arise?

Remember that the problem leading to the “nongeneralizable type variable”
error that we have been discussing in this section can arise only when
all three of the following conditions are met by an expression:
1. The expression is at the top level; that is, the expression is not a
subexpression of some larger expression.
2. The type of the expression involves at least one type variable.
3. The form of the expression does not meet the conditions for it to be
nonexpansive.
If even one of these conditions is not met, we need not worry about the
type of the expression.
Example 5.15: Next, let us observe that we can use expressions that would
be illegal at the top level inside another expression, as long as the resulting
expression is legal. Consider the expression of Fig. 5.7.
let
val x = identity(identity)
in
x(1)
end;
val it = 1 : int
Figure 5.7: A nongeneralizable type variable for a subexpression
Here, we have defined x to be the identity applied to itself. We saw in
Example 5.12 that such an expression is illegal at the top level. However, here
identity (identity) is a subexpression, so we do not yet trigger the objection
that there is a nongeneralizable type variable, even though the type variable
'a in the type 'a->'a for x is indeed nongeneralizable. Since the one type to
which x applies is found in the expression x(1) to be int, ML finds no problem
with the expression as a whole. In fact, the complete expression of Fig. 5.7
has no type variables in its type, so the issue of nongeneralizable type variables
does not come up. □
let
    val x = identity(identity)
in
    (x(1), x("a"))
end;
Error: operator and operand don't agree [literal]
  operator domain: int
  operand:         string
  in expression:
    x ("a")
Figure 5.8: Trying to reuse a nongeneralizable type variable
Example 5.16: The variable x in Fig. 5.7 is nongeneralizable. That is, the
interpretation of its type variable is that one and only one type may ever be
substituted for that variable. We can see in Fig. 5.8 the effect of the nongener-
alizability of the type of x. There, we try to use x twice to stand for the identity
applied to different types. That is, the variable 'a in the type 'a->'a of x is
bound to int when ML encounters the expression x(1). When x is next applied
to the argument "a" it is too late to change the value of ’a, so we are trying
to apply the identity function on integers to a string and get the error message
shown in Fig. 5.8.
To further emphasize the difference between generalizable and nongeneral-
izable type variables, consider that the following expression:
(identity(1), identity("a"));
val it = (1,"a") : int * string
is legal. The difference between this expression and the almost-identical Fig. 5.8
is that when we defined x in Fig. 5.8, even though the value of x is the identity
function, ML converted the interpretation of the type variable in x’s type from
generalizable to nongeneralizable. True, the system might have realized that x
was just the identity function, and allowed it to retain the generalizable inter-
pretation for its type variable. However, as we mentioned at the beginning of
this section, ML must be conservative about using the generalizable interpreta-
tion, or it will be impossible for the system to guarantee correct compile-time
type checking. □
5.3.2 Operators that Restrict Polymorphism
Most of the operators that we have met prevent polymorphism in functions
where they are used. These “polymorphism-destroying” operators include:
1. Arithmetic operators: +, -, *, and ~.
2. Division-related operators such as /, div, and mod.
3. The inequality comparison operators: <, <=, >=, and >. Note we exclude =
(equal-to) and <> (not-equal-to) from this group. They behave differently
from the inequality comparisons as far as polymorphism is concerned. We
shall discuss this matter in Section 5.3.4.
4. The boolean connectives: andalso, orelse, and not.
5. The string concatenation operator: ^
6. Type conversion operators such as ord, chr, real, str, floor, ceil,
round, and trunc.
All but groups (1) and (3) force their argument(s) and result to be of one
specific type. Groups (1) and (3) include operators that apply to several dif-
ferent types, but ML requires that the type be known from inspection of the
program. Thus, operators in groups (1) and (3) not only restrict their argu-
ments and results to one type, they frequently require us to indicate with a
colon what that type is.
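The effect of groups (1) and (3) can be sketched with three small functions. The names addInt, addReal, and beforeM are hypothetical. We assume the ML97 convention that an overloaded arithmetic operator with no other clue about its type is taken to be the integer version.

```sml
(* Nothing here says which + is meant, so the integer version
   is chosen: int * int -> int. *)
fun addInt(x,y) = x + y;

(* One annotation after a colon selects real addition; the type of y
   and of the result follow: real * real -> real. *)
fun addReal(x:real, y) = x + y;

(* < is overloaded as well; the string constant fixes its type,
   giving string -> bool. *)
fun beforeM(s) = s < "m";

val i = addInt(2,3);
val r = addReal(1.5,2.0);
val b = beforeM("joe");
```

In none of the three cases does a type variable appear in the type of the function; the operator has removed all freedom in the argument types.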
5.3.3 Operators that Allow Polymorphism
We have seen several operators that allow polymorphism, although they
somewhat restrict the types of their results and/or arguments. Three classes of
operators in this category are:
1. Tuple operators, such as the tuple-forming operator, consisting of paren-
theses and commas, as (,,...,). Also in this group are the component-
reading operators, #1, #2, and so on.
2. The list operators ::, @, hd, and tl, the list constant nil, and brackets
used as the list-former [...].
3. The equality operators = and <>.
When we apply a tuple constructor, we get a tuple type of some sort. When
we apply a list-building operator, we are restricted to create a list type of some
sort. When we apply an equality operator, we restrict the arguments to be of
the same “equality type,” a concept we shall discuss shortly, in Section 5.3.4.
However, there are no other constraints forced on the types of operands or
results.
Let us consider why list operators do not prohibit polymorphism. A similar
explanation applies to the tuple-forming operators. First, consider an
expression like x + y. ML implements the addition operator by computing a new
value, the sum of x and y. To compute the sum, ML needs to know whether to
add integers or reals.
In contrast consider the cons operator :: . ML represents lists internally
in the conventional, linked-list fashion that we discussed in Section 3.5.2. Cells
consisting of a pair of pointers, the first to an element and the second to the
next cell, represent the list.
To apply the cons operator, the ML runtime system creates a new cell, puts
a pointer to the head in the first field of the cell, and puts a pointer to the tail
in the second field of the cell. Notice that with this scheme, the operation is
performed in exactly the same way regardless of the types of the head and tail.
Of course, ML requires that it be able to deduce the types of head and tail
before running the program and requires that they be compatible types (i.e., if
the head is of type T, then the tail is of type T list).
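A few small functions, with hypothetical names swap, double, and second, illustrate how the tuple and list operators leave type variables in place. The types in the comments are those we expect ML to infer.

```sml
(* Tuple formation only: 'a * 'b -> 'b * 'a *)
fun swap(x,y) = (y,x);

(* Cons forces the tail to be a list of the head's type:
   'a * 'a list -> 'a list *)
fun double(x,L) = x :: x :: L;

(* hd and tl constrain the argument to be a list, nothing more:
   'a list -> 'a *)
fun second(L) = hd(tl(L));

val p = swap(1, "one");
val q = double(2.5, nil);
val r = second(["a","b","c"]);
```

Each function is used here at a different element type, yet no annotation was needed in any of the definitions.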
5.3.4 The Equality Operators
Now let us look at the equality operators = and <>. ML defines a class of
types called equality types, which are those that allow equality to be tested
among values of that type. Most basic types (integer, boolean, character,
and string) are equality types.⁴ Two ways to form more equality types are:
1. Forming products of equality types (for tuples).
2. Forming a list whose elements are of an equality type.
Note that rules (1) and (2) can be applied recursively. So, for example,
int * int is an equality type, (int * int) list is an equality type,
int list * string
is an equality type, and so on. We shall also see user-defined datatypes in
Section 6.2, and some of these new types will be equality types as well.
⁴ Remember, however, that the reals are not an equality type, for reasons we discussed in
Section 2.1.4.

Example 5.17: Let us define two variables to be pairs of integers.
val x = (1,2);
val x = (1,2) : int * int
val y = (2,3);
val y = (2,3) : int * int
Then we can compare these values, for instance:
x=y;
val it = false : bool
x = (1,2);
val it = true : bool
Similarly, we could define and compare lists, as:
val L = [1,2,3];
val L = [1,2,3] : int list
val M = [2,3];
val M = [2,3] : int list
Then we can compare as follows:
L<>M;
val it = true : bool
L = 1::M;
val it = true : bool
Notice in the last example that ML evaluates expressions before testing for
equality, so it discovers that the expression 1::M denotes the same list as is
denoted by the variable L. □
On the other hand, functions cannot be compared for equality even though
we might think that two functions should be equal if they do exactly the same
thing on all inputs. Any type involving a function is not an equality type.
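The following sketch shows a function whose type is restricted to equality types only because it uses the operator =. The function member is a hypothetical illustration, not one of the figures of this chapter; we expect ML to report its type as ''a * ''a list -> bool.

```sml
(* member tests whether x appears on list L.  The comparison x = y
   forces the element type to be an equality type. *)
fun member(x, nil) = false
|   member(x, y::ys) = x = y orelse member(x, ys);

(* Pairs of integers form an equality type. *)
val a = member((1,2), [(0,1),(1,2)]);

(* So do lists of integers. *)
val b = member([1], [[2],[1,2]]);

(* A call such as member(floor, [floor, ceil]) would be rejected,
   since real -> int is a function type and so not an equality type. *)
```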
Example 5.18: Suppose we write an expression such as
identity = identity;
where identity is the function defined in Example 5.10. Then we get the error
message shown in Fig. 5.9.
Line (1) of Fig. 5.9 says that the type of the operator (the = sign) does
not agree with the type of its operand (the pair of identity functions). The
problem is that equality or inequality can only be tested among pairs of the
same equality type, and no function type is an equality type. Thus, even
though the two uses of identity as a function name obviously denote the same
function, the comparison is not legal in ML.
(1) Error: operator and operand don't agree [equality type required]
(2)   operator domain: ''Z * ''Z
(3)   operand:         ('Y -> 'Y) * ('X -> 'X)
(4)   in expression:
(5)     = (identity,identity)

Figure 5.9: Functions cannot be compared for equality

Line (2) further explains that the operator = requires arguments of the same
equality type, here denoted by ''Z.

• Remember that type variables whose values are restricted to be an equality
  type are distinguished by having names that begin with two quote
  marks rather than only one.
Line (3) points out that the actual pair of arguments given is two polymorphic
functions. One function is from some type ’X to the same type, and the second
is from some type ’Y to that same type.
• Note that there is no reason to believe that 'X and 'Y are the same type.
  As with polymorphic functions in general, two uses of the identity function
  need not apply to the same type.
Finally, lines (4) and (5) indicate that the error was in the expression
identity = identity
However, it gives the operator and operands in prefix form, where the operator
is applied in the same way a function is applied to its argument. □
Example 5.19: To explore further the effect of an = or <> comparison on
the set of permissible types, let us reconsider the two versions of the function
reverse that we developed in Examples 3.8 and 3.15. These are repeated in
Fig. 5.10(a) and (b), as functions rev1 and rev2, respectively. However, here
we have used the correct type variable name ''a (with two quotes) that ML
uses to describe the type of the function in Fig. 5.10(a). This type name tells
us that any type can be used, provided it is an equality type.
In Fig. 5.10(b), we see the ML response telling us that the function rev2 can
take an argument of any type whatsoever, regardless of whether it is an equality
type. If we give each of these programs a list whose elements are from one
equality type, both functions produce the same answer. The difference shows
up, however, if we apply each function to a list whose elements are chosen from
one non-equality type. For instance, consider a function call of the form
reverse([floor, trunc, ceil]);
where reverse can be either rev1 or rev2. Each of the elements on the list is
a function from reals to integers that we discussed in Section 2.2.2. Thus, the
type of elements on the list is

fn : real -> int

(1) fun rev1(L) =
(2)     if L = nil then nil
(3)     else rev1(tl(L)) @ [hd(L)];
    val rev1 = fn : ''a list -> ''a list

    (a) Reversal using an equality comparison

(4) fun rev2(nil) = nil
(5) |   rev2(x::xs) = rev2(xs) @ [x];
    val rev2 = fn : 'a list -> 'a list

    (b) Reversal without using an equality comparison

Figure 5.10: Two functions for reversing a list
If we use rev2, the function of Fig. 5.10(b), then the list will be reversed
normally, which yields the list [ceil, trunc, floor]. The ML response is:
rev2([floor, trunc, ceil]);
val it = [fn,fn,fn] : (real -> int) list
That is, ML tells us that the result is a list of three functions, each from reals
to integers.
(1) Error: operator and operand don't agree [equality type required]
(2)   operator domain: ''Z list
(3)   operand:         (real -> int) list
(4)   in expression:
(5)     rev1 floor :: trunc :: ceil :: nil
Figure 5.11: Error response when list reversal requires an equality type
However, if we use rev1, the function of Fig. 5.10(a), then we get the error
message in Fig. 5.11. This message is similar to that in Fig. 5.9. Line (2)
refers to the operator rev1, which takes as an argument a ''Z list, that is,
a list whose elements are from any one equality type. Line (3) says that the
argument actually found, which is [floor, trunc, ceil], is a list of functions
from reals to integers. This type, being a function type, is not an equality type
and therefore is not suitable as the type ''Z. Lines (4) and (5) indicate the
offending expression. Note that lists are represented by :: and nil rather than
by square brackets.
We may well wonder why the function rev1 of Fig. 5.10(a) requires an
equality type. The reason is found in line (2), where the comparison L=nil
occurs. If list L is to be tested for equality to something, then surely L must be
of an equality type, which means its elements must be chosen from an equality
type.
In contrast, line (4) in Fig. 5.10(b) makes essentially the same test by matching
L to the pattern nil. Recalling the discussion in Section 3.3.5 about how
ML matches patterns, we see that here we are not testing for equality of L to
nil. Rather, we are matching the expression tree for the value that L cur-
rently has, to the one-node tree for the expression nil. ML can match trees
without testing for equality of anything except constants of the basic types and
identifiers.
You may think that there is something wrong with this analysis and observe
that in line (2) of Fig. 5.10(a) we don’t really need to test equality of elements
to compare a list L with nil. That is quite true, although if we had replaced
the test of line (2) by a test for equality to a list other than nil, for instance
L=[1,2], then L would surely have to be of an equality type. The designers
of ML have chosen to infer that an equality type is needed by the presence of
an operator = or <>, and they have chosen not to consider equality to nil as a
special case. You may regard that choice as either “a bug or a feature” of ML,
as you wish.
ML has a built-in function null that tests whether a list is empty without
requiring that list to be of an equality type. We could write line (2) of
Fig. 5.10(a) as
(2) if null(L) then nil
and then the function rev1 of Fig. 5.10(a) would not require an equality
type. Its type would be 'a list -> 'a list, just like Fig. 5.10(b).
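As a sketch of that change, here is the whole function with null in place of the equality test. The name revNull is an assumption chosen here to keep it distinct from rev1; the body is otherwise that of Fig. 5.10(a).

```sml
(* Hypothetical variant of rev1: the test L = nil is replaced by null(L),
   so no equality type is demanded of the list elements. *)
fun revNull(L) =
    if null(L) then nil
    else revNull(tl(L)) @ [hd(L)];

(* Works on an equality type ... *)
val ints = revNull([1,2,3]);

(* ... and also on a list of functions, which rev1 would reject. *)
val funs = revNull([floor, trunc, ceil]);
```

ML should report the type of revNull with the single-quote variable 'a rather than ''a, which is why the second call is accepted.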
5.3.5 Exercises for Section 5.3
Exercise 5.3.1: Let rev1 be the function of Fig. 5.10(a) and rev2 the function
of Fig. 5.10(b). What is the result of the following calls?
* a) rev1([(rev1: int list -> int list), rev1])
b) rev2([(rev1: int list -> int list), rev1])
c) rev1([rev1,rev1])
* d) rev2([rev2,rev2])
e) rev1([chr,chr])
* f) rev2([chr,chr])
g) rev1([chr,ord])
h) rev2([chr,ord])
Exercise 5.3.2: We can restrict polymorphic types (type expressions with
variables) by: (i) equating type variables, (ii) replacing a type variable by a
constant type, or (iii) replacing a type variable by a nonconstant expression.
Give an example of each kind of restriction for the following type expressions.
a) 'a * 'b * int
b) ('a list) * ('b list)
! Exercise 5.3.3: Suppose f(x,y,z) is a function. Give an example of a definition
of f that would cause the argument of f to have each of the following
types.
* a) 'a * 'b * ('a -> 'b)
b) 'a * 'a * int
* c) 'a list * 'b * 'a
d) ('a * 'b) * 'a list * 'b list
Exercise 5.3.4: Tell whether or not each of the following types is an equality
type.
*a) int * string list
b) (int -> char) * string
*c) int -> string -> unit
*d) real * (string * string) list
Exercise 5.3.5: Let L have the value [(1,2), (3,4)], let M have the value
(1,2), and let N have the value (3,4). Which of the following equality tests
have the value true?
* a) L = M::[N]
b) M:: ... = L@[N]
* c) [(1,2)]@[N] = L@nil
d) N::L = (3,4)::nil
fun f(nil) = nil
  | f([x]) = [x]
  | f(x::y::zs) = [x,y];
fun g(x,y) = (f(x), f(y));
fun h(x,y) =
    let val v = f(nil) in (x::v, y::v) end;
Figure 5.12: Functions for Exercise 5.3.7
* Exercise 5.3.6: If the ML runtime system applies the cons operator without
looking at the elements of the list, how can it be sure the types of the head and
tail are compatible?
! Exercise 5.3.7: In Fig. 5.12 are three functions, f, g, and h. Function f
takes any list and returns the list with the third and subsequent elements, if
they exist, deleted. Function g applies f to a pair of arguments and returns
the pair of results. Function h computes a local value v by applying f to nil
(which returns an empty list) and then conses the two arguments of h to v. For
each of the expressions below, indicate whether it is legal, and if not, what is
the error?
* a) g([1,2,3], ["a"]).
* b) g([1,2,3], nil).
c) g([f,g], [1]).
d) g([1], [1.0]).
* e) h(1, 2).
f) h(1, "a").
* g) h(nil, nil).
h) h([1], nil).
5.4 Higher-Order Functions
A typical function has parameters that represent “data.” That is, the param-
eters are of some basic type like real, or they are lists or tuples of basic types,
lists or tuples of those, and so on. However, it is also possible for parameters
or results of functions to have function types. In Section 5.3, we met some
functions that can take arguments of other types, including function types.
Examples are the identity function, which can take an argument of any type, or
rev2 of Fig. 5.10(b), which is able to reverse a list of functions.
Functions that take functions as arguments and/or produce functions as
values are called higher-order functions. ML makes it easy to define higher-
order functions. In contrast, the mechanisms in conventional languages for
defining and using higher-order functions tend to be cumbersome, and there
may be some limitations on the power of these mechanisms. For example, it
may not be possible to define a function like identity that works on values of
any type whatsoever.
Example 5.20: Let us consider a higher-order function that is often used as
an example for conventional programming languages: numerical integration by
the trapezoidal rule. The idea is to compute the (approximate) integral of some
function f(x) between limits a and b, that is, $\int_a^b f(x)\,dx$, by dividing the
line from a to b into n equal parts for some n. We then approximate the integral
as the sum of the areas of the n trapezoids that are suggested by Fig. 5.13 for
the case n = 3.
Figure 5.13: Integration by the trapezoidal rule
In more detail, let $\delta = (b-a)/n$. Then the ith trapezoid has width $\delta$ and
runs from $a + (i-1)\delta$ to $a + i\delta$. The area of the ith trapezoid is $\delta$ times the
average of the two vertical sides, that is:
$$\delta\,\bigl(f(a + (i-1)\delta) + f(a + i\delta)\bigr)/2$$
Figure 5.14 shows the function trap(a,b,n,F) that takes two real numbers,
the limits a and b, an integer n (the number of trapezoids to use), and a function
F to be integrated. As we cannot easily iterate from 1 to n and thereby sum
the areas of all the trapezoids, our ML function will use an equivalent recursive
strategy. The function trap computes the area of the first trapezoid only. It
then computes a new lower limit that is one trapezoid's width to the right of
the old lower limit a, and decreases n by 1. A recursive call with the new values
of a and n sums the areas of the remaining trapezoids.
The Order of a Function
Technically, the order of a function is defined by the following induction.
BASIS: A function is "first-order" if its arguments and result are all "data,"
that is, not functions.
INDUCTION: A function is of order one more than the largest of the
orders of its arguments and result. Note that there are some functions,
like the identity function, that do not get an order by this induction, and
are therefore of "infinite order."
    fun trap(a,b,n,F) =
(1)     if n<=0 orelse b-a<=0.0 then 0.0
        else
            let
(2)             val delta = (b-a)/real(n)
            in
(3)             delta * (F(a)+F(a+delta))/2.0 +
(4)                 trap(a+delta,b,n-1,F)
            end;
val trap = fn : real * real * int * (real -> real) -> real
Figure 5.14: Function implementing the trapezoidal rule
In line (1) of Fig. 5.14 we test for the basis case, where n = 0 and b = a.
Then the value of the integral is 0. However, at the same time we handle data
errors, where n or b - a is negative, or where one of n and b - a but not the
other is 0. These errors can only occur on the initial call to trap, and we really
should catch them with exceptions, rather than by returning 0 as we do.
If we are not at the basis case, then in line (2) we compute the local variable
delta to be 1/nth of the width of the range of integration; that is, delta is the
width of each trapezoid.[5] Lines (3) and (4) evaluate the integral. In line (3)
we compute the area of the first trapezoid, multiplying delta by the sum of
the heights of the sides, F(a) and F(a+delta), and then dividing by 2.
[5] The value of delta should be the same at each call to trap, and we leave it as an exercise
to rewrite the function so it evaluates delta only once. However, reevaluating delta for
each trapezoid does have the advantage of preventing the accumulation of roundoff errors in
situations where the value of delta cannot be represented precisely in the computer.
Simulating Iterations by Recursions
The reader should examine the “trick” of Fig. 5.14 carefully, because it is a
common way to convert from a loop in an iterative language to a recursive
function in a functional language. The general idea is to write a function
that, as a basis case, tests if the loop is done. For the induction it does
one iteration of the loop and then calls itself recursively to do whatever
iterations of the loop remain. The arguments of the function are the loop
index and any other variables that are needed in the loop. The hard part
in designing the function often is deciding how to express the result of the
loop as a value to be returned by the function.
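To make the loop-to-recursion idea in the box concrete, here is a small sketch that is separate from trap. It stands in for a loop that adds F(i) for i running from a lower bound to an upper bound; the name sumLoop and its parameter names are assumptions made for this illustration.

```sml
(* Iterative idea being simulated:
     sum := 0; for i := lo to hi do sum := sum + F(i)
   The loop index i and the bound hi become arguments of the function. *)
fun sumLoop(i, hi, F) =
    if i > hi then 0                      (* basis: the loop is done *)
    else F(i) + sumLoop(i+1, hi, F);      (* one iteration, then the rest *)

(* Sum of the squares of 1 through 4, that is, 1 + 4 + 9 + 16. *)
val s = sumLoop(1, 4, fn i => i*i);
```

Here the result of the loop is expressed simply as the sum returned by each call, which is the design decision the box describes as the hard part.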
Line (4) adds to this area the result of the recursive call on the range that
excludes the first trapezoid.
Note the type of the function trap as described in the ML response. It is
a function that takes a 4-tuple for an argument; the four components (a, b,
n, and F) are respectively of types real, real, int, and real -> real. The
result of the function trap is a real.
As an example of a use of the function trap, let us define a suitable function
F, such as
fun square(x:real) = x*x;
val square = fn : real -> real
Then, we can call, for instance,
trap(0.0, 1.0, 8, square);
val it = 0.3359375 : real
This call asks for the integral $\int_0^1 x^2\,dx$, whose exact value is 1/3. We divide the
range into 8 parts, and the result is high by less than 1%.
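The effect of the number of trapezoids can be checked with a short experiment. The sketch below repeats trap from Fig. 5.14 without the line labels and compares the error for 8 and 64 trapezoids; the names exact, err8, and err64 are assumptions introduced here.

```sml
(* trap as in Fig. 5.14, repeated so this example is self-contained. *)
fun trap(a,b,n,F) =
    if n<=0 orelse b-a<=0.0 then 0.0
    else
        let
            val delta = (b-a)/real(n)
        in
            delta * (F(a)+F(a+delta))/2.0 + trap(a+delta,b,n-1,F)
        end;

fun square(x:real) = x*x;

val exact = 1.0/3.0;                                   (* true value of the integral *)
val err8  = abs(trap(0.0, 1.0, 8,  square) - exact);   (* about 0.0026 *)
val err64 = abs(trap(0.0, 1.0, 64, square) - exact);   (* much smaller *)
```

For a smooth integrand such as this one, the error of the trapezoidal rule is expected to shrink roughly with the square of the trapezoid width, so eight times as many trapezoids should give an error about 64 times smaller.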
5.4.1 Some Common Higher-Order Functions
We shall now introduce three useful higher-order functions. The first two are
actually present as ML built-in functions, although they appear in a form some-
what different from the form we use here.
1. The map function takes a function F and a list $[a_1, a_2, \ldots, a_n]$, and produces
the list $[F(a_1), F(a_2), \ldots, F(a_n)]$. That is, it applies F to each
element of the list and returns the list of resulting values. This function
is known to Lisp users as mapcar. Since there is in ML a function map
that is similar in spirit but different in type, we shall use simpleMap for
our initial version of map. In Section 5.6.3 we cover the ML version of
map.
2. The reduce function takes a function F with two arguments and a list
$[a_1, a_2, \ldots, a_n]$. The function F normally is assumed to compute some
associative operation such as addition, that is, $F(x,y) = x + y$. The
result of reduce on F and $[a_1, a_2, \ldots, a_n]$ is
$$F(a_1, F(a_2, F(\cdots, F(a_{n-1}, a_n) \cdots)))$$
Thinking of F as an associative binary infix operator, we have the simpler
expression $a_1 \, F \, a_2 \, F \cdots F \, a_n$. For example, if F is the sum function, then
reduce(F, $[a_1, a_2, \ldots, a_n]$) is $a_1 + a_2 + \cdots + a_n$, the sum of the elements
on the list. ML has two functions foldl and foldr (fold from the left
or right) that are similar in spirit to the function reduce that we shall
design. The latter functions are covered in Section 5.6.4.
3. The function filter takes a predicate P, that is, a function whose value
is boolean, and a list $[a_1, a_2, \ldots, a_n]$. The result is the list of all those
elements on the given list that satisfy the predicate P.
5.4.2 A Simple Map Function
We can define a simple version of the map function as follows.
fun simpleMap(F,nil) = nil
  | simpleMap(F,x::xs) = F(x)::simpleMap(F,xs);
val simpleMap = fn : ('a -> 'b) * 'a list -> 'b list
In the first line we see that if the list is empty, then there are no elements to
apply the function F to, so simpleMap returns the empty list. The second line
covers the inductive case, where we apply F to the head of the list and then
recursively apply simpleMap to the same function F and the tail of the list.
The result is assembled by taking F(x), that is, F applied to the head element,
and following it by the result of applying F to all the other elements of the list.
Notice the type of simpleMap. It has two parameters, the first of which is
a function F from some type 'a to a possibly different type 'b. The second
parameter is a list of elements of the type 'a, which is the type F expects for its
argument. The result of simpleMap is of type 'b list, that is, a list of elements
of the range type of function F. We see that simpleMap is as polymorphic as it
can be; it only requires that the list elements be of the type that the function
F expects.
Example 5.21: Let us define the function square to produce the square of a
real, as
fun square(x:real) = x*x;
val square = fn : real -> real
Then we may apply square to each element of a list of reals by using simpleMap
as follows.
simpleMap(square, [1.0, 2.0, 3.0]);
val it = [1.0,4.0,9.0] : real list
That is, simpleMap applies square to each of 1.0, 2.0, and 3.0 in turn and
produces the list of their squares. □
Example 5.22: The function to which simpleMap is applied need not be some-
thing we write; it could be a suitable built-in function. For instance, ~, the unary
minus operator, has the form we expect for a function used as an argument of
simpleMap. We can write
simpleMap(~, [1,2,3]);
val it = [~1,~2,~3] : int list
This application of simpleMap has negated each element of the given list. □
If we want to apply simpleMap to a function that we must define, we need
not write the definition of that function separately and give it a name as we did
for square in Example 5.21. Just as we may write the value of an integer, say 23,
without giving it a name, we may express the value of a function anonymously.
We saw how to do so in Section 5.1.2. We write the function as the keyword
fn (not to be confused with fun) followed by a match. Recall that a match is
written as one or more groups consisting of a pattern, the symbol => (not to
be confused with ->), and an expression that is the value of the function for
inputs that match the pattern. If there is more than one group, the groups are
separated by vertical bars.
Example 5.23: We can apply square to each member of a real list without
actually defining square to be the name of the function, as follows.
simpleMap(fn x => x*x, [1.0, 2.0, 3.0]);
val it = [1.0,4.0,9.0] : real list
This anonymous function uses only one pattern, x, and the result of the function
for this pattern is $x^2$.
Notice that in the definition of the squaring function as a value,
fn x => x*x
we did not have to declare x to be real. ML was able to figure out that * represents
real multiplication from the fact that the second parameter of simpleMap
is a real list.
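The match after fn may also have several groups, or a pattern that takes its argument apart. The following sketch illustrates both with simpleMap; the strings and the names flags and sums are assumptions chosen only for this illustration.

```sml
(* simpleMap as defined in Section 5.4.2, repeated for self-containment. *)
fun simpleMap(F,nil) = nil
  | simpleMap(F,x::xs) = F(x)::simpleMap(F,xs);

(* An anonymous function whose match has two groups separated by a bar. *)
val flags = simpleMap(fn 0 => "zero" | _ => "nonzero", [0, 7, 0]);

(* An anonymous function whose single pattern is a pair. *)
val sums = simpleMap(fn (x,y) => x+y, [(1,2), (3,4)]);
```

In the second call, ML can resolve + to integer addition because the list elements are pairs of integers, in the same way that * was resolved to real multiplication in Example 5.23.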
5.4.3 The Function reduce
Another useful higher-order function is one we shall call reduce. It is related to,
but different from, functions foldl and foldr in the ML top-level environment;
we discuss the latter functions in Section 5.6.4. Our function reduce takes a
function F of two arguments and a nonempty list $[a_1, a_2, \ldots, a_n]$. A recursive
definition of the result of reducing the list by function F is:
BASIS: If n = 1, that is, the list is a single element a, then the result is a.
INDUCTION: If n > 1, then let b be the result of reducing the tail of the list,
which is $[a_2, a_3, \ldots, a_n]$, by function F. Then the reduction of the whole list
$[a_1, a_2, \ldots, a_n]$ by F is $F(a_1, b)$.
Example 5.24: Usually the function F defines an associative operator, in
which case it does not matter in what order we group the list elements. For
instance:
1. The reduction of a list with F equal to the addition function produces
the sum of the elements of the list.
2. The reduction by the product function produces the product of the ele-
ments of the list.
3. The reduction by the logical AND operator produces the value true if all
the elements of a boolean list are true and produces false otherwise.
4. The reduction by the function max (larger of two elements) produces the
largest element on the list.
    exception EmptyList;
exception EmptyList
(1) fun reduce(F,nil) = raise EmptyList
(2)   | reduce(F,[a]) = a
(3)   | reduce(F,x::xs) = F(x, reduce(F,xs));
val reduce = fn : ('a * 'a -> 'a) * 'a list -> 'a
Figure 5.15: The function reduce
An implementation of the function reduce is shown in Fig. 5.15. Since
reduce does not make sense on the empty list, we create an exception EmptyList
and raise it at line (1) if the second argument of reduce is nil. Next, line (2)
says that if the list has a single element a, then that element is the value of
reduce regardless of the function used.
By intercepting lists of length 1 at line (2), we avoid ever calling reduce
recursively on an empty list, which would cause an error.
Finally, line (3) implements the inductive step. We reduce the tail of the
given list, using the function F, and then apply F to the head and the result
of this reduction.
Notice the type of reduce. It is a function that takes as first parameter a
function F, both of whose parameters are of the same type 'a and whose result
is also of this type. The second parameter of reduce is a list of elements of
type 'a, and the result of reduce is also of type 'a.
These equalities of type are inferred by ML as follows. F is used with the
result of reduce as its second argument in line (3), so the result of reduce and
the second parameter of F must be of the same type, say 'a. In line (2) we see
that the elements of the list can be the result of reduce, which says that the
element type is also 'a. We see in line (3) that elements of the list can also be
the first argument of F, which tells us that the first parameter of F is also of
type 'a. Finally, from line (3) we see that the result types of reduce and F are
the same, so F produces a value of type 'a as well.
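The four reductions listed in Example 5.24 can be tried directly. The sketch below repeats reduce from Fig. 5.15 and supplies each operator as an anonymous function; the result names and the sample lists are assumptions made for this illustration.

```sml
exception EmptyList;

(* reduce as in Fig. 5.15, repeated for self-containment. *)
fun reduce(F,nil) = raise EmptyList
  | reduce(F,[a]) = a
  | reduce(F,x::xs) = F(x, reduce(F,xs));

val total   = reduce(fn (x,y) => x+y, [1,2,3,4]);                  (* sum *)
val product = reduce(fn (x,y) => x*y, [1,2,3,4]);                  (* product *)
val allTrue = reduce(fn (x,y) => x andalso y, [true,false,true]);  (* logical AND *)
val largest = reduce(fn (x,y) => if x>y then x else y, [3,9,2]);   (* maximum *)

(* On the empty list the exception is raised; here it is handled by yielding 0. *)
val empty = reduce(fn (x,y) => x+y, nil) handle EmptyList => 0;
```

In each call the type variable 'a of reduce is instantiated differently, to int in three cases and to bool in the third, which is the polymorphism the inferred type promises.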
Example 5.25: This example illustrates the use of function reduce. It also
uses simpleMap and in general illustrates how one can program using higher-
order functions effectively.
The variance of a list of reals $[a_1, a_2, \ldots, a_n]$ is the average of the squares
minus the square of the average. More precisely, one formula for the variance
is
$$\Bigl(\sum_{i=1}^{n} a_i^2\Bigr)/n - \Bigl(\Bigl(\sum_{i=1}^{n} a_i\Bigr)/n\Bigr)^2 \qquad (5.1)$$
The variance is a measure of the amount by which the elements of a list differ
from their average value. In fact, an equivalent formula for the variance is the
average of the squares of the differences between each element and the average
element. In other words, the variance may also be written $\bigl(\sum_{i=1}^{n} (a_i - \bar{a})^2\bigr)/n$,
where $\bar{a}$ is the average element, or $\bar{a} = \bigl(\sum_{i=1}^{n} a_i\bigr)/n$.
The square root of the variance, called the standard deviation, represents
the amount by which a typical element differs from the average. For example,
if all the elements are the same then the variance and standard deviation are
0. If half the elements are 10.0 while the other half are 20.0, then each element
differs from the average (15.0) by 5.0, so the variance is 25.0 and the standard
deviation is 5.0.
We can evaluate Formula (5.1) for the variance using the higher-order func-
tions simpleMap and reduce as follows. Suppose we have function square
to take the square of a real and function plus to sum two reals. We can
obtain the sum of the squares of the elements of a list L by the expression
reduce(plus, simpleMap(square,L)). That is, simpleMap(square,L) produces
the list of squares, and reduce with first argument plus sums these
squares. We divide this result by n, the length of the list L, to get the average
square. Then we can get the average by reduce(plus,L)/n, and we can apply
square to get the square of the average. The necessary functions, assuming
that reduce and simpleMap are as previously defined, are shown in Fig. 5.16.
(1) fun square(x:real) = x*x;
val square = fn : real -> real
(2) fun plus(x:real,y) = x+y;
val plus = fn : real * real -> real
    fun variance(L) =
        let
(3)         val n = real(length(L))
        in
(4)         reduce(plus,simpleMap(square,L))/n -
(5)             square(reduce(plus,L)/n)
        end;
val variance = fn : real list -> real
(6) variance([1.0, 2.0, 5.0, 8.0]);
val it = 7.5 : real
Figure 5.16: Computing the variance using higher-order functions
In lines (1) and (2) of Fig. 5.16 we define the functions square and plus.
Then we see the definition of function variance. At line (3) it computes n, the
list length, which is a common subexpression, as a real number. Recall that
ML provides a built-in function length at the top level, to compute the length
of a list (as an integer), as well as a function real to convert an integer to an
equivalent real number. Lines (4) and (5) are Formula (5.1).
Finally, in line (6) we see a use of the function variance on the list of
elements [1,2,5,8]. Here, n = 4. The sum of the squares is 1+4+25+64 = 94,
so the average square is 94/4 = 23.5. The average element is 4, so the square
of the average is 16. Since 23.5 - 16 = 7.5, the variance is 7.5, as we see in the
ML response.
Another way to compute the variance is to take the average of the squares
of the differences between the elements and the average. In this case, we would
average $(1-4)^2$, $(2-4)^2$, $(5-4)^2$, and $(8-4)^2$, or (9+4+1+16)/4 = 7.5. □
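This second formulation can also be written with simpleMap and reduce. The following is a sketch only; the names variance2, avg, and sqDiff are assumptions introduced here, and the definitions of simpleMap, reduce, and plus are repeated from earlier in the section so that the example stands alone.

```sml
exception EmptyList;

fun simpleMap(F,nil) = nil
  | simpleMap(F,x::xs) = F(x)::simpleMap(F,xs);

fun reduce(F,nil) = raise EmptyList
  | reduce(F,[a]) = a
  | reduce(F,x::xs) = F(x, reduce(F,xs));

fun plus(x:real,y) = x+y;

(* Hypothetical alternative to variance of Fig. 5.16: the average of the
   squared differences between each element and the average element. *)
fun variance2(L) =
    let
        val n = real(length(L))
        val avg = reduce(plus,L)/n
        fun sqDiff(x) = (x-avg)*(x-avg)
    in
        reduce(plus, simpleMap(sqDiff,L))/n
    end;

val v = variance2([1.0, 2.0, 5.0, 8.0]);
```

Note that sqDiff is a local function that refers to avg, so it can be passed to simpleMap without avg being an explicit parameter. For the list of Fig. 5.16 this computation follows the arithmetic worked out above.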
5.4.4 Converting Infix Operators to Function Names
We might expect that we could use the operator + in place of the function
plus of Example 5.25. For example, can we write reduce(+,L) in line (5) of
Fig. 5.16? Should we do so, we get the error message:
Error: expression or pattern begins with infix identifier "+"
The problem is that ML, like most languages, defines the usual arithmetic
operators to be infix. That is, they appear between their operands. However,
the function F in the definition of reduce is expected, as are all functions, to
precede its operands.
To allow an infix operator to be used as the name of a function, we precede
it by the keyword op. For example, we may write
op + (2,3);
val it = 5 : int
In effect, op + is the same function as the function plus defined in Fig. 5.16,
except that the latter is restricted to reals and the former needs to have its
parameter type determined. As another example, line (5) of Fig. 5.16 can be
written
square(reduce(op +, L)/n)
with no change in the behavior of the program.
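The same device works for other infix operators. The sketch below repeats reduce from Fig. 5.15 and passes several operators to it with op; the result names and sample lists are assumptions chosen for illustration.

```sml
exception EmptyList;

fun reduce(F,nil) = raise EmptyList
  | reduce(F,[a]) = a
  | reduce(F,x::xs) = F(x, reduce(F,xs));

val sumInts  = reduce(op +, [1,2,3,4]);       (* + is resolved to integer addition *)
val sumReals = reduce(op +, [1.5,2.5]);       (* + is resolved to real addition *)
val prod     = reduce(op *, [2,3,4]);         (* integer multiplication *)
val joined   = reduce(op ^, ["a","b","c"]);   (* string concatenation *)
```

In each call the type of the list elements determines which meaning of the overloaded operator is intended, which is the sense in which op + "needs to have its parameter type determined."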
5.4.5 The Function Filter
Another useful higher-order function is filter, which appears in Fig. 5.17. This
function takes a predicate P and a list L, and produces the list of elements of
L that satisfy the predicate P. In line (1) we see the basis case: if the list L is
empty then filter produces the empty list regardless of P. Lines (2) through
(4) cover the inductive case. We test at line (3) whether P(x) is true for the
head element x of the list L. If so, the resulting list is x followed by whatever
we get by filtering the tail of the list with predicate P. On line (4) we see that if
P(x) is false then x is not selected and the result is whatever we get by filtering
the tail.
Notice the type of filter. It has two parameters, the first of which is a
function of type 'a -> bool. This type indicates that the argument corresponding
to the first parameter of filter can be a predicate with any domain
type. The second argument is a list of elements of the type 'a to which the
predicate applies. The result of filter is another list of elements of this type.
In line (5) of Fig. 5.17 we see an example of the use of filter. The first
argument is a description of the boolean-valued function that is true when its
argument is greater than 10. We use the keyword fn and a one-pattern match
to describe this function. The second argument is a list of integers, and the
result is those integers greater than 10, in the order of their occurrence on the
list.
(1) fun filter(P,nil) = nil
(2)   | filter(P,x::xs) =
(3)         if P(x) then x::filter(P,xs)
(4)         else filter(P,xs);
val filter = fn : ('a -> bool) * 'a list -> 'a list
(5) filter(fn(x) => x>10, [1,10,23,5,16]);
val it = [23,16] : int list
Figure 5.17: The function filter
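Since the predicate may have any domain type, filter is not limited to lists of integers. The following sketch uses a named predicate and a predicate on pairs; the names isEven, evens, and ascending are assumptions introduced for this illustration.

```sml
(* filter as in Fig. 5.17, repeated without line labels. *)
fun filter(P,nil) = nil
  | filter(P,x::xs) =
        if P(x) then x::filter(P,xs)
        else filter(P,xs);

(* A named predicate may be passed just as an anonymous one is. *)
fun isEven(n) = n mod 2 = 0;
val evens = filter(isEven, [1,2,3,4,5,6]);

(* Here 'a is a pair type: keep the pairs whose first component is smaller. *)
val ascending = filter(fn (x,y) => x < y, [(1,2), (5,3), (0,0), (2,9)]);
```

As with line (5) of Fig. 5.17, the selected elements keep the order in which they occur on the original list.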
5.4.6 Exercises for Section 5.4
* Exercise 5.4.1: Write a function tabulate that takes as arguments an initial
value a, an increment $\delta$, a number of points n, and a function F from reals to
reals. Print a table with columns corresponding to values x and F(x), where
$x = a, a+\delta, a+2\delta, \ldots, a+(n-1)\delta$.
Exercise 5.4.2: Simpson's rule is a more accurate way to integrate functions
numerically. If we evaluate a function F at 2n + 1 evenly spaced points,
$$a,\; a+\delta,\; a+2\delta,\; \ldots,\; a+2n\delta$$
then we may estimate the integral $\int_a^{a+2n\delta} F(x)\,dx$ by
$$\delta\bigl(F(a) + 4F(a+\delta) + 2F(a+2\delta) + 4F(a+3\delta) + 2F(a+4\delta) + \cdots + 2F(a+(2n-2)\delta) + 4F(a+(2n-1)\delta) + F(a+2n\delta)\bigr)/3$$
That is, the even-position terms all have a coefficient of 4, while the odd position
terms have coefficient 2, except for the first and last, which have coefficient 1.
Write a function simpson that takes starting and ending points a and b, an
integer n (such that the evaluation is to use 2n + 1 points as above), and a
function F to integrate by Simpson's rule. Try out your function on polynomials
$x$, $x^2$, and so on. What is the smallest integer i such that Simpson's rule fails
to get the exact integral of $x^i$ with a = 0.0, b = 1.0, and n = 1?
Exercise 5.4.3: When implementing either the trapezoidal rule or Simpson's
rule, it is possible to compute $\delta$ once and for all, rather than at each recursive
call (although as explained in the text, this strategy may cause roundoff errors
to accumulate). Reimplement
* a) The function trap of Fig. 5.14.
b) Your function simpson from Exercise 5.4.2.
in such a way that $\delta$ is computed once.
Exercise 5.4.4: Improve the function trap of Fig. 5.14 by printing an appro-
priate error message and then raising an exception when the input is bad (as
detected by line (1) of Fig. 5.14).
Exercise 5.4.5: Use the function simpleMap(F,L) to perform the following
operations on a list L.
* a) Replace every negative element of a list of reals by 0, leaving nonnegative
elements as they are.
b) Add 1 to every element of an integer list.
* c) Change every lower-case letter in a list of characters to the corresponding
upper-case letter. Do not assume that only lower-case letters appear in
the list.
! d) Truncate each string in a list of strings so it is no more than 5 characters
long. That is, delete the sixth and subsequent characters while leaving
shorter strings alone.
Exercise 5.4.6: Use the function reduce to perform the following operations
on a list L.
* a) Find the maximum of a list of reals.
b) Find the minimum of a list of reals.
* c) Concatenate a list of characters (i.e., the function implode).
d) Find the logical OR of a list of booleans.
Exercise 5.4.7: Use the function filter to perform the following operations
on a list L.
* a) Find those elements of a list of reals that are greater than 0.
b) Find those elements of a list of reals that are between 1 and 2.
*! c) Find those elements of a list of strings that begin with the character #"
! d) Find those elements of a list of strings that are at most 3 characters long.
! Exercise 5.4.8: What is the effect on a list L of reduce(op -, L)?
*! Exercise 5.4.9: Write a function lreduce that takes a two-parameter function
F and a list $[a_1, a_2, \ldots, a_n]$ and produces
$$F(\cdots F(F(a_1, a_2), a_3) \cdots, a_n)$$
That is, this function is like reduce, but it groups the elements of the list from
the beginning of the list instead of the end.
Exercise 5.4.10: What is the effect of lreduce(op -, L)?
* Exercise 5.4.11: Another version of reduce takes a basis constant g of some
type 'b, a function F of type 'a * 'b -> 'b, and a list of elements of type 'a.
The result applied to a list $[a_1, a_2, \ldots, a_n]$ is
$$F(a_1, \cdots F(a_{n-1}, F(a_n, g)) \cdots)$$
Write a function reduceB that performs this operation.
* Exercise 5.4.12: Use the function reduceB from Exercise 5.4.11 to
! a) Compute the length of a list.
!! b) Compute the list of suffixes of a list. For example, given the list [1,2,3],
produce [[1,2,3], [2,3], [3], nil].
*! Exercise 5.4.13: Another use of polymorphic functions is to allow late binding
of overloaded symbols such as + or *. That is, instead of using these symbols in
a function f, we invent names for them such as plus and times, and we let these
names be parameters of the function f. Then, we can call f with appropriate
definitions for the parameters, thus binding the names to the correct meanings
as late as possible. As an exercise:
a) Write a function eval that takes as parameters functions representing
scalar addition and multiplication, as well as taking a polynomial (rep-
resented as a list in the manner of Section 3.6) and a value at which to
evaluate the polynomial.
b) Show how to call your function from (a) to evaluate the integer polynomial
$4x^3 + 3x^2 + 2x + 1$ at the point x = 5.
5.5 Curried Functions
Until now, we have considered only functions that have a single parameter,
although that parameter often is a tuple written with parentheses and commas.
Thus, we have written many ML functions that looked like multiparameter
functions of languages like C or Pascal. Technically, these ML functions really
have a single parameter, of a product type, but in practice there is little harm
in pretending they are ordinary multiparameter functions.
However, ML provides a more general way to connect a function name to
its parameters or arguments. It is sometimes useful to express multiparameter
functions in Curried form,[6] where the function name is followed by the list of its
parameters, with no parentheses or commas. The following example illustrates
the difference between the Curried and uncurried form of functions. We shall
be introduced to the important advantage of the Curried form when we discuss
partially instantiated functions in Section 5.5.1.
Example 5.26: Let us write a two-parameter function that computes $x^y$. In
lines (1) and (2) of Fig. 5.18 we see such a function exponent1 in the style we
have been using. This function takes a parameter that is a pair consisting of a
real x and an integer y, and returns $x^y$. It is not carefully designed because it
loops forever on a negative integer y.
(1) fun exponent1(x,0) = 1.0
(2)   | exponent1(x,y) = x * exponent1(x,y-1);
val exponent1 = fn : real * int -> real
(3) fun exponent2 x 0 = 1.0
(4)   | exponent2 x y = x * exponent2 x (y-1);
val exponent2 = fn : real -> int -> real
(5) exponent1(3.0,4);
val it = 81.0 : real
(6) exponent2 3.0 4;
val it = 81.0 : real
Figure 5.18: Two styles for exponentiation functions
The Curried function exponent2 in lines (3) and (4) of Fig. 5.18 does exactly
the same computation as the uncurried function exponent1. The parameters of
exponent2 are not surrounded by parentheses or separated by commas, either
in the definition on lines (3) and (4) or in the recursive use on line (4).
Lines (5) and (6) show appropriate calls to the two functions. Each computes
$3^4 = 81$. □
5.5.1 Partially Instantiated Functions
Curried functions are useful because they allow us to construct new functions
by applying the function to arguments for some, but not all, of its parameters.
[6] Named after the mathematician Haskell Curry, who investigated this form of function
definition.
Precedence of Function Application
The parentheses around y-1 on line (4) are necessary for ML to group
arguments properly. Without parentheses around y-1, the second argument
in the recursive call to exponent2 will be regarded as y. Constant 1
will be subtracted from the result of the call, leading to a type error. The
reason for this interpretation is that juxtaposition of expressions, which is
function application in ML, is an operator of higher precedence than the
arithmetic operators.
To begin our exploration of this matter, notice the subtle difference between
the responses to the two functions in Fig. 5.18. ML finds the type of exponent1
to be a function that takes a pair of type real * int as parameter and returns
a real. However, the type of exponent2 is given as real -> int -> real.
Remembering that the -> operator associates from the right, we interpret this
type as real -> (int -> real), that is, a function taking a real as argument
and returning a function from integers to reals.
This type suggests how the function exponent2 is interpreted. In the call of
line (6) in Fig. 5.18, the first argument, 3.0, is given to the function exponent2,
resulting in a new function g. This function, of type int -> real, takes an
exponent y as its argument and produces the result $g(y) = 3^y$. The function g
is a value in its own right and can, under the right circumstances, be isolated
and bound as the value of an identifier.
The process of forming new functions by binding one or more of the parameters
of an existing function is called partial instantiation. In the general
mathematical setting, we can take a function $f$ of $n$ arguments, say $f(x_1, x_2, \ldots, x_n)$.
We bind the first $k$ of those arguments to constants $a_1, a_2, \ldots, a_k$ to form a new
function, which we may call $f_{a_1,a_2,\ldots,a_k}(x_{k+1}, x_{k+2}, \ldots, x_n)$. The definition
of function $f_{a_1,a_2,\ldots,a_k}$ is as expected:
$$f_{a_1,a_2,\ldots,a_k}(x_{k+1}, x_{k+2}, \ldots, x_n) = f(a_1, a_2, \ldots, a_k, x_{k+1}, x_{k+2}, \ldots, x_n)$$
In ML, Curried functions can be partially instantiated by applying them to
values, one for each of the first k parameters. It is only possible to instantiate
the parameters from the left, not in any order.
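To make the left-to-right rule concrete, here is a small sketch with a three-parameter Curried function. The function linear and the names bound from it are hypothetical, chosen only for illustration.

```sml
(* Hypothetical Curried function of three parameters: a*x + b. *)
fun linear (a:real) b x = a * x + b;

(* Bind the first parameter only; the result has type real -> real -> real. *)
val double = linear 2.0;

(* Bind the first two parameters; the result has type real -> real. *)
val doublePlus1 = linear 2.0 1.0;

val r = doublePlus1 5.0;    (* computes 2.0 * 5.0 + 1.0 *)
```

There is no way to write an application of linear that supplies b or x while leaving a open; for that we need a new definition, as discussed later in this section.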
Example 5.27: Having made the definition of exponent2 in Example 5.26,
we can proceed to create a new function by instantiating its first argument. An
example is
    val g = exponent2 3.0;
    val g = fn : int -> real
Now g is a function that takes an integer y as argument and returns $3^y$.
Figure 5.19 suggests what has happened. Identifier g has been bound to a
value that is a notation representing the function exponent2 applied to 3.0.
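[Figure: identifier g is bound to a notation reading "Use exponent2 with first argument 3.0"; that notation refers in turn to the code for exponent2, the value to which identifier exponent2 is bound.]
Figure 5.19: Partially instantiating a function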
We can use g like any other function if we provide its proper argument. For
example,
    g 4;
    val it = 81.0 : real
applies g to the integer 4, producing $3^4$, or 81. □
Here are a number of points about partial instantiation and the Curried
form of functions.
• Note that we are not restricted to the no-parentheses form. We could have
written g(4) instead of g 4, and we could have defined new functions from
exponent2 with parentheses. For instance,
    val h = exponent2(10.0);
makes h a function that computes powers of 10.
• As discussed in Section 3.1.5, the value of the function g does not change
if we define a new function called exponent2, because the definition of g
refers to the specific value shown in Fig. 5.19.
• We can partially instantiate a function by binding arguments other than
in left-to-right order of appearance, but to do so we need to define a new
function using fun. For example, we could bind the second argument of
exponent2 to the value 3, producing a function that cubes a real number,
by:
fun cube x = exponent2 x 3;
This approach to partially instantiating functions works even if the func-
tion definition was not written in Curried form.
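As a sketch of that last remark, the same cubing function can be built from the uncurried exponent1 of Fig. 5.18. The names cube1 and cube2 are hypothetical.

```sml
(* The uncurried function of Fig. 5.18, repeated so the sketch stands alone. *)
fun exponent1(x,0) = 1.0
  | exponent1(x,y) = x * exponent1(x,y-1);

(* Bind the second component of the pair to 3 by defining a new function. *)
fun cube1 x = exponent1(x,3);

(* The same idea written with an anonymous function and val. *)
val cube2 = fn x => exponent1(x,3);

val c1 = cube1 2.0;
val c2 = cube2 2.0;
```

Both cube1 and cube2 have type real -> real.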
5.5.2 The ML Style of Function Application
As we learned in Example 5.26, parentheses around arguments of ML functions
are optional in many cases. The only time they are essential is when the argu-
ment has components that are kept together with an operator whose precedence
is below that of function application. Unfortunately, function application is al-
most the highest-precedence operator of all. Thus, for instance, the following
give us errors.
1. fun f c:char = 1.0 is grouped (f c):char = 1.0. To ML, it appears
as if we are trying to say that the result of f is a character, when in fact
it is defined to be a real. Thus, we need to write fun f(c:char) = 1.0.
2. fun f x::xs = nil is grouped (f x)::xs = nil, which is not likely to
be what we intended. We need to write fun f(x::xs) = nil.
3. print Int.toString 123 is grouped (print Int.toString) 123 and
leads to a type error since Int.toString (the function that converts
integers to strings from the structure Int) is not itself a string, which
the function print requires. Thus, we need one pair of parentheses:
print (Int.toString 123).
On the other hand, there are many places where the parentheses around
arguments are superfluous, and we shall start omitting parentheses in safe sit-
uations. Here are some of the places where we can avoid parentheses.
1. f [1,2,3] or even f[1,2,3] means the same as f([1,2,3]). In general,
if the argument of a function is already bracketed so it cannot be split
apart by the function application, then no parentheses are needed.
2. chr 100 means the same as chr (100). In general, operators are functions,
and the same rules as apply to functions apply to operators.
3. open TextIO is not only permitted, it is necessary. The keyword open is
not a function, and we cannot put parentheses around the structure name
when we open a structure like TextIO.
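The two lists above can be summarized in a short sketch that type-checks as written. The functions f and len are hypothetical; the comments show the ill-grouped forms they replace.

```sml
fun f (c:char) = 1.0;              (* not: fun f c:char = 1.0 *)

fun len (x::xs) = 1 + len xs       (* not: fun len x::xs = ... *)
  | len nil = 0;

val s = Int.toString 123;          (* a single constant needs no parentheses *)

(* Parentheses are required here so that print receives a string. *)
val () = print (Int.toString 123 ^ "\n");

val n = len [1,2,3];               (* the brackets keep the argument together *)
val d = chr 100;                   (* means the same as chr(100) *)
```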
5.5.3 Exercises for Section 5.5
Exercise 5.5.1: Write, in Curried form, a function applyList that takes a
list of functions and a value and applies each function to the value, producing
a list of the results.
! Exercise 5.5.2: Write, in Curried form, a function makeFnList that takes a
function F whose domain type is D and whose range type R is a function type
$T_1 \to T_2$. The result of makeFnList is a function G that takes a list of elements
$[d_1, d_2, \ldots, d_n]$ of type D and produces a list of functions $[f_1, f_2, \ldots, f_n]$ of type
R, such that $f_i = F(d_i)$.
Exercise 5.5.3: Write a function substring, either Curried or not, that takes
two parameters and tests whether the first is a substring of the other. String x is
a substring of string y if we can write y as the concatenation of strings w, x, and
z. It is permissible for any of the strings to be empty. For example, "abc" has
substrings including , and "ab". Using makeFnList of Exercise 5.5.2,
construct a function f that takes a list of strings $[s_1, s_2, \ldots, s_n]$ and produces a
list of functions $[F_1, F_2, \ldots, F_n]$, such that $F_i(x)$ tells whether $s_i$ is a substring
of x.
Exercise 5.5.4: From f of Exercise 5.5.3, create a list of functions that, re-
spectively, check whether one of the words "he", "she", "her", "his" is a
substring of a given string.
Exercise 5.5.5: Apply your list from Exercise 5.5.4 to the string "hershey",
using function applyList from Exercise 5.5.1. What is the result?
Exercise 5.5.6: Repeat Exercise 5.5.3 for subsequences in place of substrings.
String x is a subsequence of string y if x is formed by striking out zero or
more positions of y. For example, "ac" is a subsequence of "abc" but is not a
substring. Then, as in Exercise 5.5.4, create a list of functions that test whether
the following strings are subsequences of a given string:
    ["ear", "part", "trap", "seat"]
Finally, apply your list of functions to the string "separate".
Exercise 5.5.7: It is actually quite easy to convert an n-parameter function,
for fixed n, from Curried to uncurried form. Write the following higher-order
functions that perform the translations.
* a) Given a function F that takes one parameter whose type is a product type
with n components, the function curry applied to F produces a function
G that takes n arguments in Curried form. $G\ x_1\ x_2 \cdots x_n$ produces the
same value as $F(x_1, x_2, \ldots, x_n)$.
b) Given a Curried function F that takes n parameters, the function uncurry
applied to F produces a function G that takes one parameter that is a
tuple with n components. $G(x_1, x_2, \ldots, x_n)$ produces the same value as
$F\ x_1\ x_2 \cdots x_n$.
5.6 Built-In Higher-Order Functions
ML provides certain higher-order functions in the top-level environment. In
several cases these functions are similar to functions such as simpleMap and
reduce that we studied in Sections 5.4.2 and 5.4.3, respectively. In this section,
we shall introduce these functions and their use. We shall also give definitions
of these built-in ML functions in terms of simpler constructs, both to help the
reader see the meaning of these functions and to illustrate some useful
function-writing ideas.
5.6.1 Composition of Functions
We shall now study a problem that is of intrinsic importance and that also
encourages us to view functions as values disembodied from any arguments to
which they might be applied. The composition of functions F and G is that
function C such that for any argument x, C(x) = G(F(x)).
Example 5.28: Let $F(x) = x + 3$, and let $G(y) = y^2 + 2y$. Then the composition
of F and G, or G(F(x)), is $(x+3)^2 + 2(x+3)$, or $x^2 + 8x + 15$. We get this
formula by substituting F(x) for y in the formula for G and then expanding
the formula.
We can define a higher-order function comp that takes two functions as
arguments and applies them to a third argument. The ML code is simple:
fun comp(F,G,x) = G(F(x));
    val comp = fn : ('a -> 'b) * ('b -> 'c) * 'a -> 'c
Notice the type of this function. First, recall that * takes precedence over
->, so the type expression is grouped
    (('a -> 'b) * ('b -> 'c) * 'a) -> 'c
Thus, the function comp has three parameters, the first of which (F) is a function
from some type 'a to some (possibly different) type 'b. The second parameter,
G, takes a value of the type 'b and produces a value of some (possibly different)
type 'c. The third parameter is of the type 'a to which F applies, and the
result is of the type 'c that G produces.
Example 5.29: We can use comp to compute the composition of the two
functions from Example 5.28 on a particular value of x, for instance:
    comp(fn x => x+3, fn y => y*y+2*y, 10);
    val it = 195 : int
Here we have defined the first argument of comp to be the function $x + 3$ and the
second to be the function $y^2 + 2y$. The composition of these functions, which
we discovered in Example 5.28 was the polynomial $x^2 + 8x + 15$, is then applied
to 10, and produces the correct result, $10^2 + 8 \times 10 + 15 = 195$. □
5.6.2 The ML Operator o For Composition
However, Example 5.29 is somehow unsatisfactory. It is true that we can apply
the composition of any two functions to an argument, as long as the types
match properly. Yet we cannot address the question of Example 5.28: “what
function is the composition of functions $x + 3$ and $y^2 + 2y$?” Function comp as
we defined it is relatively useless. It is a “shorthand” for G(F(x)), but it even
fails to save us keystrokes.
What we really want is a function that takes only the two functions F and G
as its arguments and produces the function C that is the composition of F and
G. For instance, in the case of Example 5.28, we would like the composition
function to return the function $x^2 + 8x + 15$ itself, rather than returning the
value of this function for a particular value of x. In ML, there is an operator o
(lower-case “Oh”) that composes functions.
Example 5.30: If we defined
    fun F x = x+3;
    fun G y = y*y + 2*y;
then the function $G(F(x)) = x^2 + 8x + 15$ can be obtained by
    val H = G o F;
which makes H the desired function. □
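As a quick check of the operator o, the sketch below applies the composed function and also composes the same two functions in the opposite order. The name K is hypothetical.

```sml
fun F x = x+3;
fun G y = y*y + 2*y;

val H = G o F;      (* H x = G(F x), the polynomial x^2 + 8x + 15 *)
val a = H 10;       (* 13*13 + 2*13 *)

(* Composing in the other order gives a different function. *)
val K = F o G;      (* K x = F(G x), the polynomial x^2 + 2x + 3 *)
val b = K 10;       (* 100 + 20 + 3 *)
```

Note that in G o F the function written on the right, F, is the one applied first.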
Let us write a function comp that behaves like the ML operator o, but our
function will not be an infix operator, as o is. A useful technique for defining
higher-order functions is to describe, within a let-expression, what the effect of
the function is supposed to be, giving the function so described a name, say f.
Then, between the in and end, place f by itself.
For the function comp, we use a let-expression to define, in terms of a parameter
x, what the function that is the composition of F and G does. The
expression that follows the keyword in is just the name of the defined function.
The proper definition appears in Fig. 5.20.
Line (2) defines a function C to have the desired behavior; it is the compo-
sition of F and G. In line (3) we see that the value of the function comp, which
is what we are defining with the let-expression, is the function C itself. The
type of comp confirms that we are on the right track. It takes two arguments:
1. A function F from some type 'a to some type 'b, and
2. A function G from type 'b to some type 'c.
The result of comp is a function of type 'a -> 'c, that is, a function from type
'a to type 'c.⁷ This function is the composition of F and G.
⁷ To parse this type expression, remember that -> groups from the right. Thus, the proper
grouping is ('a -> 'b) -> (('b -> 'c) -> ('a -> 'c)).
(1) fun comp F G =
        let
(2)         fun C x = G(F(x))
        in
(3)         C
        end;
    val comp = fn : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'c
(4) fun F x = x+3;
    val F = fn : int -> int
(5) fun G y = y*y+2*y;
    val G = fn : int -> int
(6) val H = comp F G;
    val H = fn : int -> int
(7) H 10;
    val it = 195 : int
        Figure 5.20: Computing the composition of two functions
Next, we see in Fig. 5.20 a definition of the function F to be $x + 3$ and the
function G to be $y^2 + 2y$. Then we define the function H to be comp F G, that
is, the composition of F and G. We now have a name H that we can use to
refer to the function that is the composition of F and G, that is, the function
whose expression as a polynomial is $x^2 + 8x + 15$. This function can be applied
to any integer argument; we show it in Fig. 5.20 applied to argument 10.
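To see that comp does behave like the operator o, we can compare the two on a few arguments. This is a sketch only; the names H1, H2, and same are hypothetical. Observe that the arguments appear in opposite orders: comp F G corresponds to G o F.

```sml
(* comp as in Fig. 5.20, repeated so the sketch stands alone. *)
fun comp F G =
    let
        fun C x = G(F(x))
    in
        C
    end;

fun F x = x+3;
fun G y = y*y + 2*y;

val H1 = comp F G;   (* our own composition function *)
val H2 = G o F;      (* the built-in operator *)

(* Apply both functions to the same sample arguments and compare. *)
val same = (map H1 [0, 1, 10] = map H2 [0, 1, 10]);
```

Agreement on three sample points is evidence, not a proof, that the two definitions coincide.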
5.6.3 The “Real” Version of Map
As we mentioned in Section 5.4.2, the top-level environment of ML has a func-
tion map that is similar to the function simpleMap that we defined there. How-
ever, instead of taking both the function and list as arguments, map takes only
a function F as an argument. The result of map is a function that takes a list
of elements and returns the list that is the result of applying F to each element
of the list.
Defining comp Via Currying
The same function comp that we constructed in a let-expression in Fig. 5.20
can also be written as a 3-argument Curried function:
    fun compC F G x = G(F(x));
It may seem strange that a function that takes x as an argument could be
the same as the two-argument function comp in Fig. 5.20. However, the
two functions have the same type, and they behave the same way. For
example, compC F G is surely a function of x, while comp F G seems to
have no argument. However, since the result of comp F G is a function, it
can be applied to an argument of the domain type of F, just like compC F G
can. That is, an expression like
    comp (fn x=>x+3) (fn y=>y*y+2*y) 10
makes sense, and gives the answer 195 that we saw in Fig. 5.20, even
though comp was not defined to have a third argument.
Figure 5.21 is a definition of the ML built-in function map. Of course, this
definition is unnecessary, as one can use map in programs without it. In line (1)
we see that map takes one argument, a function F. In a let-expression, we define
a function M that takes a list and applies F to each element. Line (2) says that
M applied to the empty list is the empty list. Line (3) says that for a nonempty
list, M applies F to the first element and calls itself recursively on the tail to
apply F to the remaining elements. Finally, at line (4) we say that this function
M is the result of map when it is applied to F.
(1) fun map F =
        let
(2)         fun M nil = nil
(3)           | M(x::xs) = F x :: M xs
        in
(4)         M
        end;
    val map = fn : ('a -> 'b) -> 'a list -> 'b list
        Figure 5.21: The ML function map
Notice the type of map in the ML response. Remembering that -> groups
from the right, this type is ('a -> 'b) -> ('a list -> 'b list). That is,
map is a function that takes as its argument a function F from type 'a to type
'b. Then, map returns a function M that takes a list of elements of type 'a and
produces a list of elements of type 'b.
Representing a Composition
It is useful to consider the way the values of F, G, and H are represented
by ML, if H is the composition of F and G. Since functions are defined by
code, the identifiers F and G are bound to a value that is code. However
H, being defined by a composition, is bound to a value that is a notation
saying it is the composition of F and G. The situation is suggested in
Fig. 5.22.
Note that the value for H refers to the values bound to F and G
in the current environment, not to the names F and G. The distinction
becomes important if we bind identifier F or G to a new value. Since H
is bound to the particular environment entries suggested in Fig. 5.22, and
entries in an environment do not change their values, the value of H does
not change.
[Figure: identifier H is bound to a notation reading "Composition of F and G", which refers to the code for G and the code for F, the values to which identifiers G and F are bound.]
Figure 5.22: Representing function values in an environment
Example 5.31: If map is as defined in Fig. 5.21, and square is the function
that squares reals, then map(square) is the function that takes a list of reals
and squares each one. We could create this function by
    val squareList = map square;
    val squareList = fn : real list -> real list
Then, we can use this function as
    squareList [1.0, 2.0, 3.0];
    val it = [1.0,4.0,9.0] : real list
to square each element of a particular list. □
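The same pattern works with other argument functions, including built-in and anonymous ones. In the sketch below, square is written out because its definition is not shown in this section, and the names sizes, incAll, and incremented are hypothetical.

```sml
(* A squaring function on reals, as assumed by Example 5.31. *)
fun square (x:real) = x * x;

val squareList = map square;
val squares = squareList [1.0, 2.0, 3.0];

(* map applied to the built-in function size : string -> int. *)
val sizes = map size ["he", "she", "hershey"];

(* map partially applied to an anonymous function on integers. *)
val incAll = map (fn x => x + 1);
val incremented = incAll [1, 2, 3];
```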
5.6.4 Folding Lists
ML provides the user a pair of functions called foldr and foldl. Both functions
perform a variety of the fold operation, which takes a list $L = [a_1, a_2, \ldots, a_n]$ and
treats each element $a_i$ as if it were a function; call this function $F_{a_i}$. When we
apply a folding operation to L, we construct the function that is the composition
of all the functions $F_{a_1}, F_{a_2}, \ldots, F_{a_n}$, that is, $F_{a_1} \circ F_{a_2} \circ \cdots \circ F_{a_n}$.
Example 5.32: Many operations on lists can be specified by folding, using an
appropriate definition of the functions $F_{a_i}$ and also choosing the right constant
to which the composition of functions is applied. For instance, suppose
$L = [a_1, a_2, \ldots, a_n]$ is a list of integers, and the function $F_{a_i}$ is the function that
multiplies its argument by $a_i$. Then the function $F_{a_1} \circ F_{a_2} \circ \cdots \circ F_{a_n}$ multiplies
its argument by the product of the elements of the list L, that is, $a_1 \times a_2 \times \cdots \times a_n$.
If we apply this function to 1, we can compute the product of the elements of
a list.
As another example, suppose instead that $F_{a_i}$ is the function that adds 1 to
its argument, regardless of what $a_i$ is. Then the function $F_{a_1} \circ F_{a_2} \circ \cdots \circ F_{a_n}$
adds n to its argument. In particular, applied to 0 it computes the length of
the list L. Thus, folding also lets us define the length function if we correctly
specify the functions $F_{a_i}$. □
The missing element in Example 5.32 is the method of going from $a_i$ to the
proper $F_{a_i}$. In effect, we need to reverse the effect of partial instantiation of
these functions, by writing one function $F(a, x)$ such that $F(a, x)$ equals $F_a(x)$
for all a and x.
Example 5.33: Let us consider the two problems in Example 5.32. If $F_{a_i}$ is
to multiply its (integer) argument x by $a_i$, then we want $F(a, x) = ax$, or in
ML:
    fun F(a,x) = a*x;
    val F = fn : int * int -> int
In the second problem, where we want each $F_{a_i}$ to add one to its argument, we
define $F(a, x) = x + 1$, or:
    fun F(a,x) = x+1;
    val F = fn : 'a * int -> int
in ML. □
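The connection between the two-argument function F and the family of one-argument functions can be sketched directly. Here times and addOne play the roles of the two functions F of Example 5.33, and partial is a hypothetical helper that recovers the one-argument function for a given list element.

```sml
fun times(a,x) = a*x;     (* first F of Example 5.33 *)
fun addOne(a,x) = x+1;    (* second F of Example 5.33 *)

(* Hypothetical helper: from element a, build the function x |-> times(a,x). *)
fun partial a = fn x => times(a,x);

(* Compose the functions for the list [2,3,4] by hand and apply to 1. *)
val composed = partial 2 o partial 3 o partial 4;
val p1 = composed 1;

(* The built-in foldr performs the same composition for us. *)
val p2 = foldr times 1 [2,3,4];

(* Length via folding; written with fun so that it remains polymorphic. *)
fun lengthOf L = foldr addOne 0 L;
val n = lengthOf ["a", "b", "c"];
```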
The difference between foldr and foldl is that foldr composes the functions
$F_{a_i}$ starting from the end (i.e., from the right), and foldl composes them
starting from the left. That is, foldr, given list $[a_1, a_2, \ldots, a_n]$ and initial value
b, computes
$$F_{a_1}(F_{a_2}(\cdots(F_{a_n}(b))\cdots))$$
while foldl computes
$$F_{a_n}(F_{a_{n-1}}(\cdots(F_{a_1}(b))\cdots))$$
A definition for function foldr is shown in Fig. 5.23. Again, let us emphasize
that foldr is a primitive of ML, and we do not need to define it. However,
seeing a definition in terms of more elementary operations is instructive. The
definition of foldl is similar, and we leave it as an exercise.
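The difference between the two orders shows up as soon as the folded function is not commutative. The sketch below uses only the built-in foldr and foldl; the names bound to the results are hypothetical.

```sml
(* String concatenation: foldr nests from the right, foldl from the left. *)
val right = foldr (op ^) "" ["a", "b", "c"];   (* "a" ^ ("b" ^ ("c" ^ "")) *)
val left  = foldl (op ^) "" ["a", "b", "c"];   (* "c" ^ ("b" ^ ("a" ^ "")) *)

(* Consing onto the empty list: foldr copies the list, foldl reverses it. *)
val copied   = foldr (op ::) nil [1,2,3];
val reversed = foldl (op ::) nil [1,2,3];
```

For a commutative and associative operation such as integer multiplication, the two functions give the same answer.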
(1) fun foldr F y nil = y
(2)   | foldr F y (x::xs) = F(x, foldr F y xs);
    val foldr = fn : ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
        Figure 5.23: Definition of foldr
We define foldr in Curried form, with three parameters.
1. Function F is of type 'a * 'b -> 'b. Type 'a is the type of list elements,
and 'b is both the range type of F and the type of the result of applying
foldr.
2. Value y is of type 'b. It is the initial value associated with the empty list.
3. List L is a list of elements of type 'a.
Line (1) of Fig. 5.23 covers the case of an empty list. Then we just return
the initial value y. Line (2) covers the inductive case, where the list
$L = [a_1, a_2, \ldots, a_n]$ has a head element x and a tail xs. That is, x is $a_1$ and xs is
$[a_2, \ldots, a_n]$. To compute the result we do the following:
a) Apply foldr to function F, the initial value y, and the tail of the list.
The result is computed recursively by applying the functions
$$F_{a_n}, F_{a_{n-1}}, \ldots, F_{a_2}$$
associated with the elements of the tail, in turn, to the initial value y.
b) Apply the function F to the list head x and the result of (a). This
step has the effect of composing function $F_{a_1}$ with the other functions
$F_{a_2}, \ldots, F_{a_n}$ that have already been applied to y. As a result, all the
functions associated with the entire list L are applied to the initial value
y. They are applied in the reverse of the order in which they appear on
the list, with the last element's function applied first.
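Steps (a) and (b) can be traced on a small list. In the sketch below the definition of Fig. 5.23 is repeated under the hypothetical name myFoldr, so that the built-in foldr is not redefined, and a second use records the order in which list elements reach F.

```sml
(* The definition of Fig. 5.23 under another name. *)
fun myFoldr F y nil = y
  | myFoldr F y (x::xs) = F(x, myFoldr F y xs);

fun times(a,x) = a*x;

(* Unfolding the recursion by hand:
     myFoldr times 1 [2,3,4]
   = times(2, myFoldr times 1 [3,4])
   = times(2, times(3, myFoldr times 1 [4]))
   = times(2, times(3, times(4, myFoldr times 1 nil)))
   = times(2, times(3, times(4, 1)))                      *)
val product = myFoldr times 1 [2,3,4];

(* Append each element to a log as F receives it; the last element of
   the list is logged first, as step (b) predicts. *)
val order = myFoldr (fn (a, seen) => seen @ [a]) nil [2,3,4];
```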
Example 5.34: If we want to take the product of the elements of a list of
integers L, we can use foldr with a suitable product function, such as that in
Example 5.33, and initial value 1. That is, we may write
    val L = [2,3,4];
    foldr op * 1 L;
    val it = 24 : int
Note that in order to use the multiplication function as the first argument
of foldr, we need op to make it a prefix operator. The arguments of foldr are
grouped as foldr (op *) 1 L.
• But beware putting those optional parentheses in exactly that way, because
*) is interpreted as a comment-ender, and an error will result. In
this special case we would need to write foldr (op * ) 1 L.
□
Because foldr is defined in Curried form, we can partially instantiate foldr
with a function F and an initial value and get another function that takes a
list L and “folds” L according to F and b.
Example 5.35: We can write a function that takes the product of the elements
on any integer list by:
    val prod = foldr op * 1;
    val prod = fn : int list -> int
    prod [2,3,4];
    val it = 24 : int
□
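Other partial instantiations of foldr follow the same pattern. The sketch below is illustrative only; the names sum, total, len, and three are hypothetical. We write len with fun and an explicit list parameter on the assumption that a val binding would leave the element type of the list undetermined, the situation that produces the kind of "nongeneralizable type variable" complaint mentioned in Exercise 5.6.3.

```sml
(* Sum of an integer list: foldr partially instantiated with + and 0. *)
val sum = foldr op + 0;
val total = sum [2,3,4];

(* Length of any list, using the second function F of Example 5.33. *)
fun len L = foldr (fn (_, n) => n + 1) 0 L;
val three = len ["a", "b", "c"];
```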
5.6.5 Exercises for Section 5.6
Exercise 5.6.1: Use the higher-order functions map, foldr, and foldl described
in this section to build the following functions on lists. You should
write anonymous functions that operate on list elements only.
* a) A function that turns an integer list into a list of reals with the same
values.
b) A function that turns an integer list L into a list of reals, each of which
is the absolute value of the element on L.
*! c) The function implode, which turns a list of characters into a single string
with those characters in order.
d) The function concat, which turns a list of strings into the concatenation
of all those strings.
* e) A function that turns a list of integers $[a_1, a_2, \ldots, a_n]$ into the alternating
sum $a_1 - a_2 + a_3 - a_4 + \cdots$.
f) A function that computes the logical AND of a list of booleans.
* g) A function that computes the logical OR of a list of booleans.
! h) A function that computes the exclusive or of a list of booleans. The
exclusive or of $a_1, a_2, \ldots, a_n$ is true if an odd number of the $a_i$'s are true
and false if an even number of the $a_i$'s are true.
*
Exercise 5.6.2: Write a definition for the function foldl analogous to the
definition of foldr in Fig. 5.23. Hint: Recompute the initial value in the
recursion.
Exercise 5.6.3: Since the function comp of Fig. 5.20 was written in Curried
form, we can bind the first argument F to a function to get a new function that
takes a function G as argument and produces the function G o F. However, if
we write an expression such as
val I = comp (fn x => x+3);
we get an error message saying there is a “nongeneralizable type variable,”
namely the unknown type that is the range type for I and the function to
which I will be applied.
*! a) We can fix the definition of I above if we give it a type. If we want the
range type of I to be string, what is a suitable type declaration to add
to the definition of I?
b) Having defined I as in part (a), show how to use I to create a function
that given an integer x returns a string consisting of the digits of x + 3.
c) Use your answer to (b) to create a function that, given an integer x, prints
x + 3.
*! Exercise 5.6.4: Suppose we define comp as in Fig. 5.20 and
    fun add1 x = x+1;
Give the type and value for each of the following functions or constants. To
avoid a nongeneralizable type variable error as discussed in Exercise 5.6.3, you
should declare all unknown types to be integers.
a) val compA1 = comp add1;
! b) val compCompA1 = comp compA1;
c) val f = compA1 add1;
d) f(2);
!! e) val g = compCompA1 compA1;
f) val h = g add1;
g) h(2);
! Exercise 5.6.5: Repeat Exercise 5.6.4 for the following expressions. The func-
tions compA1 and compCompA1 are as defined in Exercises 5.6.4(a) and (b). How-
ever, you should redeclare their type for the uses described below.
a) val f = compA1 real; where real is the built-in function that converts
integers to equivalent reals.
b) val compT = comp trunc; where trunc is the built-in function that converts
a real to an integer, rounding towards 0 if necessary.
! c) val g = compCompA1 compT;
d) val h = g real;
e) f(2);
f) h(3.5);
g) h(~3.5);
Exercise 5.6.6: Write a version of the function filter from Section 5.4.5
that takes only a predicate P as argument and produces a function that takes
a list of elements of suitable type and returns those elements on the list that
satisfy P.
!! Exercise 5.6.7: Using foldr and an anonymous function, write a function
that takes a list of reals $[a_0, a_1, \ldots, a_{n-1}]$ and produces a function that takes
an argument b and evaluates the polynomial
$$a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1}$$
at $x = b$; that is, it computes $\sum_{i=0}^{n-1} a_i b^i$.
! Exercise 5.6.8: Write the two-argument function simpleMap of Section 5.4.2
in Curried form. Show that its behavior is exactly the same as that of the
one-argument function map of Fig. 5.21.
5.7 Case Study: Parsing Expressions
In this section we shall look at one of the fundamental parts of a compiler, a
parser for expressions. In so doing we shall review some of the ideas introduced
in this chapter: case statements and exceptions. We also see an interesting ap-
plication of several “old” ideas: the lookahead operator for reading an instream,
a mutual recursion involving five functions, wildcards in patterns, and the use
of statement lists, including the addition of a final “statement” to return the
proper value.
The problem we shall address is how to read arithmetic expressions from the
input and compute their value. The operands of these arithmetic expressions
are integers, and the operators used are *, +, -, and /.⁸ For simplicity, we
assume there are no blanks or other white space between characters of the
expression. It is easy to ignore white space or other characters should we wish;
the case study of Section 4.4 showed how.
5.7.1 The Grammatical Structure of Arithmetic Expressions
We shall describe the structure of expressions using a graphical notation equiv-
alent to context-free grammars; it is illustrated in Fig. 5.24. Each syntactic
category is named on the left; we have in Fig. 5.24 four syntactic categories:
INTEGER, ATOM, TERM, and EXP (expression). A syntactic category repre-
sents a set of sequences of elements. Each element can be a string of characters
or it can be another syntactic category. The possible instances of a syntactic
category are indicated by the possible paths from the left end of its graph to
the right end; each path represents a sequence of elements that is an instance
of the syntactic category named at the left.
For instance, in Fig. 5.24(a) the graph for INTEGER requires us to go
through DIGIT once. Then we can continue to the right end or we can cycle
back to pass through DIGIT any number of additional times. That is, an
INTEGER is a string consisting of one or more DIGIT’s. The syntactic category
DIGIT is not defined by a graph, but we define it to consist of any of the digits
0 through 9.
In Fig. 5.24(b) we see that an ATOM is defined to be either an INTEGER
or a sequence consisting of a left parenthesis, an EXP, and a right parenthesis.
That is, an ATOM is an integer or a parenthesized expression.
Next we see that a TERM is an ATOM followed by zero or more additional
elements, each element consisting of either a multiplication or a division sign,
followed by another ATOM. That is, a TERM is a sequence of one or more
ATOM’s separated by multiplication and/or division signs. Similarly, an EXP
is a sequence of one or more TERM’s separated by plus and/or minus signs.
⁸ Of course, ML would use div for integer division, but that is unimportant because we are
not reading ML programs in this example.
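[Figure: four syntax diagrams. (a) INTEGER: a path through DIGIT, with an optional loop back through DIGIT. (b) ATOM: either an INTEGER, or a left parenthesis, an EXP, and a right parenthesis. (c) TERM: an ATOM, with an optional loop back through * or / and another ATOM. (d) EXP: a TERM, with an optional loop back through + or - and another TERM.]
Figure 5.24: The structure of arithmetic expressions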
5.7.2 Structure of the Parsing Program
Figures 5.25 and 5.26 together show a program that computes the value of an
expression.⁹ Its functions carefully choose between the lookahead operator and
the input1 operator to make sure that input characters are consumed only at
the appropriate time.
Let us examine the statements and functions in Fig. 5.25. Line (1) opens
the TextIO structure, which we shall need to perform input operations on the
file. Line (2) is the definition of an exception Syntax that will be raised when
an ill-formed input is found. Line (3) is the familiar function digit that tests
whether a character is a digit.
⁹ This program implements a parsing method called “recursive descent.” The reader may
consult Compilers: Principles, Techniques, and Tools, A. V. Aho, R. Sethi, and J. D. Ullman,
Addison-Wesley, Reading MA, 1986 for an explanation of this technique.
Then come six functions — the last five are mutually recursive — that
collectively implement the diagrams of Fig. 5.24. Each takes a parameter IN
that is the instream on which the expression appears. Some functions take an
additional parameter that helps the function return the integer value of some
portion of the input. This additional parameter, if present, will be referred to
as “the parameter” or “the argument” of the function, even though IN is also
a parameter.
We shall introduce the functions, and then later return to the details of their
implementation. The six functions, each of which returns the value of whatever
input it consumes, are:
1. integer consumes whatever prefix of the current input is a sequence
of digits. The parameter i is the integer value of any digits that have
been seen and consumed on the input immediately before the current call
to integer. The result returned is the value of the digits seen so far
(as represented by i) and any further digits found consecutively on the
input. An initial call to integer with i = 0 implements the diagram of
Fig. 5.24(a) and returns the value of the digits read.
2. atom looks on the input for either
(a) An integer or
(b) A left parenthesis, followed by any expression, followed by a right
parenthesis.
It thus implements the diagram of Fig. 5.24(b).
3. term looks for an atom on the input and, after consuming one, calls
termTail. The argument i for termTail is the value of the atom found.
Together, term and termTail implement the diagram of Fig. 5.24(c).
4. termTail looks for and consumes from the input zero or more groups
consisting of a * or / sign and an atom. It takes a parameter i, which
is the value of all atoms found so far, multiplied or divided as dictated
by the signs that connect them. If termTail finds a * or / as the next
input character, it consumes it and calls atom. The value i is multiplied
or divided, as appropriate, by the value of the atom found. The result
becomes the argument of a recursive call to termTail. If termTail does
not find a * or / as the next input character, it does nothing but return
its argument i.
5. expression looks for a term on the input and, after consuming one, calls
expTail with argument i equal to the value of the term found. Together,
expression and expTail implement the diagram of Fig. 5.24(d).
6. expTail looks for and consumes from the input zero or more groups
consisting of a + or - sign and a term. It takes a parameter i, which
is the value of all terms found so far, added or subtracted as dictated
by the signs that connect them. Its operation is analogous to that of
termTail.
(1)  open TextIO;
(2)  exception Syntax;
(3)  fun digit(c) = (#"0" <= c andalso c <= #"9");
(4)  fun integer(IN,i) =
(5)      case lookahead(IN) of
(6)          SOME c =>
(7)              if digit(c) then (
(8)                  input1(IN); (* consume character c *)
(9)                  integer(IN,10*i+ord(c)-ord(#"0"))
                 )
(10)             else i (* if c is not a digit, return i
                           without consuming input *)
             |
(11)         NONE => i (* ditto if end of file is reached *)
(12) fun atom(IN) =
(13)     case lookahead(IN) of
(14)         SOME #"(" => (
(15)             input1(IN); (* consume left paren *)
                 let
(16)                 val e = expression(IN)
                 in
(17)                 if lookahead(IN)=(SOME #")") then
                     (
(18)                     input1(IN); (* consume
                                        right parenthesis *)
(19)                     e (* return expression *)
                     )
(20)                 else raise Syntax
                 end
             )
             |
(21)         SOME c =>
(22)             if digit(c) then integer(IN,0)
(23)             else raise Syntax
             |
(24)         NONE => raise Syntax
     and
Figure 5.25: Parser for arithmetic expressions (beginning)
5.7.3 Detailed Explanation of the Parser Code
Let us now examine the code of Fig. 5.25 in more detail.
The Function integer
Function integer is in lines (4) through (11). Line (5) uses lookahead to obtain
a character-option from the instream. That is, the expression lookahead(IN)
evaluates to either NONE if there is no more input, or SOME c, if c is the first
character remaining on the input. Nothing is consumed from the input, however.
Lines (6) through (10) cover the case when a character c is present. Line (7)
tests if c is a digit, and if so, this digit is consumed from the instream at line (8).
At line (9) integer is called recursively with an argument that represents the
effect of appending the digit c to the integer read so far. The formula used on
line (9) is the same one that was used in Fig. 4.10. Line (10) covers the case
when c is not a digit. In this case, the integer i has ended, and its value is
returned by integer.
The second and final case of the case statement is handled by line (11).¹⁰
Here, there is no more input, so the integer i has ended and is returned. Note
that integer is only called on line (22), where we have already determined that
at least one digit waits on the input. Thus integer properly implements the
diagram of Fig. 5.24(a), which requires that at least one digit be consumed.
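The accumulation performed by integer can be tried in isolation. The following is a minimal sketch rather than part of Fig. 5.25: it assumes an in-memory stream created with TextIO.openString instead of a file, writes the TextIO functions with qualified names instead of opening the structure, and uses the illustrative parameter name ins in place of IN.

```sml
(* Sketch of the digit-accumulating loop; `ins` is an illustrative name. *)
fun digit c = #"0" <= c andalso c <= #"9"

fun integer (ins, i) =
    case TextIO.lookahead ins of
        SOME c =>
            if digit c
            then (TextIO.input1 ins;                        (* consume c *)
                  integer (ins, 10 * i + ord c - ord #"0"))
            else i                                          (* leave c unread *)
      | NONE => i                                           (* end of input *)

val ins = TextIO.openString "407+5"
val n = integer (ins, 0)            (* the digits 4, 0, 7 are consumed *)
val next = TextIO.lookahead ins     (* the + is still waiting on the input *)
```

With this input, n is bound to 407 and next to SOME #"+", which shows that the character ending the integer is examined but not consumed.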
The Function atom
The function atom appears in lines (12) through (24). A case statement starts on
line (13), and the first case is when the next character waiting on the instream
is a left parenthesis. If so, then the input must begin with a parenthesized
expression. Line (15) consumes the left parenthesis from the input, and line (16)
calls expression to consume an expression from the input. At this point we
expect a right parenthesis to follow, and line (17) checks the right parenthesis
is there. If so, the parenthesis is consumed from the input at line (18), and
the value of the expression between the parentheses is returned at line (19).
However, if after reading the expression at line (16), the following character is
not a right parenthesis, then there is a syntax error, and the exception Syntax
is raised at line (20).
• Notice that the reason for the let-in-end construct in lines (16) through
(20) is that we cannot return the value e of the expression read on
line (16) immediately. We need to hold it and check that there is a right
parenthesis following.
¹⁰ Notice that we have put the vertical bars separating cases on a line of their own to make
it easier to separate complex cases visually.
Lines (21) through (23) handle the second case, where there is a character
waiting on the input, but it is not a left parenthesis. We test if it is a digit at
line (22), and if so we conclude that the atom is an integer and call integer
to consume the integer from the instream. Using argument 0 in this call is
correct, since there are no previous digits when the initial call to integer is
made. Recursive calls to integer from itself will increase the value of the
argument. If the character is neither a digit nor a left parenthesis, then there
is a syntax error, which is reported at line (23).
The final case is when there is no character on the input. Since we are
expecting an atom, there must be a syntax error, which is reported at line (24).
The Function term
Line (25) in Fig. 5.26 is the entire function term. It first calls atom to consume
an atom from the input and return the value of that atom. This value becomes
the argument of termTail, which multiplies or divides the value of the first
atom by the sequence of zero or more atoms it finds on the input.
We have used a succinct but subtle style in designing the function term. The
call to atom occurs within a call to termTail, so the call to atom is executed first.
We could have separated the two steps more transparently, but less succinctly,
by code such as
let val i = atom(IN) in termTail(IN,i) end
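The succinct form relies on ML evaluating the arguments of a call before the body of the called function runs. A small sketch of that rule, independent of the parser, is given here. The names log, note, and outer are hypothetical, and a ref cell (a feature covered later in the book) is used only to record the order of events.

```sml
(* Record the order in which things happen; all names are illustrative. *)
val log : string list ref = ref []

(* note records a label and then returns its second argument unchanged. *)
fun note (label, x) = (log := label :: !log; x)

fun outer (a, b) = (log := "body of outer" :: !log; a + b)

(* Both arguments are evaluated, left to right, before outer's body runs. *)
val result = outer (note ("first argument", 1), note ("second argument", 2))

val order = rev (!log)
```

Here order lists the two arguments before the body of outer, which is the same reason atom(IN) consumes its input before termTail inspects the stream.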
The Function termTail
Function termTail appears in lines (26) through (34) of Fig. 5.26. On line (27)
it looks at the next input, which becomes the basis of a 3-way case-statement.
Lines (28) through (30) cover the case where the next character is *. This
character is consumed on line (29). Then atom is called on line (30) to read and
evaluate the next atom on the input. The value of this atom is multiplied by i,
which is the value of the argument to termTail, and this product becomes the
argument in a recursive call to termTail.
Lines (31) through (33) handle the case where the next character is / anal-
ogously to lines (28) through (30). Line (34) handles all other possible values
of the next character. If the next character is other than * or /, the term is
complete. In this case, termTail returns its own argument as the value of the
entire term that has just been seen on the input.
Functions expression and expTail
Finally, lines (35)-(44) are functions expression and expTail. Their workings
are analogous to those of term and termTail, and we omit the details.
(25) term(IN) = termTail(IN,atom(IN))
     and
(26) termTail(IN,i) =
(27)     case lookahead(IN) of
(28)         SOME #"*" => (
(29)             input1(IN); (* consume * *)
(30)             termTail(IN,i*atom(IN))
             )
             |
(31)         SOME #"/" => (
(32)             input1(IN); (* consume / *)
(33)             termTail(IN,i div atom(IN))
             )
             |
(34)         _ => i
     and
(35) expression(IN) = expTail(IN,term(IN))
     and
(36) expTail(IN,i) =
(37)     case lookahead(IN) of
(38)         SOME #"+" => (
(39)             input1(IN); (* consume + *)
(40)             expTail(IN,i+term(IN))
             )
             |
(41)         SOME #"-" => (
(42)             input1(IN); (* consume - *)
(43)             expTail(IN,i-term(IN))
             )
             |
(44)         _ => i;
(45) val infile = openIn("test");
(46) expression(infile);
Figure 5.26: Parser for arithmetic expressions (end)
An Example Use of the Parser
On line (45) we begin to use the functions we have written. Identifier infile
is bound to an instream representing the opened file test; this file contains an
expression that we wish to evaluate. A call to expression(infile) on line (46)
results in the value of the expression in file test being returned by ML.
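To experiment without preparing a file, the parser can be pointed at a string. The sketch below follows the structure of Figs. 5.25 and 5.26 under a few assumptions of its own: the stream comes from TextIO.openString, the TextIO functions are written with qualified names, the parameter is called ins rather than IN, and evalString is a hypothetical convenience wrapper.

```sml
exception Syntax

fun digit c = #"0" <= c andalso c <= #"9"

fun integer (ins, i) =
    case TextIO.lookahead ins of
        SOME c =>
            if digit c
            then (TextIO.input1 ins; integer (ins, 10 * i + ord c - ord #"0"))
            else i
      | NONE => i

(* atom, term, termTail, expression, expTail are mutually recursive. *)
fun atom ins =
    case TextIO.lookahead ins of
        SOME #"(" =>
            (TextIO.input1 ins;                       (* consume left paren *)
             let val e = expression ins
             in
                 if TextIO.lookahead ins = SOME #")"
                 then (TextIO.input1 ins; e)          (* consume right paren *)
                 else raise Syntax
             end)
      | SOME c => if digit c then integer (ins, 0) else raise Syntax
      | NONE => raise Syntax
and term ins = termTail (ins, atom ins)
and termTail (ins, i) =
    case TextIO.lookahead ins of
        SOME #"*" => (TextIO.input1 ins; termTail (ins, i * atom ins))
      | SOME #"/" => (TextIO.input1 ins; termTail (ins, i div atom ins))
      | _ => i
and expression ins = expTail (ins, term ins)
and expTail (ins, i) =
    case TextIO.lookahead ins of
        SOME #"+" => (TextIO.input1 ins; expTail (ins, i + term ins))
      | SOME #"-" => (TextIO.input1 ins; expTail (ins, i - term ins))
      | _ => i

(* Hypothetical wrapper: evaluate the expression held in a string. *)
fun evalString s = expression (TextIO.openString s)

val v1 = evalString "2*(3+4)"
val v2 = evalString "10-4-3"
val bad = (evalString "(2"; false) handle Syntax => true
```

Here v1 is 14, v2 is 3 because the tail functions group operators from the left, and bad is true because the missing right parenthesis raises Syntax. Like the parser in the figures, this sketch stops at the first character that cannot continue the expression and does not check that the whole input was consumed.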
5.7.4 Exercises for Section 5.7
! Exercise 5.7.1: Give the grammatical diagram(s) for the form of real con-
stants of ML as described in Section 2.1.1.
! Exercise 5.7.2: ML allows us to construct values from integers using list-
formation (with square brackets) and tuple-formation (with parentheses). An
example is [(1,2),(3,4)].
a) Give grammatical diagrams for the set of values that can be formed by
these two construction rules. You do not need to enforce the ML require-
ment that list elements have the same type.
b) Implement a parser for this class of character strings. You should read
the input from an instream IN. The response of your parser is a boolean
indicating whether the input is or is not of the proper form. Your program
should allow white space among the integers, parentheses, commas, and
brackets, but not in the middle of an integer. The entire value will be
terminated by the character $.
Chapter 6
Defining Your Own Types
In this chapter we shall learn ways to extend the type system of ML with
user-defined types. There are two ways to make type extensions:
1. Type definitions are shorthands or macros for previously defined type
expressions.
2. Datatype definitions are rules for constructing new types with new values
that are not the values of previously defined types.
6.1 Defining New Types
As in Pascal or C, it is possible to define new types in ML. However, ML has a
more powerful type system (rules for defining types) than these languages. In
ML, types can take one or more type-valued variables as parameters. It is also
possible in ML to create types whose values are built in more complex ways
than is possible in the type systems of most languages.
6.1.1 Review of the ML Type System
Before proceeding, let us review what we know about the type system of ML.
Types in ML are defined recursively, with a basis of primitive types and rules
for constructing more complex types from these.
BASIS: The basic types we have met are int, real, string, char, bool, unit,
exn (exception), instream, and outstream. In addition, a type variable such
as 'a or ''a can serve in place of a constant type such as int. These variables
represent values of any type or any equality type, respectively.
INDUCTION: We can build new types from old types T1 and T2, as follows:
1. T1 * T2 is a “product” type, whose values are pairs. The first component
of the pair is of type T1 and the second is of type T2. More generally,
T1 * T2 * ... * Tn is the type for a tuple of n components, the ith
component of which is of type Ti, for all i = 1, 2, ..., n.
2. T1 -> T2 is a “function” type, whose values are functions with domain
type T1 and range type T2.
3. We may create new types by following a type such as T1 by certain
identifiers that act as type constructors. So far, we have met:
(a) The list type constructor. That is, for every type T1, there is
another type T1 list, whose values are lists all of whose elements
are of type T1.
(b) The option type constructor. For every type T1 there is a type
T1 option whose values are NONE and SOME x where x is any value
of type T1.
We shall meet additional type constructors ref, array, and vector later.
In this section we learn that the user can define any identifier to be a type
constructor by making the appropriate type declaration.
The expressions defined inductively as above are called type expressions.
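Each rule of the inductive definition can be illustrated by a value whose type is written out explicitly. The following is only a sketch; every identifier in it is an illustrative name chosen here, not one used elsewhere in the text.

```sml
val pair   : int * string      = (1, "one")          (* product type *)
val triple : int * real * bool = (2, 2.0, true)      (* tuple of 3 components *)
val addOne : int -> int        = fn n => n + 1       (* function type *)
val reals  : real list         = [1.0, 2.0]          (* list type constructor *)
val maybe  : char option       = SOME #"a"           (* option type constructor *)

(* The rules nest: a function from a list of pairs to an optional integer. *)
val firstKey : (int * string) list -> int option =
    fn nil => NONE
     | (k, _) :: _ => SOME k

val k = firstKey [pair, (2, "two")]
```

The type of firstKey combines the product, list, option, and function rules in a single type expression.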
6.1.2 New Names for Old Types
To begin, we shall learn the use of the keyword type, which defines a new type
in a simple way — as an abbreviation for other types. The simplest form of an
abbreviation is
type <identifier> = <type expression>
That is, the keyword type is followed by the name we choose for the new type,
an equal sign, and an expression involving existing types.
Example 6.1: We might define the type signal to be a list of reals by
type signal = real list;
type signal = real list
We can then declare a value of the appropriate form to be of this new type, as in
val v = [1.0, 2.0] : signal;
val v = [1.0, 2.0] : signal
Notice that following a value by a colon and a type name declares the value to
be of the given type.
The type signal is nothing more than an abbreviation. For instance, we
can define:
val w = [1.0, 2.0];
val w = [1.0, 2.0] : real list
Here, ML is given a real list and is not told to regard it as of type signal.
However, if we compare v and w as in
v=w;
val it = true : bool
ML recognizes that v and w have the same value and does not complain that
one is a signal while the other is a real list. Rather, it recognizes that these
are two designations for the same type. □
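The same point can be checked with a type whose values admit the = test. The sketch below is a variation on Example 6.1 rather than a transcript of it: it abbreviates int list instead of real list, because ML97 does not treat real as an equality type, so a compiler may reject an = test between two real lists. The names samples, a, b, and total are hypothetical.

```sml
type samples = int list          (* an abbreviation, not a new type *)

val a = [1, 2] : samples
val b = [1, 2]                   (* inferred as int list *)

(* Accepted: samples and int list designate the same type. *)
val same = (a = b)

(* A function written for int list applies to a samples value unchanged. *)
fun total (nil : int list) = 0
  | total (x :: xs) = x + total xs

val t = total a
```

Both the comparison and the call to total type-check, since the abbreviation introduces no new type.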
6.1.3 Parametrized Type Definitions
More generally, we can define a family of types with one or more type variables
(identifiers beginning with a quote mark) as parameters. The syntax is
type (<list of type parameters>) <identifier> = <type expression>
That is, following the keyword type is a list of type variables serving as type
parameters. If there is only one type variable, the parentheses are optional.
The parameters are followed by an identifier, which is the type constructor for
the type. Finally comes an equal sign and a type expression, which may involve
the parameters.
Types in the defined family are described by providing type expressions
corresponding to the type parameters and following the type expressions by the
type constructor for the type. An example should help to make these ideas
clearer.
Example 6.2: A useful data structure for remembering and retrieving an
association between data of two types is the mapping (not to be confused with
the “map” function of Sections 5.4.2 or 5.6.3). In ML, we can think of this
structure as a list of pairs. The first component of each pair is of some type 'd,
called the domain type, and the second component is of some type 'r, called
the range type.¹ In a mapping we do not expect to see two pairs with the same
domain element, although there is nothing in the type definition that requires
uniqueness of domain elements.
For instance, we might wish to store a count of words in a document as a
list of pairs of the type (string * int). The first component is a word, and
the second component is the number of times the word occurs. The counts for
the first paragraph of Section 6.1 would include such pairs as
[("in",6), ("a",1), ("as",2), ("types",4), ("ML",4),...]
¹ Note that the terms “domain” and “range” are used in connection with both mappings
and functions. This is no coincidence; the mapping and function describe similar
mathematical objects. A mapping associates pairs of values by listing the pairs, while a function is
a program that computes the second component of a pair from the first component.
We can see such a set of pairs as assigning an integer value to each domain
element (a word) that is mentioned. Mathematicians would say that words are
thereby “mapped” to integers.
Here is a definition of the parameterized type constructor mapping.
type ('d, 'r) mapping = ('d * 'r) list;
type ('a, 'b) mapping = ('a * 'b) list
A few important points about this type definition:
• SML/NJ uses 'a, 'b, and so on for type variables in its responses, regardless
of the type variables chosen by the programmer.
• Note that the list of type parameters 'd and 'r is separated by commas
after the keyword type, as if they were parameters of a function. However,
in the type expression ('d * 'r) list, we represent the type of a pair
whose components are respectively of types 'd and 'r by separating the
types with the product-type operator *.
We can now stipulate that a certain value is of a particular mapping type. For
example, the “assignment”
val words = [("in",6), ("a",1)] : (string, int) mapping;
val words = [("in",6), ("a",1)] : (string, int) mapping
declares identifier words to have a particular value of the type
(string, int) mapping
That type is an instance of the parameterized type mapping formed by choosing
the appropriate types for the type parameters 'd and 'r in the definition of
mapping. □
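A mapping is typically used by searching it for a domain element. The following sketch shows one way this might look; the function lookup and the identifiers countOfA and countOfML are hypothetical. The domain type is written ''d because the search compares domain elements with =.

```sml
type ('d, 'r) mapping = ('d * 'r) list

(* Return SOME r for the first pair whose domain element equals key,
   or NONE if no pair has that domain element. *)
fun lookup (m : (''d, 'r) mapping, key : ''d) : 'r option =
    case m of
        nil => NONE
      | (d, r) :: rest => if d = key then SOME r else lookup (rest, key)

val words = [("in", 6), ("a", 1)] : (string, int) mapping

val countOfA = lookup (words, "a")
val countOfML = lookup (words, "ML")
```

With the two-pair mapping shown, countOfA is SOME 1 and countOfML is NONE. Because mapping is only an abbreviation, lookup would accept any list of pairs with an equality-typed first component.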
6.1.4 Exercises for Section 6.1
Exercise 6.1.1: Give type definitions (abbreviations) for the following types.
* a) A set of sets, where the type of elements is unspecified and sets are rep-
resented by lists.
b) A list of triples, the first two components of which have the same type
and the third component of which is of some (possibly) different type.
Exercise 6.1.2: Give a value of type (real, real) mapping, where the type
mapping is defined in Example 6.2. Your value should have 3 pairs.
6.2 Datatypes
Since the type declaration is limited to definitions of “abbreviations,” it is of
limited power. Often, we want to create types whose values are new structures.
For instance, with the types learned so far we cannot express the notion of a
tree.
ML has a very powerful mechanism for defining new types called datatypes.
A datatype definition involves two kinds of identifiers:
1. A type constructor that is the name of the datatype. The type constructor
is used to build types just as type names like mapping of Example 6.2 are
used.
2. One or more data constructors, which are identifiers used as operators to
build values belonging to a new datatype.
6.2.1 A Simple Form of Datatype Declaration
The concept of datatypes generalizes such ideas as enumerated types in Pascal
and C or union types in these languages, but it goes far beyond these. Thus,
we shall take the datatype concept in easy stages. In our first example we see a
rather simple use of datatype definition corresponding to an enumerated type.
The datatype declaration consists of the keyword datatype, a name for the
datatype, and a list of data constructors separated by vertical bars.
Example 6.3: Let us define the datatype with name fruit to consist of the
three values Apple, Pear, and Grape.
datatype fruit = Apple | Pear | Grape;
datatype fruit = Apple | Grape | Pear
Identifier fruit is the type constructor for the datatype. The names Apple,
Pear, and Grape are the data constructors for the datatype fruit. Note that
SML/NJ alphabetizes lists of data constructors; they do not necessarily appear
in the order in which they were declared.
We can use the new datatype in a function or other expression. For instance,
we can write
fun isApple(x) = (x=Apple);
val isApple = fn : fruit -> bool
The function isApple returns true if its argument is Apple and false for any
other fruit. The function isApple equates its parameter x to a value Apple
of type fruit. Thus, the ML compiler will infer that the argument type of
isApple is fruit. It is an error to pass as an argument anything that is not
one of the data constructors for the datatype fruit. For instance, isApple
makes the following responses.
isApple(Pear);
val it = false : bool
isApple(Apple);
val it = true : bool
isApple(Banana);
Error: unbound variable or constructor: Banana
The last response indicates that Banana is not an acceptable argument for the
function isApple. □
Capitalization Convention
There is a common ML convention regarding the capitalization of the first
letter in an identifier.
1. Capitalize the first letter of:
(a) Data constructors.
(b) Exception names (often called exception constructors).
(c) Structure names. We have seen a few built-in structure names
such as TextIO or Int, but we shall not cover user-defined
structures until Section 8.2.
(d) Functors (covered in Section 8.3).
2. Do not capitalize the first letter of:
(a) Variables.
(b) Function names.
(c) Type constructors.
3. Spell the names of signatures with all capitals. We shall introduce signatures
in Section 8.2.1; they are essentially the type of a structure.
Some ML programmers absolutely refuse to start any variable or function
name with a capital. We shall not be so strict, since it is often convenient
to remind the reader of a variable's type by a capital. For instance,
we have found it useful to distinguish a list L from its elements, whose
names are uncapitalized, or to distinguish a polynomial P from an
uncapitalized coefficient.
We may observe from Example 6.3 something about the form of datatype
definitions that serve as enumerated types. The keyword datatype is followed
by the name of the type, an equal sign, and a list of the data constructors
separated by vertical bars.
• Remember that type abbreviations are introduced by the keyword type,
but datatypes with data constructors require the keyword datatype.
• Notice that datatype definitions, even in the simple case of Example 6.3,
define a new type that is not an abbreviation for any other type.
• For each type there is a set of values. We know, for instance, that the
values for the type int are the integers. Data constructors are used to
build the expressions that are the values for user-defined datatypes. In
Example 6.3, the data constructors are the values. We shall see that data
constructors for more complex datatypes may be combined in powerful
ways to build the set of possible values for a type.
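Because the data constructors of an enumerated datatype are its only values, a function can cover the whole type with one pattern per constructor. A brief sketch follows; the function name is a hypothetical addition, while fruit and isApple follow Example 6.3.

```sml
datatype fruit = Apple | Pear | Grape

(* One clause per data constructor covers every value of the type. *)
fun name Apple = "apple"
  | name Pear  = "pear"
  | name Grape = "grape"

(* Constructors that take no arguments can also be compared with =. *)
fun isApple x = (x = Apple)

val names = map name [Apple, Pear, Grape]
val flags = map isApple [Apple, Pear, Grape]
```

Here names is ["apple","pear","grape"] and flags is [true,false,false].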
6.2.2 Using Constructor Expressions in Datatype Definitions
Now we take up the more general form of datatype definition, where
1. Type variables can be used to parameterize the datatype, just as they can
for type definitions.
2. The data constructors can take arguments.
This form of a datatype declaration is
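datatype (<list of type parameters>) <identifier> =
    <first constructor expression> |
    <second constructor expression> |
    ... |
    <last constructor expression>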
That is, we use the keyword datatype followed by a list of zero or more type
variables used as parameters in the type expressions that follow. Parentheses
are optional if there is one type parameter, and illegal if there are zero type pa-
rameters, as in the datatype of Example 6.3. The type parameters are followed
by an identifier, which is the type constructor for the datatype. Finally come
the equal sign and one or more constructor expressions separated by vertical
bars.
A constructor expression consists of a constructor name, the keyword of,
and a type expression. A simple example is
Banana of int
This constructor expression says that values of the datatype being defined can
have the form Banana(23), or in general, Banana(i) for any integer i. Some
important points to remember are:
• Notice that the data constructor is used to “wrap” the data with (optional)
parentheses. In an expression like Banana(23), we not only get an
integer value, 23, but we are told by the data constructor Banana something
about the form, meaning, or origin of this value; perhaps a bunch
with 23 bananas is represented.
• Thus, data constructors are “applied” to data as if they were functions,
but they are not functions. Rather, we use constructors to form symbolic
expressions whose appearance is similar to that of an expression involving
function applications.
• Do not confuse data constructors with type constructors. Data constructors
are used to build expressions that are values for the type. Type
constructors are used in expressions that denote types themselves.
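A short sketch may make the wrapping and unwrapping concrete. The datatype produce and the function pieces are hypothetical names introduced only for this illustration; Banana of int is the constructor expression discussed above.

```sml
(* A hypothetical datatype mixing plain constructors with one that carries data. *)
datatype produce = Apple | Pear | Banana of int

(* Pattern matching unwraps the integer carried by Banana;
   every other value counts as a single piece. *)
fun pieces (Banana n) = n
  | pieces _ = 1

val bunch = Banana 23            (* parentheses around 23 are optional *)
val total = pieces bunch + pieces Apple + pieces (Banana(4))
```

Here total is 28: 23 from the first bunch, 1 for the apple, and 4 from the second bunch.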
Example 6.4: The next example is one in which the datatype capability of
ML is used in a manner similar to the union types found in Pascal or C, among
other languages. The idea is to manufacture a type whose elements are formed
from either of two previously defined types. We use two data constructors, each
of which wraps elements of one of the two types, and thus tells us which of the
two types is found inside the wrapping.
We want to deal with “elements” that may be pairs or singles. The first
component of each element will be of some type 'a, and the second component,
if it exists, will be of some type 'b. We shall call the datatype element. It will
have two data constructors, P, which forms a pair, and S, which forms a single.
In effect, the first type of the union is 'a and the second type is 'a * 'b. The
declaration and ML response are shown in Fig. 6.1.
(1) datatype ('a, 'b) element =
(2)     P of 'a * 'b |
(3)     S of 'a;
datatype ('a,'b) element = P of 'a * 'b | S of 'a
Figure 6.1: Datatype that is the union of singles and pairs
In line (1) of Fig. 6.1 we see that a datatype is being declared; it has
two type variables as parameters, 'a and 'b. The name of the datatype is
“('a, 'b) element.” The identifier element becomes a binary type constructor,
that is, a type constructor that applies to a pair of types, just like the
type constructor mapping of Example 6.2. Line (2) tells us about the data
constructor P, which takes as data a pair consisting of an 'a value and a 'b value
and “wraps” them in the symbol P. Similarly, line (3) tells us about the data
constructor S, which takes an 'a value and wraps it with an S. Note that the
type produced is an ('a, 'b) element, even though the data itself does not
involve a value of type 'b in this case.
Now let us see how the element datatype can be used. We can let the type
parameters 'a and 'b be anything we choose, but for an example let 'a be
string and 'b be int. To get concrete, we might wish to extend the word-count
problem of Example 6.2 to allow the list to include some words that
are not present (represented by “singles”), while pairs represent a word that
is present, along with its count of occurrences. For instance, a list of elements
representing the first paragraph of Section 6.1 might include
[P("in",6), S("function"), P("as",2),...]
Suppose we want to take a list of (string, int) element's and sum the
integers in the second components of those elements that have second
components. The function sumElList in Fig. 6.2 does this task.
(1) fun sumElList(nil) = 0
(2) |   sumElList(S(x)::L) = sumElList(L)
(3) |   sumElList(P(x,y)::L) = y + sumElList(L);
val sumElList = fn : ('a, int) element list -> int
Figure 6.2: Summing second components of element's
Line (1) handles the basis case; when the list is empty the sum is 0. Line (2)
handles the case where the first element is a single. Then, there is no contribu-
tion to the sum from the head element, so the result is obtained by a recursive
application of sumElList to the tail. Line (3) handles the case where the head
element is a pair. We recursively apply the function to the tail and then add
to the resulting sum the second component of the pair at the head.
• The function sumElList does not constrain the type for first components
of pairs. However, line (1) tells us the result of sumElList is an integer.
The addition on line (3) must therefore be integer addition, so the second
components of pairs are integers. Thus, the domain type for the function
sumElList is ('a, int) element list.
• In lines (2) and (3), we use the data constructors P and S as part of the
pattern to distinguish the two cases of elements. This style is very common
when we program with datatypes. We use one pattern for each data
constructor, so each kind of value belonging to the datatype is handled
appropriately.
Finally, we can apply function sumElList to a particular list as
sumElList [P("in",6), S("function"), P("as",2)];
val it = 8 : int
When we apply the function sumElList to this particular list, we deduce that
the type 'a for this list is string. Notice that we have safely omitted the
parentheses around the argument of sumElList, since the square brackets around a
list guarantee that the argument cannot be misinterpreted. □
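The one-pattern-per-constructor style carries over to other functions on the same datatype. The sketch below repeats the declarations of Figs. 6.1 and 6.2 so that it is self-contained and adds a hypothetical companion function, firsts, that uses the first component of both kinds of element.

```sml
datatype ('a, 'b) element = P of 'a * 'b | S of 'a

(* Sum the second components of the pairs; singles contribute nothing. *)
fun sumElList nil = 0
  | sumElList (S _ :: L) = sumElList L
  | sumElList (P (_, y) :: L) = y + sumElList L

(* Hypothetical companion: collect the first component of every element. *)
fun firsts nil = nil
  | firsts (S x :: L) = x :: firsts L
  | firsts (P (x, _) :: L) = x :: firsts L

val elems = [P ("in", 6), S "function", P ("as", 2)]
val sum = sumElList elems
val ws = firsts elems
```

Here sum is 8 and ws is ["in","function","as"]. Unlike sumElList, firsts places no constraint on the second type parameter, since it never inspects a second component.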
6.2.3 Recursively Defined Datatypes
The datatype element of Example 6.4 did not involve nesting of constructors to
build values. Rather, we only applied each constructor to appropriate values to
form the values of the new type. In many interesting and important examples,
values are built by applying the data constructors recursively to build arbitrarily
large expressions.
Example 6.5: A (labeled) binary tree is defined recursively as follows.
BASIS: The empty tree is a binary tree.
INDUCTION: If T1 and T2 are binary trees, and a is a label, then we may form
another binary tree T by creating a node with label a, left subtree T1, and right
subtree T2. The new node is the root of T.
We represent the empty tree by the absence of any mark. A node is repre-
sented by its label, a line to the lower left running to the root of its left subtree,
and a line to its lower right running to the root of its right subtree. If either
subtree is empty, we omit the line to that subtree. Figure 6.3 shows an example
binary tree with strings as labels.
        ML
       /  \
     as    types
    /  \
   a    in
Figure 6.3: An example of a binary tree labeled by strings
Figure 6.4 is a datatype declaration for a binary tree with a type parameter
'label representing the type of labels in the tree. The type constructor for
this datatype is btree. In any use, the variable 'label would be replaced by
the actual type. For instance, the binary tree of Fig. 6.3, having string labels,
is a value of type string btree.
(1) datatype 'label btree =
(2)     Empty |
(3)     Node of 'label * 'label btree * 'label btree;
datatype 'a btree = Empty | Node of 'a * 'a btree * 'a btree
Figure 6.4: Datatype definition for binary trees
Line (1) of Fig. 6.4 declares the name of the datatype, btree, and its type
parameter 'label. Since there is only one type parameter, we have exercised
our option to omit parentheses around the parameter. Line (2) gives Empty as
a data constructor. This constructor takes no argument and will appear only as
an identifier, just as the fruit names used as data constructors in Example 6.3
appear by themselves. Line (3) introduces the data constructor Node, which is
applied to a triple of values. The first is of type 'label and the second two
are of type 'label btree; that is, they are of the same type as the type being
defined. The response of ML repeats these points.
The values of type 'label btree are defined recursively as follows.
BASIS: The data constructor Empty is a value.
INDUCTION: An expression of the form Node(a, L, R) is a value, if a is of
the 'label type and L and R are 'label btree's, that is, binary trees with
appropriately typed labels.
In general, the values for a datatype are those that are constructed by
applying the data constructors to values of the appropriate types, as many
times as we wish.
For instance, consider the tree of Fig. 6.3. First, we note that the labels
are strings, so 'label must have the value string, and this tree is a
string btree. The leaves (nodes with two empty subtrees) are represented by
an expression of the form Node(a, Empty, Empty), where a is the label of the
node. For instance, the node labeled "types" is represented by the expression
Node("types", Empty, Empty).
Then we can work up the tree, constructing the expression for a node after
the expressions for its two subtrees have been constructed. Thus, after handling
all three leaves, we can work on the node labeled "as". This node has the
expression
Node("as", Node("a",Empty,Empty), Node("in",Empty,Empty))
That is, the first component is the label, "as". The second component is
the expression for the left subtree, which consists of the leaf labeled "a"; this
has the expression Node("a", Empty, Empty). The third component is a
similar expression for the leaf labeled "in".
Finally, the expression for the root uses the expression for the node labeled
"as" for its left subtree and the expression for the leaf labeled "types" as its
right subtree. When combined with the label at the root, we find that the
expression for the entire tree is that shown in Fig. 6.5. □
Node("ML",
    Node("as",
        Node("a",Empty,Empty),
        Node("in",Empty,Empty)
    ),
    Node("types",Empty,Empty)
)
Figure 6.5: A value of type string btree
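Functions on a recursively defined datatype usually follow the recursion of the type itself. As a sketch, the value of Fig. 6.5 can be bound to an identifier and examined by two small functions; the names t, size, and inorder are hypothetical and do not appear in the text.

```sml
datatype 'label btree = Empty | Node of 'label * 'label btree * 'label btree

(* The tree of Fig. 6.3, written as in Fig. 6.5. *)
val t = Node ("ML",
              Node ("as",
                    Node ("a", Empty, Empty),
                    Node ("in", Empty, Empty)),
              Node ("types", Empty, Empty))

(* Count the nodes: one case for the basis, one for the induction. *)
fun size Empty = 0
  | size (Node (_, left, right)) = 1 + size left + size right

(* List the labels: left subtree first, then the root, then the right subtree. *)
fun inorder Empty = nil
  | inorder (Node (a, left, right)) = inorder left @ (a :: inorder right)

val n = size t
val labels = inorder t
```

For this tree, n is 5 and labels is ["a","as","in","ML","types"].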
6.2.4 Mutually Recursive Datatypes
Occasionally we need to define two or more datatypes in a mutually recursive
way. We can do so by connecting the definitions with the keyword and. Type,
as well as datatype, definitions may also be connected with and, but there is
less need to do so.
Example 6.6: We can define an even tree to be a binary tree in which each
path from the root to a node with one or two empty subtrees has an even
number of nodes. As a special case, the empty tree, whose paths we may
regard as having length 0, is an even tree. Similarly, an odd tree is a binary tree
all of whose paths from the root to a leaf or to a node with one empty subtree
have an odd number of nodes. The tree of Fig. 6.3 is neither even nor odd,
because there is an even-length path from the root to leaf "types" and there
are odd-length paths from the root to the other two leaves.
There is a simple, mutually recursive definition of the datatypes evenTree
and oddTree.
BASIS: The empty tree is an even tree.
INDUCTION: A node with a label of type 'label and two subtrees that are
odd trees is the root of an even tree. A node with a label of type 'label and
two subtrees that are even trees is the root of an odd tree.
Lines (1) through (4) of Fig. 6.6 show this mutually recursive definition in
ML. Line (1) makes Empty a constructor for even trees, and line (2) makes
Enode be a constructor that takes a label and two odd trees to construct an
even tree. Similarly, line (4) makes Onode the only constructor of odd trees,
taking a label and two even trees.
    datatype
(1)     'label evenTree = Empty |
(2)         Enode of 'label * 'label oddTree * 'label oddTree
    and
(3)     'label oddTree =
(4)         Onode of 'label * 'label evenTree * 'label evenTree;
datatype 'a evenTree = Empty | Enode of 'a * 'a oddTree * 'a oddTree
datatype 'a oddTree = Onode of 'a * 'a evenTree * 'a evenTree
(5) val t1 = Onode(1,Empty,Empty);
val t1 = Onode(1,Empty,Empty) : int oddTree
(6) val t2 = Onode(2,Empty,Empty);
val t2 = Onode(2,Empty,Empty) : int oddTree
(7) val t3 = Enode(3,t1,t2);
val t3 = Enode(3, Onode(1,Empty,Empty),
    Onode(2,Empty,Empty)) : int evenTree
(8) val t4 = Onode(4,t3,Empty);
val t4 = Onode(4, Enode(3, Onode #, Onode #), Empty) :
    int oddTree
(9) val t5 = Enode(5,t4,t4);
val t5 = Enode(5, Onode(4, Enode #, Empty),
    Onode(4, Enode #, Empty)) : int evenTree
Figure 6.6: Mutually recursive datatype definitions
Lines (5) through (9) use the data constructors Onode and Enode to build
some odd and even trees. Line (5) creates an odd tree consisting of a single
node labeled 1. Note that ML now deduces that for this tree the type 'label
is integer. Line (6) similarly creates a node labeled 2. It, like all single-node
trees, is an odd tree.
Line (7) uses the two odd trees from the previous two lines to create an even
tree whose root has label 3, whose left subtree is the single node labeled 1, and
whose right subtree is the single node labeled 2. Notice ML’s response, which
gives the expression for this tree and identifies its type as an int evenTree.
Line (8) builds another odd tree by taking a root node with label 4, the
even tree t3 from line (7), and an empty even tree as left and right subtrees.
Finally, line (9) takes a root node labeled 5 and two copies of the odd tree
t4 from line (8) and creates another even tree. Each # in the response to
line (9) represents the tree t3. Figure 6.7(a) shows the complete expression for
t5 expanded out, while Fig. 6.7(b) is a picture of this even tree. Note that each
root-to-leaf path has an even number of nodes. □
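Functions over mutually recursive datatypes are naturally mutually recursive themselves, and they are connected with and just as the datatype definitions are. As a small sketch, the functions countEven and countOdd below (names of our own choosing, not from the text) count the nodes of an even tree and an odd tree, respectively.

```sml
datatype 'label evenTree = Empty
  | Enode of 'label * 'label oddTree * 'label oddTree
and 'label oddTree =
    Onode of 'label * 'label evenTree * 'label evenTree;

(* countEven handles even trees and calls countOdd on the subtrees;
   countOdd handles odd trees and calls countEven on the subtrees. *)
fun countEven(Empty) = 0
  | countEven(Enode(_,l,r)) = 1 + countOdd(l) + countOdd(r)
and countOdd(Onode(_,l,r)) = 1 + countEven(l) + countEven(r);

(* The trees of Fig. 6.6 *)
val t1 = Onode(1,Empty,Empty);
val t2 = Onode(2,Empty,Empty);
val t3 = Enode(3,t1,t2);
val t4 = Onode(4,t3,Empty);
val t5 = Enode(5,t4,t4);

val n = countEven(t5);   (* 1 + 4 + 4 = 9 nodes *)
```

Note that countOdd needs no case for an empty tree, because there is no empty odd tree.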
The reader familiar with a language like Pascal or C that uses pointers as
a type constructor may have seen trees constructed by pointers in records that
represent nodes. Notice that the ML approach is somewhat different. The
value of the tree is represented in ML programs by an expression built from
data constructors applied to arguments. For example, the value of t3 printed
after line (7) of Fig. 6.6 does not involve any pointers to the values of t1 or t2.
The distinction is seen if identifiers such as t1 or t2 are redefined. Then,
the value of t3 does not change. However, had a tree like t3 been constructed
in Pascal or C, with pointers to the trees represented by t1 and t2, the value
of tree t3 would change as a side-effect of the change to trees t1 or t2.
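The point about redefinition can be checked directly. In the sketch below, which reuses the datatypes and the first three trees of Fig. 6.6, the identifier t1 is rebound after t3 has been built; the name unchanged is ours.

```sml
datatype 'label evenTree = Empty
  | Enode of 'label * 'label oddTree * 'label oddTree
and 'label oddTree =
    Onode of 'label * 'label evenTree * 'label evenTree;

val t1 = Onode(1,Empty,Empty);
val t2 = Onode(2,Empty,Empty);
val t3 = Enode(3,t1,t2);

(* Rebind t1. This introduces a new binding for the identifier;
   it does not touch the value from which t3 was constructed. *)
val t1 = Onode(100,Empty,Empty);

(* t3 still has the value it had before t1 was redefined *)
val unchanged =
    (t3 = Enode(3, Onode(1,Empty,Empty), Onode(2,Empty,Empty)));
```

Here unchanged is true, since int evenTree is an equality type and the value of t3 is the same expression as before.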
6.2.5 Exercises for Section 6.2
Exercise 6.2.1: Give an example of a value of type int btree, where btree
is the datatype defined in Example 6.5. Your tree should have 3 nodes.
Exercise 6.2.2: Define a type (not a datatype) mapTree that is a specializa-
tion of the btree datatype to have a label type that is a set of domain-range
pairs. Then, define a tree t1 that has a single node with the pair ("a",1) at
the root.
Exercise 6.2.3: Write a function that takes a btree as its argument and
returns a pair consisting of the left and right subtrees. Define an exception for
the erroneous case where the tree is empty.
Exercise 6.2.4: In Fig. 4.10 we wrote a program to read and sum integers,
using the awkward convention that -1 represented the situation where the end
of file had been reached and no integer was available. A better approach is
to define a datatype intOrEof that has one data constructor Eof to represent
the absence of an integer and another data constructor Integer that wraps an
integer. Rewrite Fig. 4.10 to use this strategy and avoid the use of -1, which
was represented there by the identifier END.
Enode(5,
    Onode(4,
        Enode(3,
            Onode(1,Empty,Empty),
            Onode(2,Empty,Empty)),
        Empty
    ),
    Onode(4,
        Enode(3,
            Onode(1,Empty,Empty),
            Onode(2,Empty,Empty)),
        Empty
    )
)
(a) Nested expression for tree t5
            5
          /   \
         4     4
        /     /
       3     3
      / \   / \
     1   2 1   2
(b) Picture of tree t5
Figure 6.7: Representations of the tree t5 from Fig. 6.6
Eliding Parts of Values
Notice in the response that SML/NJ only shows complicated expressions
for a fixed number of levels and elides deeper structure with the # sign.
For instance, in the response to line (8) of Fig. 6.6 the first # stands for
the tree t1, or Onode(1,Empty,Empty), and the second # stands for t2,
or Onode(2,Empty,Empty).
Options as a Datatype
Notice that the option type constructor first introduced in Section 4.2.5 is
actually a built-in datatype, with constructors NONE and SOME of 'a for
any type 'a. Its uses are similar to those of the datatype intOrEof in
Exercise 6.2.4. In fact, if we don't mind the less descriptive data constructors
NONE and SOME, we can use option in place of intOrEof, with data
constructor NONE replacing Eof and SOME of int replacing Integer of int.
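The correspondence between the two datatypes can be made concrete. The sketch below declares intOrEof as Exercise 6.2.4 describes it; the functions toOption and sumOpts are hypothetical helpers of our own, introduced only to show the two sets of constructors side by side.

```sml
(* The datatype suggested in Exercise 6.2.4 *)
datatype intOrEof = Eof | Integer of int;

(* Translate an intOrEof into the corresponding int option *)
fun toOption(Eof) = NONE
  | toOption(Integer(i)) = SOME i;

(* Sum the integers on a list of int option, skipping each NONE *)
fun sumOpts(nil) = 0
  | sumOpts(NONE::xs) = sumOpts(xs)
  | sumOpts(SOME(i)::xs) = i + sumOpts(xs);

val s = sumOpts(map toOption [Integer 3, Eof, Integer 4]);
```

Patterns on NONE and SOME are written exactly like patterns on Eof and Integer, which is why one datatype can stand in for the other.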
! Exercise 6.2.5: Tell whether a type or a datatype declaration would be more
suitable for the following. Give the appropriate declaration.
* a) A type whose values are the suits of a card deck.
b) A type whose elements are either lists of (only) integers or lists of (only)
reals.
* c) A type whose values are “things,” where a “thing” is either an integer or
a list of “things.”
d) A (parameterized) type whose values are pairs whose components can be
of any type, as long as they are of the same type.
! Exercise 6.2.6: Define mutually recursive datatypes zeroTree, oneTree, and
twoTree to be those binary trees whose every path from the root to a node
with at least one empty subtree has length whose remainder when divided by
3 is 0, 1, or 2 respectively.
*! Exercise 6.2.7: We can define a graph with nodes of some type 'node as a list
of pairs. Each pair consists of a node of type 'node and a list of its successors.
a) Write this type definition.
More Equality Types
Both types and datatypes may be used to define more equality types.
• A type is an equality type if the type that it stands for is an equality
type.
• A datatype is an equality type if its constructor expressions, if any,
involve only equality types or the datatype itself.
• A mutually recursive collection of datatypes are equality types if
their constructor expressions involve only equality types and the
datatypes in the collection.
b) Write a function succ(a,G) that produces the set (represented by a list)
of successors of node a in graph G. If a is not a node of G, then raise the
exception NotANode.
!¥c) Write a function search(a,G) that finds the set of nodes reachable from
node a in graph G, including a itself. Hint: It helps to write an auxiliary
function search1(L,R,G) that finds all the nodes that are reachable from
one or more of the nodes on list L in graph G, without going through a
node on the list R. Function search1 then returns all the nodes it has
reached plus all the nodes on R. We may use parameter R of search1 to
keep track of nodes we have already reached in our search. We thus avoid
getting trapped in infinite loops, even if the graph G has cycles.
! Exercise 6.2.8: In propositional logic, statements are represented by propo-
sitional variables, which we may think of as identifiers. Logical expressions can
be built from propositional variables by applying a number of logical operators.
In our exercise, we shall define logical expressions and their truth values in a
simple but useful form as follows.
BASIS: A propositional variable is a logical expression. Its truth value may be
assigned to be either true or false.
INDUCTION: If E1 and E2 are logical expressions, then
1. AND(E1, E2) is a logical expression, and its value is true if and only if both
E1 and E2 have the value true.
2. OR(E1, E2) is a logical expression, and its value is true if either E1 or E2
or both have the value true.
3. NOT(E1) is a logical expression whose value is true if and only if the value
of E1 is false.
An example of a propositional expression is AND(OR(p,q), NOT(p)). Do the
following:
a) Devise a datatype whose values represent logical expressions as described
above. You may assume that propositional variables are represented by
strings.
b) Write a function eval (E,L) that takes a logical expression E and a list of
true propositional variables L, and determines the truth value of E on the
assumption that the propositional variables on L are true and all other
propositional variables are false.
6.3 Case Study: Binary Trees
In this section, we shall solidify our familiarity with datatypes by writing a
number of functions on binary trees, using the datatype btree introduced in
Example 6.5. Most of these functions involve the “binary search tree” described
below.
6.3.1 Binary Search Trees
Binary search trees are binary trees whose labels obey a particular property
called the binary search tree property, or BST property, which we shall define
shortly. The BST property only makes sense if there is an ordering relation,
often referred to as <, that allows us to compare values of the label type. For
example, the types int, real, char, and string have this ordering. To be
more general, we shall only assume that there is a predicate lt(x,y) obeying
the important properties of < on integers, reals, or strings. These properties
are:
1. Transitivity. That is, lt(x,y) and lt(y,z) imply lt(x,z), just as x < y and
y < z tell us that x < z.
...
1. A less-than function of type 'a * 'a -> bool,
2. A 'a btree, and
3. An element of type 'a.
The range type of lookup is bool. We designed lookup so that it could be
partially instantiated with a particular less-than function to create a two-parameter
function that looks up an element in a binary search tree, using this less-than
function.
Now, let us study how lookup works. Line (1) of Fig. 6.9 handles one of
the basis cases, where the tree is empty. Line (2) provides the pattern that
matches all other cases. Variable x matches the value being searched for, and
Node(y,left,right) matches the expression for any binary tree except the
empty tree. In the match, y acquires the value of the label, left gets the value
of the left subtree, and right gets the value of the right subtree.
Line (3) handles the case where the desired label is less than the label at the
root; we must then search only the left subtree. Line (4) handles the opposite
case, where the desired label is greater than the label at the root and we must
search the right subtree. The only remaining case, in line (5), is when x and
y are equal. Then, x has been found, which is the second basis case. Function
lookup thus returns true at line (5).
Line (6) defines a variable t to be the specific tree of Fig. 6.5. Then, at
line (7), we call lookup, searching for the word "function" in the tree t, using
the specific comparison function strLT. We first compare "function" with
"ML", and at line (3) go to the left subtree, rooted at "as". There we find
"function" follows "as", so we go to the right, the tree rooted at "in". Next,
we find "function" precedes "in", so we go to the left subtree. This tree is
empty, so at the next call the pattern of line (1) applies, and we return false.
The desired label "function" is not in the tree.
• Notice how pattern matching is used in the function lookup to determine
which data constructor is used at the outermost layer of the expression
representing the binary tree.
• Pattern matching is also used to pick apart the structure of the tree and
allow us to attack pieces of the expression recursively.
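The line-by-line description of lookup given above can be collected into code. What follows is a sketch consistent with that description, not a transcription of Fig. 6.9; the numbers in comments mirror the line numbers used in the discussion. We assume the btree datatype of Example 6.5 and the argument order lookup lt T x. The definition of strLT is also an assumption: since "as" must precede "ML" for the tree of Fig. 6.5 to have the BST property, we take strLT to compare strings without regard to case, using a helper lower of our own.

```sml
datatype 'label btree =
    Empty
  | Node of 'label * 'label btree * 'label btree;

fun lookup lt Empty x = false                     (* (1) empty tree *)
  | lookup lt (Node(y,left,right)) x =            (* (2) nonempty tree *)
      if lt(x,y) then lookup lt left x            (* (3) go left *)
      else if lt(y,x) then lookup lt right x      (* (4) go right *)
      else true;                                  (* (5) x equals y *)

(* Assumed comparison: less-than on strings, ignoring case *)
fun lower(s) = String.map Char.toLower s;
fun strLT(x,y) = lower(x) < lower(y);

(* The tree of Fig. 6.5 *)
val t = Node("ML",
            Node("as",
                Node("a",Empty,Empty),
                Node("in",Empty,Empty)),
            Node("types",Empty,Empty));

val found   = lookup strLT t "function";   (* not in the tree *)
val foundIn = lookup strLT t "in";         (* in the tree *)

(* Partial instantiation gives a two-parameter string lookup *)
val lookupStr = lookup strLT;
```

With these definitions, the search for "function" visits "ML", "as", and "in" and then reaches an empty tree, exactly as traced above.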
6.3.3 Insertion into Binary Search Trees
Next, let us look at a similar function insert to insert an element into a binary
search tree in the appropriate place. The function insert lt T x returns
Avoiding Equality Tests in Searches
We have been very careful in lookup not to assume that the type of labels
is an equality type. We use only the given lt comparison function to
compare elements x and y, and we discover x = y when we determine that
both lt(x,y) and lt(y,x) are false.
...
val insert = fn : ('a * 'a -> bool) -> 'a btree -> 'a -> 'a btree
(6) insert strLT t "function";
val it = Node ("ML",Node ("as",Node #,Node #),
    Node ("types",Empty,Empty)) : string btree
Figure 6.10: Insertion into a binary search tree
Line (1) handles the case of insertion into an empty tree, where a one-node
tree is returned. Line (2), like the same line in Fig. 6.9, matches any nonempty
tree and breaks it into its important components. However, the whole tree is
also matched to the identifier T at line (2), using the keyword as.
In line (3), we are directed to insert into the left subtree. We return a tree
whose root label is the same as it was: y. The right subtree is also the same
as it was, but the left subtree is modified to be whatever the recursive call to
insert lt left x produces. That will be the left subtree, modified to include
a node labeled x at the appropriate place. Line (4) does the symmetric thing
when x must be inserted into the right subtree. Finally, line (5) handles the
remaining case, where the element x to be inserted is at the root of the tree T.
Since x is already in the tree, we need make no change and can just return T.
Line (6) shows a call to insert with first argument the comparison function
strLT from Fig. 6.9, second argument the tree t defined in that figure, and
third argument "function". We eventually find our way to the empty tree that is the left
subtree of the node labeled "in". In the tree constructed as the return value,
that empty subtree is replaced by a tree with one node labeled "function". The
resulting two-node tree replaces the one-node tree whose node is labeled "in",
yielding a four-node tree to replace the three-node tree whose root is labeled
"as". Figure 6.11 shows the tree returned by the original call to insert.
            "ML"
           /    \
        "as"    "types"
       /    \
    "a"     "in"
            /
     "function"
Figure 6.11: Binary search tree after insertion of "function"
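The description of lines (1) through (5) of insert can likewise be gathered into code. This is a sketch that follows the prose, with the numbers in comments matching the discussion; it assumes the btree datatype of Example 6.5, and, as an assumption of ours, a case-insensitive strLT built from a helper lower.

```sml
datatype 'label btree =
    Empty
  | Node of 'label * 'label btree * 'label btree;

fun insert lt Empty x = Node(x,Empty,Empty)                  (* (1) *)
  | insert lt (T as Node(y,left,right)) x =                  (* (2) *)
      if lt(x,y) then Node(y,(insert lt left x),right)       (* (3) *)
      else if lt(y,x) then Node(y,left,(insert lt right x))  (* (4) *)
      else T;                                                (* (5) *)

(* Assumed comparison: less-than on strings, ignoring case *)
fun lower(s) = String.map Char.toLower s;
fun strLT(x,y) = lower(x) < lower(y);

(* The tree of Fig. 6.5 *)
val t = Node("ML",
            Node("as",
                Node("a",Empty,Empty),
                Node("in",Empty,Empty)),
            Node("types",Empty,Empty));

(* The tree of Fig. 6.11 *)
val t' = insert strLT t "function";

(* Inserting an element already present returns the tree as it was *)
val same = insert strLT t "in";
```

The identifier t still denotes the original five-node tree after the call; insert builds a new value and leaves its argument alone.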
6.3.4 Deletion from Binary Search Trees
We can delete a specified element x from a binary search tree, but the strategy
is a bit more complicated than lookup or insertion. As for insertion, we leave
the given tree intact but return a modified version of the tree in which x does
not appear. The idea is expressed by the following recursive algorithm, which
curiously has most of the effort in the basis case where the targeted element is
found at the root of the tree.
BASIS: Nothing needs to be done to delete x from an empty tree; x is not there
anyway. Just return the given tree. To delete x from a tree whose root has
label x, modify the tree to remove x and maintain the BST property as follows.
1. If the root has at least one empty subtree, replace the tree by the other
subtree.
2. If the root has two nonempty subtrees:
(a) Find the least element in the right subtree,
(b) Delete it from the right subtree (which is always an example of
case (1) above, since the least element in a binary search tree must
have an empty left subtree), and
(c) Make the least label be the label of the root, replacing x.
Return the resulting tree.
INDUCTION: To delete x from a tree whose root has label y, where x ≠ y,
recursively delete x from the left subtree if x < y and from the right subtree if
y < x.
Figure 6.12 gives the necessary code to implement this algorithm. To un-
derstand it, start with the function delete of lines (7) through (16). Line (7)
handles the first part of the basis, where we try to delete x from the empty tree
and simply return the empty tree. In line (8) we handle all other cases, where
we need to dissect the tree into its important components as we did for lookup and
insert.
Lines (9) and (10) cover the inductive cases, where we must delete from the
left or right subtree. For instance, in line (9) we must delete from the left. To
do so, we assemble the result by taking the original label y, the original right
subtree, and the left subtree that we get by deleting x from the original left
subtree.
Lines (11) through (16) handle the hard basis case, where x has been found
at the root and we must rearrange the tree. Line (11) is the beginning of a
case statement, in which we consider whether one or both of the left and right
subtrees are empty. Lines (12) and (13) are for the case where one of the
subtrees is empty and we return the other.
If neither subtree is empty, then the case of line (14) applies. We call the
function deletemin (to be described next) on the right subtree at line (15).
This function returns a pair:
1. z, the smallest label in the right subtree, and
2. r1, the tree that results from deleting z from the original right subtree.
(1) exception EmptyTree;
exception EmptyTree
    (* deletemin(T) returns a pair consisting of the least
       element y in tree T and the tree that results if we
       delete y from T. It is an error if T is empty *)
(2) fun deletemin(Empty) = raise EmptyTree
(3)   | deletemin(Node(y,Empty,right)) = (y,right) (* The
          critical case. If the left subtree is empty,
          then the element at current node is min. *)
(4)   | deletemin(Node(w,left,right)) =
          let
(5)           val (y,L) = deletemin(left)
          in
(6)           (y, Node(w,L,right))
          end;
val deletemin = fn : 'a btree -> 'a * 'a btree
(7) fun delete lt Empty x = Empty
(8)   | delete lt (Node(y,left,right)) x =
(9)       if lt(x,y) then Node(y,(delete lt left x),right)
(10)      else if lt(y,x) then Node(y,left,(delete lt right x))
          else (* x=y *)
(11)          case (left,right) of
(12)              (Empty,r) => r |
(13)              (l,Empty) => l |
(14)              (l,r) =>
                      let
(15)                      val (z,r1) = deletemin(r)
                      in
(16)                      Node(z,l,r1)
                      end;
val delete = fn : ('a * 'a -> bool) -> 'a btree -> 'a -> 'a btree
Figure 6.12: Deletion from a binary search tree
In line (16) we assemble the result, which has z in place of x at the root, the
original left subtree, and the revised right subtree r1. Note that because z is
the least label of the right subtree, the BST property is satisfied with z at the
root. It surely precedes anything in tree r1, and because x < z, it must be that
anything in the left subtree precedes z as well as x.
Now let us consider the function deletemin. This function is only called on
nonempty trees, so we use the exception EmptyTree in line (1) and raise it in
line (2) if somehow the function is called on an empty tree. Note that, unlike
the functions insert, delete, and lookup, function deletemin does not need
the comparison function lt as an argument.
The least label in a binary search tree is found by following left branches
until we reach a node that has an empty left subtree. The situation is suggested
in Fig. 6.13. Thus line (3) handles the case where we are at a tree with an empty
left subtree. We return the pair consisting of:
1. The label of this node, which must be the least element, and
2. The right subtree, which is what is left when we delete the node with the
least element.
Notice that this deletion is an easy basis case, where one of the subtrees is
empty.
Figure 6.13: Locating the least label in a tree
Lines (4) through (6) handle the case where the left subtree is not empty and
we must search further. In line (5) we recursively apply deletemin to the left
subtree, obtaining a pair (y, L) consisting of the least label y and what remains
of the left subtree after deleting y. In line (6) we assemble the desired result,
a pair with y as the first component. For the second component we construct
a tree with the original label w, the new left subtree L, and the original right
subtree.
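To see the three kinds of deletion at work, we can apply the functions of Fig. 6.12 to the tree of Fig. 6.5. The sketch below repeats those functions without line numbers so that it can be run on its own. The btree datatype is assumed to be that of Example 6.5, the case-insensitive strLT and its helper lower are assumptions of ours, and d1, d2, and d3 are names chosen for illustration.

```sml
datatype 'label btree =
    Empty
  | Node of 'label * 'label btree * 'label btree;

exception EmptyTree;

(* Least label of a nonempty tree, paired with the tree without it *)
fun deletemin(Empty) = raise EmptyTree
  | deletemin(Node(y,Empty,right)) = (y,right)
  | deletemin(Node(w,left,right)) =
      let
        val (y,L) = deletemin(left)
      in
        (y, Node(w,L,right))
      end;

fun delete lt Empty x = Empty
  | delete lt (Node(y,left,right)) x =
      if lt(x,y) then Node(y,(delete lt left x),right)
      else if lt(y,x) then Node(y,left,(delete lt right x))
      else (* x=y *)
        case (left,right) of
            (Empty,r) => r
          | (l,Empty) => l
          | (l,r) =>
              let
                val (z,r1) = deletemin(r)
              in
                Node(z,l,r1)
              end;

(* Assumed comparison: less-than on strings, ignoring case *)
fun lower(s) = String.map Char.toLower s;
fun strLT(x,y) = lower(x) < lower(y);

val t = Node("ML",
            Node("as",
                Node("a",Empty,Empty),
                Node("in",Empty,Empty)),
            Node("types",Empty,Empty));

val d1 = delete strLT t "a";          (* a leaf: an empty subtree replaces it *)
val d2 = delete strLT t "as";         (* two nonempty subtrees: "in" moves up *)
val d3 = delete strLT t "function";   (* not present: the result equals t *)
```

In d2, deletemin is applied to the one-node tree labeled "in" and returns the pair ("in", Empty), so "in" takes the place of "as" with an empty right subtree.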
6.3.5 Some Comments About Running Time
We can, and shall, demonstrate that the operations of lookup, insert, and delete
on binary search trees take time proportional to the length of the path followed.
A path in a “typical” binary search tree will have length that is logarithmic
in the number of nodes in the tree. So, binary search trees are a very efficient
way to represent large sets if we only want to perform these three operations.
We shall not try to prove the contention that the typical path is logarithmic in
length; the reader may rely on this observation if he or she is unfamiliar with
the lore surrounding binary trees.
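Although we do not prove the claim, a small experiment suggests why the word "typical" matters. The sketch below uses an insert function written to match the description in Section 6.3.3, together with helpers of our own (intLT, height, chain, bushy), and compares the longest path in two trees holding the same seven integers inserted in different orders.

```sml
datatype 'label btree =
    Empty
  | Node of 'label * 'label btree * 'label btree;

fun insert lt Empty x = Node(x,Empty,Empty)
  | insert lt (T as Node(y,left,right)) x =
      if lt(x,y) then Node(y,(insert lt left x),right)
      else if lt(y,x) then Node(y,left,(insert lt right x))
      else T;

fun intLT(x:int, y) = x < y;

(* Number of nodes on the longest path from the root *)
fun height(Empty) = 0
  | height(Node(_,l,r)) = 1 + Int.max(height l, height r);

(* Insert the elements of a list one at a time, from left to right *)
fun build(L) = foldl (fn (x,T) => insert intLT T x) Empty L;

(* Sorted order gives a single long path *)
val chain = build [1,2,3,4,5,6,7];

(* An order that places the middle elements first gives short paths *)
val bushy = build [4,2,6,1,3,5,7];

val hChain = height chain;   (* 7 *)
val hBushy = height bushy;   (* 3 *)
```

Sorted input is the unfavorable case; the logarithmic bound concerns trees built from elements arriving in no particular order.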
However, it is useful to verify the fact that the time taken by these algorithms
is proportional to the length of the path traversed. Function lookup in Fig. 6.9
is the easiest to analyze. Line (1) takes a constant amount of time, independent
of the size of the tree, to test whether the second argument is Empty and return
false if so. Line (2) takes a constant amount of time to examine the root node
of the tree and match variables x, y, left, and right in the pattern. Line (5)
takes a constant amount of time to return true if we reach that line.
Lines (3) and (4) require constant time to apply function lt.² We must also
consider the time consumed in line (3) or (4) by the recursive call to lookup.
However, whether we call lookup lt left x or lookup lt right x, we have
moved one node down the tree. Thus as we follow our path down the tree,
we spend only a constant amount of time at each node and then move down.
Hence the total amount of time spent is proportional to the length of the path
followed.
The argument about insert in Fig. 6.10 is similar. The only difference is
that after making the recursive call to insert at line (3) or (4), we have to
assemble a new tree, for example by evaluating the expression
Node(y,(insert lt left x),right)
in line (3). Is it possible that we have to copy the tree produced by
insert lt left x
in this situation?
Fortunately, it is not. ML is implemented in such a way that this value-
construction is done by pointers, just as it is typically done by programmers us-
ing conventional languages. In ML, these pointers are “behind the scenes,” and
the user may imagine that the trees themselves are manipulated. In practice,
the construction of a tree from given trees — or in general, any fixed number of
steps of value-construction using data constructors — takes a constant amount
of time, independent of the size of the values being manipulated.
Interestingly, we avoid copying values even in situations where it appears to
be necessary. For example, consider line (9) of Fig. 6.6, where we construct
the value of t5 from two copies of the value of t4. In ML implementations, the
value of t5 can have two pointers, each to the value of t4. Because values of
variables do not change in ML, we can rely on the value of t4 remaining the
same should we ever need the value of t5. In other languages, we normally
could not take this risk.
²Technically, lt could be any function that returns a boolean, no matter how complicated.
But lt does not get the tree as an argument, and so its running time could not depend on
the size of the tree.
6.3.6 Visiting All the Nodes of a Binary Tree
The previous examples each have the property that we follow one path down
a binary tree from the root to a leaf. There is another class of functions that
operate on a tree by visiting each node in a systematic order. We shall give two
simple examples.
Example 6.8: Let us write a function sum(T) that takes a binary tree (not
necessarily a binary search tree), defined by the btree datatype of Fig. 6.4,
and sums the labels of all the nodes, which we shall assume are integers. The
function sum is written below:
fun sum(Empty) = 0
| sum(Node(a,left,right)) = a + sum(left) + sum(right);
val sum = fn : int btree -> int
That is, the sum over an empty tree is 0, and the sum over any other tree is
the label at the root plus the sums over the left and right subtrees. As we see
from the ML response, the label type is forced to be integer because of the 0
on the first line. □
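As a brief illustration, we can apply sum to a small tree of integers. The tree t below is an example of our own, and the btree datatype is assumed to be that of Example 6.5. Note that t is not a binary search tree, and sum does not require it to be one.

```sml
datatype 'label btree =
    Empty
  | Node of 'label * 'label btree * 'label btree;

fun sum(Empty) = 0
  | sum(Node(a,left,right)) = a + sum(left) + sum(right);

(* A four-node tree with labels 5, 3, 8, and 7 *)
val t = Node(5,
            Node(3,Empty,Empty),
            Node(8,
                Node(7,Empty,Empty),
                Empty));

val total = sum(t);   (* 5 + 3 + 8 + 7 = 23 *)
```

Every node is visited exactly once, so the time taken is proportional to the number of nodes rather than to the length of a single path.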
6.3.7 Preorder Traversals
Our second example of a function that visits all the nodes of a binary tree
concerns “preorder traversals.” A preorder traversal of a tree is a listing of the
node labels by the following recursive algorithm.
1. List the label of the root.
2. In order from the left, list all the nodes of each subtree in preorder.
Example 6.9: Consider the tree of Fig. 6.8. To list its labels in preorder, we
first list the label at the root, "ML". Then we work on the 3-node subtree rooted
at "as". We list "as", next work on the left subtree "a" and finally work on
the right subtree "in". Any 1-node tree is listed in preorder by listing the label
alone, so we follow "as" by "a" and then "in".
Last, we return to the root, having listed its left subtree but not its right
subtree. The right subtree, being a 1-node tree, is listed by listing its label
"types". Thus
("ML", "as", "a", "in", "types")
is the complete preorder listing. □
Here is a function preOrder(T) that lists a binary tree in preorder.
fun preOrder(Empty) = nil
  | preOrder(Node(a,left,right)) =
        [a] @ preOrder(left) @ preOrder(right);
val preOrder = fn : 'a btree -> 'a list
This function follows the definition of preorder in a straightforward way. An
empty tree yields nothing, while any other tree yields the label of its root,
followed by the preorder listings of its left and right subtrees. Note that if
either or both of those subtrees is empty, it will not produce any contribution
to the preorder listing.
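We can check preOrder against Example 6.9. The sketch below also shows an alternative formulation, which is ours rather than the text's: the helper preAcc carries the part of the listing that comes after the current subtree as a second argument, so the list is assembled with :: alone and no use of @. The names preAcc and preOrder2 are hypothetical, and the btree datatype is assumed to be that of Example 6.5.

```sml
datatype 'label btree =
    Empty
  | Node of 'label * 'label btree * 'label btree;

fun preOrder(Empty) = nil
  | preOrder(Node(a,left,right)) =
        [a] @ preOrder(left) @ preOrder(right);

(* preAcc(T,acc) yields the preorder listing of T followed by acc.
   The right subtree is listed first, in front of acc; then the left
   subtree is listed in front of that; then the root label is added. *)
fun preAcc(Empty, acc) = acc
  | preAcc(Node(a,left,right), acc) =
        a :: preAcc(left, preAcc(right, acc));

fun preOrder2(T) = preAcc(T, nil);

(* The tree of Example 6.9 *)
val t = Node("ML",
            Node("as",
                Node("a",Empty,Empty),
                Node("in",Empty,Empty)),
            Node("types",Empty,Empty));

val listing  = preOrder(t);
val listing2 = preOrder2(t);
```

Both functions produce the listing of Example 6.9. The second does a constant amount of work per node, whereas each @ in the first takes time proportional to the length of its left operand.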
6.3.8 Exercises for Section 6.3
Exercise 6.3.1: Write functions to list the nodes of a binary tree in:
a) Postorder, in which the label at the root follows the postorder traversals
of the left and right subtrees.
b) Inorder, in which the label of the root is in between the inorder traversals
of the left and right subtrees.
Exercise 6.3.2: Suppose we define the type mapTree as in Exercise 6.2.2 (see
the solutions if you have not worked this exercise yourself). This type is a
binary tree whose labels are pairs, which we may think of as the domain and
range values of a pair in some mapping. We may use a mapTree as a sort of
binary search tree, if we use a < ordering on the domain (first) component of
each pair only.
* a) Write a function lookup lt T a that searches tree T for a pair (a,b)
for some b and returns b. The comparison function lt compares domain
elements of the pairs at the nodes of tree T and guides our search down
the tree. If there is no such pair, then raise the exception Missing.
b) Write a function assign lt T a b that looks in tree T for a pair (a,c)
and, if found, replaces c by b. If no such pair is found, assign inserts the
pair (a,b) in the tree in a position that preserves the BST property. As in
(a), comparison function lt applies to domain elements of the pairs and
guides the search down the tree T.
Exercise 6.3.3: Partially instantiate the functions lookup, insert, and
delete defined in this section to give two-argument functions that operate on a
binary search tree and a value, where the less-than function is:
* a) < on reals.
b) Lexicographic order on pairs of integers. That is, (a,b) < (c,d) if a =
A label may be any identifier, or it may be a string of digits; the latter is a very
important special case, as we shall see.
The type of a record is a comma-separated list of elements of the form