The SELF 2.0 Programmer's Reference Manual


The SELF Programmer's Reference Manual

Copyright (c) 1992, Sun Microsystems, Inc. and Stanford University. All Rights Reserved.

Sun Microsystems, Inc., 2550 Garcia Avenue, Mountain View, CA 94043 USA

RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c) (1) (ii) of the Rights in Technical Data and Computer Software Clause at DFARS 252.227-7013 (Oct. 1988) and FAR 52.227-19(c) (June 1987).

SOFTWARE LICENSE: The software described in this manual may be used internally, modified, copied and distributed to third parties, provided each copy of the software contains both the copyright notice set forth above and the disclaimer below.

DISCLAIMER: Sun Microsystems, Inc. makes no representations about the suitability of this software for any purpose. It is provided to you "AS IS", without express or implied warranties of any kind. Sun Microsystems, Inc. disclaims all implied warranties of merchantability, fitness for a particular purpose and non-infringement of third party rights. Sun Microsystems, Inc.'s liability for claims relating to the software shall be limited to the amount, if any, of the fees paid by you for the software. In no event will Sun Microsystems, Inc. be liable for any special, indirect, incidental, consequential or punitive damages in connection with or arising out of this license (including loss of profits, use, data, or other economic advantage), however it arises, whether for breach of warranty or in tort, even if Sun Microsystems, Inc. has been advised of the possibility of such damage.

Table of Contents

Introduction
1 Overview of the SELF System
  1.1 The system
  1.2 The translation process

Language Reference
2 Objects
  2.1 Syntax
  2.2 Data objects
  2.3 The assignment primitive
  2.4 Objects with code
  2.5 Construction of object literals
3 Slot descriptors
  3.1 Slot privacy
  3.2 Read-only slots
  3.3 Read/write slots
  3.4 Slots containing methods
  3.5 Parent slots
4 Expressions
  4.1 Unary messages
  4.2 Binary messages
  4.3 Keyword messages
  4.4 Implicit-receiver messages
  4.5 Resending messages
5 Message lookup semantics
  5.1 Message send
  5.2 The lookup algorithm
  5.3 Privacy
  5.4 Resend
6 Lexical elements
  6.1 Character set
  6.2 Identifiers
  6.3 Keywords
  6.4 Arguments
  6.5 Operators
  6.6 Numbers
  6.7 Strings
  6.8 Comments
Appendix A Glossary
Appendix B Lexical overview
Appendix C Syntax overview
Appendix D Built-in types

The SELF World
7 World Organization
  7.1 The Lobby
  7.2 Names and Paths
8 The Roots of Behavior
  8.1 Default Behavior
  8.2 The Root Traits: Traits Clonable and Traits Oddball
  8.3 Mixins
9 Blocks, Booleans, and Control Structures
  9.1 Booleans and Conditionals
  9.2 Loops
  9.3 Block Exits
  9.4 Other Block Behavior
10 Numbers and Time
  10.1 Random Numbers
  10.2 Time
11 Collections
  11.1 Indexable Collections
  11.2 Strings, Characters, and Paragraphs
  11.3 Unordered Sets and Dictionaries
  11.4 Tree-Based Sets and Dictionaries
  11.5 Lists and PriorityQueues
  11.6 Constructing and Concatenating Collections
12 Pairs
13 Mirrors
14 Messages
15 Processes and the Prompt
16 Foreign Objects
17 I/O and Unix
18 Other Objects
Appendix E Glossary of Useful Selectors

A Guide to Programming Style
19 Behavioralism versus Reflection
20 Objects Have Many Roles
  20.1 Shared Behavior
  20.2 One-of-kind Objects (Oddballs)
  20.3 Using Objects for Organization
  20.4 Inline Objects
21 Avoiding Ambiguous Message Errors
22 Naming and Printing
  22.1 How objects are printed
  22.2 How to make an object print
23 How to Return Multiple Values
24 Substituting Values for Blocks
25 nil Considered Naughty
26 Hash and =
27 Equality, Identity, and Indistinguishability

Virtual Machine Reference
28 System-triggered messages
29 Run-time message lookup errors
30 The initial SELF world
31 Option primitives
32 Interfacing with other languages
  32.1 Proxy and fctProxy objects
  32.2 Glue code
  32.3 Compiling and linking glue code
  32.4 A simple glue example: calling a C++ function
  32.5 C glue
  32.6 C++ glue
  32.7 Conversion pairs
  32.8 A complete application using foreign functions
Appendix F VM configuration
Appendix G The system monitor
Appendix H Primitives

References

Introduction

1 Overview of the SELF System


This section contains an overview of the system and its implementation; it can be skipped if you wish to get started as quickly as possible.

1.1 The system


Although SELF runs as a single UNIX process, it really has two parts: the virtual machine (VM) and the SELF world, the collection of SELF objects that are the SELF prototypes and programs:

Figure 1 The SELF system (the SELF world and the SELF virtual machine)

The VM executes SELF programs specified by objects in the SELF world and provides a set of primitives (which may be thought of as methods written in C++) that can be invoked by SELF methods to carry out basic operations like integer arithmetic, object copying, and I/O.

The SELF world distributed with the VM is a collection of SELF objects implementing various traits and prototypes like cloning traits and dictionaries. These objects can be used (or changed) to implement your own programs.

1.2 The translation process


SELF programs are translated to machine code in a two-stage process (see Figure 2). Code typed in at the prompt or read in from a file is parsed into SELF objects. Some of these objects are data objects; others are methods. Methods have their own behavior which they represent with bytecodes. The bytecodes are the instructions for a very simple virtual processor that understands instructions like "push receiver" or "send the x message". In fact, SELF bytecodes correspond much more closely to source code than, say, Smalltalk-80 bytecodes. (See [CUL89] for a list of the SELF bytecodes.) The raison d'être of the virtual machine is to pretend that these bytecodes are directly executed by the computer; the programmer can explore the SELF world down to the bytecode level, but no further. This pretense ensures that the behavior of a SELF program can be understood by looking only at the SELF source code.

The second stage of translation is the actual compilation of the bytecodes to machine code. This is how the execution of bytecodes is implemented; it is totally invisible on the SELF level except for side effects like execution speed and memory usage. The compilation takes place the first time a message is actually sent; thus, the first execution of a program will be slower than subsequent executions.
Actually, this explanation is not entirely accurate: the compiled method is specialized on the type of the receiver. If the same message is later sent to a receiver of different type (e.g. float instead of integer), a new compilation takes place. This technique is called customization; see [CU89] for details. Also, the compiled methods are placed into a cache from which they can be flushed for various reasons; therefore, they might be recompiled from time to time.

Don't be misled by the term "compiled method" if you are familiar with Smalltalk: in Smalltalk terminology it denotes a method in its bytecode form, but in SELF it denotes the native machine code form. In Smalltalk there is only one compiled method per source method, but in SELF there may be several different compiled methods for the same source method (because of customization).

UNIX is a trademark of AT&T Bell Laboratories.

Figure 2 How SELF programs are compiled. SELF source code comes from a disk file (through the RunScript primitive) or from the keyboard (through the read-eval-print loop). The parser turns it into SELF objects and SELF methods (objects with bytecodes) in the SELF heap. When a method not in the cache is called, the compiler produces a compiled method (machine code), which is placed in the compiled method cache.
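
The lazy, receiver-specialized compilation described in section 1.2 can be pictured with a small model. The following is a minimal sketch in Python, not a description of the actual virtual machine: the class name CompiledMethodCache, its methods, and the use of Python types as stand-ins for receiver types are all assumptions made for illustration.

```python
# Illustrative model only: CompiledMethodCache and its methods are
# hypothetical names, not part of the SELF virtual machine.

class CompiledMethodCache:
    """Keeps one compiled method per (selector, receiver type) pair."""

    def __init__(self):
        self._cache = {}
        self.compilations = 0  # how many times the stand-in compiler ran

    def _compile(self, selector, receiver_type):
        # Stand-in for the second translation stage (bytecodes to machine code).
        self.compilations += 1
        return f"<code for {selector} customized to {receiver_type.__name__}>"

    def lookup(self, selector, receiver):
        key = (selector, type(receiver))
        if key not in self._cache:
            # First send for this receiver type: compile lazily.
            self._cache[key] = self._compile(selector, type(receiver))
        return self._cache[key]

    def flush(self):
        # Flushed methods are simply recompiled on their next use.
        self._cache.clear()


cache = CompiledMethodCache()
cache.lookup("square", 3)     # compiles a version for integers
cache.lookup("square", 4)     # reuses the integer version
cache.lookup("square", 2.5)   # different receiver type: compiles again
print(cache.compilations)     # 2
```

In this model the same source method ends up with two compiled forms, one per receiver type, which mirrors the point made above about several compiled methods existing for one source method.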

Language Reference
This document specifies SELF's syntax and semantics. An early version of the syntax was presented in the original SELF paper by Ungar and Smith [US87]; this document incorporates subsequent changes to the language. The presentation assumes a basic understanding of object-oriented concepts.

The syntax is described using Extended Backus-Naur Form (EBNF). Terminal symbols appear in Courier and are enclosed in single quotes; they should appear in code as written (not including the single quotes). Non-terminal symbols are italicized. The following table describes the metasymbols:

META-SYMBOL   FUNCTION      DESCRIPTION
( and )       grouping      used to group syntactic constructions
[ and ]       option        encloses an optional construction
{ and }       repetition    encloses a construction that may be repeated zero or more times
|             alternative   separates alternative constructions
→             production    separates the left and right hand sides of a production

A glossary of terms used in this document can be found in Appendix A.

2 Objects
Objects are the fundamental entities in SELF; every entity in a SELF program is represented by one or more objects. Even control is handled by objects: blocks (§2.4.3) are SELF closures used to implement user-defined control structures. An object is composed of a (possibly empty) set of slots and, optionally, code (§2.4.1). A slot is a name-value pair; slots contain references to other objects. When a slot is found during a message lookup (§5) the object in the slot is evaluated.

Although everything is an object in SELF, not all objects serve the same purpose; certain kinds of objects occur frequently enough in specialized roles to merit distinct terminology and syntax. This chapter introduces two kinds of objects, namely data objects ("plain" objects) and the two kinds of objects with code, ordinary methods and block methods.

2.1 Syntax
Object literals are delimited by parentheses. Within the parentheses, an object description consists of a list of slots delimited by vertical bars (|), followed by the code to be executed when the object is evaluated. For example:

    ( | slot1. slot2 | 'here is some code' printLine )

Both the slot list and code are optional: ( | | ) and () each denote an empty object.

Block objects are written like other objects, except that square brackets ([ and ]) are used in place of parentheses:

    [ | slot1. slot2 | 'here is some code in a block' printLine ]

A slot list consists of a (possibly empty) sequence of slot descriptors (§3) separated by periods. A period at the end of the slot list is optional. The code for an object is a sequence of expressions (§4) separated by periods. A trailing period is optional. Each expression consists of a series of message sends and literals. The last expression in the code for an object may be preceded by the ^ operator (§2.4.4).

2.2 Data objects


Data objects are objects without code. Data objects can have any number of slots. For example, the object () has no slots (i.e., it is empty), while the object ( | x = 17. y = 18 | ) has two slots, x and y:

    slot   contents
    x      17
    y      18

A data object returns itself when evaluated.

If you wish to use the empty vertical bar notation to create an empty object, note that the parser currently requires a space between the vertical bars.

2.3 The assignment primitive


A slot containing the assignment primitive is called an assignment slot (§3.3). When an assignment slot is evaluated, the argument to the message is stored in the corresponding data slot (§3) in the same object (the slot whose name is the assignment slot's name minus the trailing colon), and the receiver (§4) is returned as the result. (Note: this means that the value of an assignment statement is the left-hand side of the assignment statement, not the right-hand side as it is in Smalltalk, C, and many other languages. This is a potential source of confusion for new SELF programmers.)
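
The point about the result of an assignment can be made concrete with a small model. This is a minimal sketch in Python rather than SELF; the class Point and the method name x_colon (standing in for the assignment slot x:) are hypothetical names chosen for illustration.

```python
class Point:
    """Hypothetical stand-in for a SELF object with data slot x and assignment slot x:."""

    def __init__(self, x):
        self.x = x

    def x_colon(self, value):
        # Models the assignment primitive held in slot `x:`:
        # store the argument in the data slot, then return the
        # receiver (not the argument).
        self.x = value
        return self


p = Point(3)
result = p.x_colon(17)
print(result is p, p.x)   # True 17
```

Because the receiver comes back, the value of the send is the object that was modified, which is the behavior the note above warns about.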

2.4 Objects with code


The feature that distinguishes a method object from a data object is that it has code, whereas a data object does not. Evaluating a method object does not simply return the object itself, as with simple data objects; rather, its code is executed and the resulting value is returned.

2.4.1 Code

Code is a sequence of expressions (§4). These expressions are evaluated in order, and the resulting values are discarded except for that of the final expression, whose value determines the result of evaluating the code.

The actual arguments in a message send are evaluated from left to right before the message is sent. For instance, in the message send:

    1 to: 5 * i By: 2 * j Do: [|:k | k print ]

1 is evaluated first, then 5 * i, then 2 * j, and then [|:k | k print]. Finally, the to:By:Do: message is sent. The associativity and precedence of messages is discussed in section 4.

2.4.2 Methods

Ordinary methods (or simply "methods") are methods that are not embedded in other code. A method can have argument slots (§3.4) and/or local slots. An ordinary method always has an implicit parent (§3.5) argument slot named self with priority one (the highest possible parent priority, §3.5). Ordinary methods are SELF's equivalent of Smalltalk's methods.

If a slot contains a method, the following steps are performed when the slot is evaluated as the result of a message send:

  • The method object is cloned, creating a new method activation object containing slots for the method's arguments and locals.
  • The clone's self parent slot is initialized to the receiver of the message.
  • The clone's argument slots, if any, are initialized to the values of the corresponding actual arguments.
  • The code of the method is executed in the context of this new activation object.
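
The activation steps above can be sketched in a few lines. The following is a rough Python model, not SELF code and not the virtual machine's implementation: the class Method, the function send, and the representation of slots as a dictionary are all assumptions made for illustration.

```python
import copy


class Method:
    """Hypothetical model of a method object: argument slots plus code."""

    def __init__(self, arg_names, code):
        self.arg_names = list(arg_names)
        self.slots = {name: None for name in self.arg_names}
        self.slots["self"] = None
        self.code = code  # a Python callable standing in for the method's code


def send(receiver, method, *actuals):
    activation = copy.copy(method)             # 1. clone the method object
    activation.slots = dict(method.slots)      #    giving the clone its own slots
    activation.slots["self"] = receiver        # 2. self slot := receiver
    for name, value in zip(method.arg_names, actuals):
        activation.slots[name] = value         # 3. fill in the argument slots
    return activation.code(activation.slots)   # 4. run the code in the clone


# A model of the method ( | :arg | arg * arg )
square = Method(["arg"], lambda slots: slots["arg"] * slots["arg"])
print(send("some receiver", square, 7))   # 49
print(square.slots["arg"])                # None: the original method is untouched
```

Cloning before filling in slots is what keeps each activation independent of the others, so a recursive send would get a fresh set of argument slots.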

For example, consider the method ( | :arg | arg * arg ):


    slots:  :self*  :arg
    code:   arg * arg

This method has an argument slot arg and returns the square of its argument.

2.4.3 Blocks

Blocks are SELF closures; they are used to implement user-defined control structures. A block literal (delimited by square brackets) defines two objects: the block method object, containing the block's code, and an enclosing block data object. The block data object contains a parent pointer (pointing to the object containing the shared behavior for block objects) and a slot containing the block method object. Unlike an ordinary method object, the block method object does not contain a self slot. Instead, it has an anonymous parent slot that is initialized to point to the activation object for the lexically enclosing block or method. As a result, implicit-receiver messages (§4.4) sent within a block method are lexically scoped. The block method object's anonymous parent slot is invisible at the SELF level and cannot be accessed explicitly. It has the highest possible parent priority.

For example, the block [ 3 + 4 ] looks like:

    block data object:  parent* (pointing to block traits), (scope) (pointing to the enclosing method's activation object), value (containing the block method)
    block method:       (parent*), code 3 + 4

The block method's selector is based on the number of arguments. If the block takes no arguments, the selector is value. If it takes one argument, the selector is value:. If it takes two or three arguments, the selector is value:With: or value:With:With:, respectively; in general, the selector is extended by enough With:s to match the number of block arguments.

Block evaluation has two phases. In the first phase, a block object is created because the block is evaluated (e.g., it is used as an argument to a message send). The block is cloned and given a pointer to the activation record for its lexically enclosing scope, the current activation record. In the second phase, the block's method is evaluated as a result of sending the block the appropriate variant of the value message. The block method is then cloned, the argument slots of the clone are filled in, the anonymous parent slot of the clone is initialized using the scope pointer determined in phase one, and, finally, the block's code is executed.

It is an error to evaluate a block method after the activation record for its lexically enclosing scope has returned. Such a block is called a non-lifo block because returning from it would violate the last-in, first-out semantics of activation object invocation.

All block objects have the same parent, an object containing the shared behavior for blocks.

This restriction is made primarily to allow activation records to be allocated from a stack. A future release of SELF may relax this restriction, at least for blocks that do not access variables in enclosing scopes.
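
The rule that derives a block method's selector from its argument count can be written down directly. This is a small Python sketch of that rule only; the function name block_selector is a hypothetical helper, not something provided by SELF.

```python
def block_selector(arg_count):
    """Return the selector used to evaluate a block taking arg_count arguments."""
    if arg_count < 0:
        raise ValueError("argument count must be non-negative")
    if arg_count == 0:
        return "value"
    # One leading `value:` keyword, then one `With:` per additional argument.
    return "value:" + "With:" * (arg_count - 1)


for n in range(4):
    print(n, block_selector(n))
# 0 value
# 1 value:
# 2 value:With:
# 3 value:With:With:
```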

2.4.4 Returns

A return is denoted by preceding an expression by the ^ operator. A return causes the value of the given expression to be returned as the result of evaluating the method or block. Only the last expression in an object may be a return.

The presence or absence of the ^ operator does not affect the behavior of ordinary methods, since an ordinary method always returns the value of its final expression anyway. In a block, however, a return causes control to be returned from the ordinary method containing that block, immediately terminating that method's activation, the block's activation, and all activations in between. Such a return is called a non-local return, since it may "return through" a number of activations. The result of the ordinary method's evaluation is the value returned by the non-local return. For example, in the following method:

    assertPositive: x = ( x > 0 ifTrue: [ ^ 'ok' ].
                          error: 'non-positive x' )

the error: message will not be sent if x is positive because the non-local return of 'ok' causes the assertPositive: method to return immediately.
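
Languages without non-local return can imitate it by unwinding the stack with an exception, which gives one way to picture what happens in assertPositive:. The following Python sketch is an analogy under that assumption, not how SELF implements returns; NonLocalReturn, if_true, and the home marker are hypothetical names.

```python
class NonLocalReturn(Exception):
    """Hypothetical helper: carries the value of a `^` executed inside a block."""

    def __init__(self, home, value):
        super().__init__()
        self.home = home      # identifies the method activation to return from
        self.value = value


def if_true(condition, block):
    # Models ifTrue: by evaluating the block only when the condition holds.
    if condition:
        return block()
    return None


def assert_positive(x):
    home = object()  # unique marker for this activation

    def block():
        # Models [ ^ 'ok' ]: leave assert_positive itself, not just the block.
        raise NonLocalReturn(home, "ok")

    try:
        if_true(x > 0, block)
        return "error: non-positive x"   # models error: 'non-positive x'
    except NonLocalReturn as nlr:
        if nlr.home is not home:
            raise                        # belongs to some other activation
        return nlr.value


print(assert_positive(5))    # ok
print(assert_positive(-1))   # error: non-positive x
```

The exception passes through the if_true activation on its way out, which corresponds to the activations "in between" that a non-local return terminates.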

2.5 Construction of object literals


Object literals are constructed during parsing: the parser converts objects in textual form into real SELF objects. An object literal is constructed as follows:

  1. The slot initializers of every slot are evaluated from left to right. If a slot initializer contains another object literal, this literal is constructed before the initializer containing it is evaluated. If the initializer is an expression, it is evaluated in the appropriate context.
  2. The object is created, and its slots are initialized with the results of the evaluations performed in the first step.

Slot initializers are not evaluated in the lexical context, since none exists at parse time; they are evaluated in the context of an object known as the lobby. That is, the initializers are evaluated as if they were the code of a method in a slot of the lobby. This two-phase object construction process implies that slot initializers may not refer to any other slots within the constructed object (as they could with Scheme's let* and letrec forms) and, more generally, that a slot initializer may not refer to any textually enclosing object literal.
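
The consequence of evaluating all initializers before the object exists can be seen in a small model. This Python sketch is illustrative only: the dictionary named lobby, its contents, and the function construct are assumptions, and initializers are modeled as Python callables that receive the evaluation context.

```python
lobby = {"three": 3}   # hypothetical lobby contents, for illustration only


def construct(initializers, context):
    # Phase 1: evaluate every initializer, left to right, in the given
    # context (the lobby), before any object exists.
    values = [(name, init(context)) for name, init in initializers]
    # Phase 2: create the object and fill in its slots with the results.
    return dict(values)


point = construct(
    [("x", lambda ctx: ctx["three"] + 4),
     ("y", lambda ctx: 5)],
    lobby,
)
print(point)   # {'x': 7, 'y': 5}

# An initializer that tries to read a sibling slot fails, because the
# object under construction is not part of the evaluation context.
try:
    construct([("x", lambda ctx: 1), ("y", lambda ctx: ctx["x"] + 1)], lobby)
except KeyError:
    print("initializer for y cannot see slot x")
```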

3 Slot descriptors
An object can have any number of slots. Slots can contain data (data slots) or methods. Some slots have special roles: argument slots are filled in with the actual arguments during a message send (§4.3), and parent slots specify inheritance relationships (§5.2). A slot descriptor consists of an optional privacy specification, followed by the slot name and an optional initializer.

3.1 Slot privacy


Slot privacy delimits the public and private interfaces to an object. Public slots may be accessed by message sends anywhere in the system, while private slots are accessible only within the same abstraction. (The semantics of privacy declarations are dened in 5.3 and explained in [CUC91].) Slots without explicitly declared privacy are treated as public slots. This permits privacy declarations to be omitted during exploratory development phases and added later when the interfaces have become more stable. Slot privacy syntax is summarized in the table below; ^_ and _^ can only be used for assignable slots. (Slots names may not begin with an underscore (_); see 6.2.)
    SYNTAX    DATA SLOT    ASSIGNMENT SLOT
    none      public       public
    ^         public       public
    _         private      private
    ^_        public       private
    _^        private      public

3.2 Read-only slots


A slot name followed by an equals sign (=) and an expression represents a read-only variable initialized to the result of evaluating the expression in the root context. For example, a constant point might be defined as:

    ( |
        parent* = traits point.
        x = 3 + 4.
        y = 5.
    | )

For slots that are not part of the public interface but cannot be made private due to the semantics for slot privacy, the convention is to leave the slot formally unmarked (so that it behaves as a public slot), but decorated with the comment "_" as an indication that it is not intended to be part of the public interface.


The resulting point contains three initialized read-only slots:


    SLOT       CONTENTS
    parent*    point traits
    x          7
    y          5

3.3 Read/write slots


There is no separate assignment operation in SELF. Instead, assignments to data slots are message sends that invoke the assignment primitive. For example, a data slot x is assignable if and only if there is a slot in the same object with the same name appended with a colon (in this case, x:), containing the assignment primitive. Therefore, assigning 17 to slot x consists of sending the message x: 17. Since this is indistinguishable from a message send that invokes a method, clients do not need to know if x and x: comprise data slot accesses or method invocations.

An identifier followed by a left arrow (the characters < and - concatenated to form <-) and an expression represents an initialized read/write variable (assignable data slot). The object will contain both a data slot of that name and a corresponding assignment slot whose name is obtained by appending a colon to the data slot name. The initializing expression is evaluated in the root context and the result stored into the data slot at parse time. For example, an initialized mutable point might be defined as:

    ( |
        parent* = traits point.
        x <- 3 + 4.
        y <- 5.
    | )

producing an object with two data slots (x and y) and two assignment slots (x: and y:) containing the assignment primitive (depicted with ←):

    SLOT       CONTENTS
    parent*    point traits
    x          7
    x:         ←
    y          5
    y:         ←
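The pairing of a data slot with its assignment slot can be modelled outside SELF. The following Python sketch is only an illustration: the dictionary representation, the `ASSIGN` marker, and the `send` helper are hypothetical names introduced here, and having the assignment return the receiver is an assumption of the sketch.

```python
# Hypothetical model: an object is a dict of slots, and ASSIGN stands in
# for the assignment primitive stored in an assignment slot.
ASSIGN = object()


def make_point():
    # Mirrors ( | x <- 3 + 4. y <- 5. | ): two data slots, two assignment slots.
    return {"x": 3 + 4, "x:": ASSIGN, "y": 5, "y:": ASSIGN}


def send(obj, selector, *args):
    contents = obj[selector]
    if contents is ASSIGN:
        # The data slot name is the assignment slot name without the colon.
        obj[selector[:-1]] = args[0]
        return obj  # assumption of this sketch: assignment answers the receiver
    return contents  # a data slot simply returns its contents


p = make_point()
send(p, "x:", 17)  # corresponds to the message x: 17
```

A client calling `send(p, "x")` cannot tell whether `x` is a data slot or a method, which is the point made in the text.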

An identifier by itself specifies an assignable data slot initialized to nil. Thus, the slot declaration x is a shorthand notation for x <- nil. For example, a simple mutable point might be defined as:

    ( | x. y. | )

Nil is a predefined object provided by the implementation. It is intended to indicate "not a useful object."


producing:

    SLOT    CONTENTS
    x       nil
    x:      ←
    y       nil
    y:      ←

3.4 Slots containing methods


If the initializing expression is an object literal with code, that object is stored into the slot without evaluating the code. This allows a slot to be initialized to a method by storing the method itself, rather than its result, in the slot. Methods may only be stored in read-only slots. A method automatically receives a parent argument slot named self with parent priority one (the highest possible priority). For example, a point addition method can be written as:

    ( |
        + = ( | :arg | (clone x: x + arg x) y: y + arg y ).
    | )

producing an object whose + slot contains a method with the argument slots :self* and :arg and the code (clone x: x + arg x) y: y + arg y.

A slot name beginning with a colon indicates an argument slot. The prefixed colon is not part of the slot name and is ignored when matching the name against a message. Argument slots are always read-only, and no initializer may be specified for them. As a syntactic convenience, the argument name may also be written immediately after the slot name (without the prefixed colon), thereby implicitly declaring the argument slot. Thus, the following yields exactly the same object as above:

    ( |
        + arg = ( (clone x: x + arg x) y: y + arg y ).
    | )

The + slot above is a binary slot (§4.2), taking one argument and having a name that consists of operator symbols. Slots like x or y in a point object are unary slots (§4.1), which take no arguments and have simple identifiers for names. In addition, there are keyword slots (§4.3), which handle messages that require one or more arguments. A keyword slot name is a sequence of identifiers, each followed with a colon.

The arguments in keyword methods are handled analogously to those in binary methods: each colon-terminated identifier in a keyword slot name requires a corresponding argument slot in the keyword method object, and the argument slots may be specified either all in the method or all interspersed with the selector parts. For example:

    ( |
        ifTrue: False: = ( | :trueBlock. :falseBlock | trueBlock value ).
    | )

and

    ( |
        ifTrue: trueBlock False: falseBlock = ( trueBlock value ).
    | )

produce identical objects.

Although a block may be stored into a slot, it is not useful to do so: evaluating the slot will result in an error because the activation record for the block's lexically enclosing scope will have returned; see §2.4.3.

3.5 Parent slots


A unary slot name followed by one or more asterisks denotes a parent slot. The number of asterisks indicates the priority level of the parent, with one asterisk being the highest priority. The trailing asterisks are not part of the slot name and are ignored when matching the name against a message. Except for their special meaning during the message lookup process (5.2), parent slots are exactly like normal unary slots; in particular, they may be assignable, allowing dynamic inheritance. Argument slots cannot be parent slots.


Expressions

4 Expressions
Expressions in SELF are messages sent to some object, the receiver. SELF message syntax is similar to Smalltalk's. SELF provides three basic kinds of messages: unary messages, binary messages, and keyword messages. Each has its own syntax, associativity, and precedence. Each type of message can be sent either to an explicit or implicit receiver.

Productions:

    expression        →  constant | unary-message | binary-message | keyword-message | ( expression )
    constant          →  self | number | string | object
    unary-message     →  receiver unary-send | resend . unary-send
    unary-send        →  identifier
    binary-message    →  receiver binary-send | resend . binary-send
    binary-send       →  operator expression
    keyword-message   →  receiver keyword-send | resend . keyword-send
    keyword-send      →  small-keyword expression { cap-keyword expression }
    receiver          →  [ expression ]
    resend            →  resend | identifier

The table below summarizes SELF's message syntax rules:

    MESSAGE   ARGUMENTS   PRECEDENCE   ASSOCIATIVITY               SYNTAX
    unary     0           highest      none                        [receiver] identifier
    binary    1           medium       none or left-to-right *     [receiver] operator expression
    keyword   ≥ 1         lowest       right-to-left               [receiver] small-keyword expression { cap-keyword expression }

* Heterogeneous binary messages have no associativity; homogeneous binary messages associate left-to-right.

Parentheses can be used to explicitly specify order of evaluation.

4.1 Unary messages


A unary message does not specify any arguments other than its receiver. It is written as an identifier following the receiver.

In order to simplify the presentation, this grammar is ambiguous; precedence and associativity rules are used to resolve the ambiguities.


Examples of unary messages sent to explicit receivers:

    17 print
    5 factorial

Associativity. Unary messages compose from left to right. An expression to print 5 factorial, for example, is written:

    5 factorial print

and interpreted as:

    (5 factorial) print

Precedence. Unary messages have higher precedence than binary messages and keyword messages.

4.2 Binary messages


A binary message has a receiver and a single argument, separated by a binary operator. Examples of binary messages:

    3 + 4
    7 <-> 8

Associativity. Binary messages have no associativity, except between identical operators (which associate from left to right). For example,

    3 + 4 + 7

is interpreted as

    (3 + 4) + 7

But

    3 + 4 * 7

is illegal: the associativity must be made explicit by writing either (3 + 4) * 7 or 3 + (4 * 7).

Precedence. The precedence of binary messages is lower than unary messages but higher than keyword messages. All binary messages have the same precedence. For example,

    3 factorial + pi sine

is interpreted as

    (3 factorial) + (pi sine)
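The associativity rule for binary messages can be sketched as a small grouping function. This is an illustrative Python model, not part of any SELF parser: `group_binary` is a hypothetical helper that takes an already tokenized chain of single-token operands and operators, and it ignores parentheses and unary sends.

```python
def group_binary(tokens):
    """Group an operand/operator chain such as ['3', '+', '4', '+', '7']."""
    operands, operators = tokens[0::2], tokens[1::2]
    if len(set(operators)) > 1:
        # Heterogeneous operators have no associativity: the chain is illegal.
        raise SyntaxError("mixed binary operators need explicit parentheses")
    result = operands[0]
    for op, rhs in zip(operators, operands[1:]):
        result = (result, op, rhs)  # identical operators associate left to right
    return result


grouped = group_binary(["3", "+", "4", "+", "7"])  # (("3", "+", "4"), "+", "7")
```

Passing `["3", "+", "4", "*", "7"]` raises `SyntaxError`, mirroring the illegal expression in the text.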

4.3 Keyword messages


A keyword message has a receiver and one or more arguments. It is written as a receiver followed by a sequence of one or more keyword-argument pairs. The first keyword must begin with a lower case letter or underscore (_); subsequent keywords must be capitalized. A keyword message consists of the longest possible sequence of such keyword-argument pairs; the message selector is the concatenation of the keywords forming the message. Message selectors beginning with an underscore are reserved for primitives (§5.1).

Example:

    5 min: 4 Max: 7

is the single message min:Max: sent to 5 with arguments 4 and 7, whereas

    5 min: 4 max: 7

involves two messages: first the message max: sent to 4 and taking 7 as its argument, and then the message min: sent to 5, taking the result of (4 max: 7) as its argument.

Associativity. Keyword messages associate from right to left, so

    5 min: 6 min: 7 Max: 8 Max: 9 min: 10 Max: 11

is interpreted as

    5 min: (6 min: 7 Max: 8 Max: (9 min: 10 Max: 11))

The association order and capitalization requirements are intended to reduce the number of parentheses necessary in SELF code. For example, taking the minimum of two slots m and n and storing the result into a data slot i may be written as

    i: m min: n

Precedence. Keyword messages have the lowest precedence. For example,

    i: 5 factorial + pi sine

is interpreted as

    i: ((5 factorial) + (pi sine))
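The interplay of the capitalization rule and right-to-left association can be checked with a small recursive-descent sketch. The Python below is a simplified model under stated assumptions: `parse_keyword_message` is a hypothetical helper, every receiver and argument is a single token, receivers are always explicit, and unary and binary sends are left out.

```python
def parse_keyword_message(tokens):
    """Parse e.g. ['5', 'min:', '4', 'Max:', '7'] into (receiver, selector, args)."""
    pos = 0

    def is_small_keyword(tok):
        return tok.endswith(":") and (tok[0].islower() or tok[0] == "_")

    def is_cap_keyword(tok):
        return tok.endswith(":") and tok[0].isupper()

    def expression():
        nonlocal pos
        receiver = tokens[pos]
        pos += 1
        if pos < len(tokens) and is_small_keyword(tokens[pos]):
            selector = tokens[pos]
            pos += 1
            args = [expression()]  # the argument may itself be a keyword message
            while pos < len(tokens) and is_cap_keyword(tokens[pos]):
                selector += tokens[pos]  # capitalized keywords extend the selector
                pos += 1
                args.append(expression())
            return (receiver, selector, tuple(args))
        return receiver

    return expression()


one_message = parse_keyword_message("5 min: 4 Max: 7".split())
two_messages = parse_keyword_message("5 min: 4 max: 7".split())
```

In this model `one_message` carries the single selector `min:Max:`, while `two_messages` nests `4 max: 7` as the argument of `min:`.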

4.4 Implicit-receiver messages


Unary, binary, and keyword messages are frequently written without an explicit receiver. Such messages use the current receiver (self) as the implied receiver. The method lookup, however, begins at the current activation object rather than the current receiver (see §2.4 for details on activation objects). Thus, a message sent explicitly to self is not equivalent to an implicit-receiver send because the former won't search local slots before searching the receiver. Sending messages to explicit self is considered bad style.

Examples:

    factorial         (implicit-receiver unary message)
    + 3               (implicit-receiver binary message)
    max: 5            (implicit-receiver keyword message)
    1 + power: 3      (parsed as 1 + (power: 3))

Accesses to a local data slot are also message sends to self. For an assignable data slot named t, the message t returns the contents, and t: 17 puts 17 into the slot.


4.5 Resending messages


A resend allows a method to invoke the method that the first method (the one that invokes the resend) is overriding. Directed resends allow ambiguities among overridden methods to be resolved by constraining the lookup to search a single parent slot. Both resends and directed resends may change the name of the message being sent from the name of the current method, and may pass different arguments than the arguments passed to the current method. The receiver of a resend or a directed resend must be the implicit receiver. Intuitively, resend is similar to Smalltalk's super send and CLOS's call-next-method.

A resend is written as an implicit-receiver message with the reserved word resend, a period, and the message name. No whitespace may separate resend, the period, and the message name. Examples:

    resend.display
    resend.+ 5
    resend.min: 17 Max: 23

A directed resend constrains the resend through a specified parent. It is written similarly to a normal resend, but replaces resend with the name of the parent slot through which the resend is directed. Examples:

    listParent.height
    intParent.min: 17 Max: 23

Only implicit-receiver messages may be delegated via a resend or a directed resend.

General delegation for explicit receiver messages is supported through primitives in the implementation (see Appendix H).


Message lookup semantics

5 Message lookup semantics


This section describes the semantics of message lookups in SELF. In addition to an informal textual description, the lookup semantics are presented in pseudo-code using the following notation:

    s.name                        The name of slot s.
    s.contents                    The object contained in slot s.
    s.isParent                    True iff s is a parent slot.
    s.priority                    The parent slot priority of parent slot s (equivalent to the number of asterisks following the slot name).
    s.holder                      The object containing slot s.
    s.isPublic                    True iff s is a public slot or a slot with unspecified privacy.
    h.sel                         The name of the slot holding the prototype of activation object h.
    h.holder                      The object containing the prototype of activation object h.
    h.invokedByDirectedResend     True iff activation object h was invoked by a directed resend.
    h.delegatee                   The delegatee directed to if activation object h was invoked by a directed resend.
    {s ∈ obj | pred(s)}           The set of all slots of object obj that satisfy predicate pred.
    |S|                           The cardinality of set S.
    o1 ≤* o2                      True iff o2 is (possibly trivially) an ancestor of o1, i.e. if there exists at least one inheritance path from o1 to o2, or if o1 = o2.

The message sending semantics are decomposed into the following functions:

    send(rec, sel, smh, args)         The message send function (§5.1).
    lookup(obj, rec, sel, smh, V)     The lookup algorithm (§5.2).
    isVisible(rec, smh, s)            The privacy function (§5.3).
    resend(...)                       The message resend function (§5.4).
    resend_lookup(...)                The lookup algorithm for resends (§5.4).
    eval(rec, M, args)                The slot evaluation function as described informally throughout §2.

See [CUC91] for more background on SELF's inheritance semantics.

5.1 Message send


There are two kinds of message sends: a primitive send has a selector beginning with an underscore (_) and calls the corresponding primitive operation. Primitives are predefined functions provided by the implementation. A normal send does a lookup to obtain the target slot; if the lookup was successful, the slot is subsequently evaluated. If the slot contains a data object, then the data object is simply returned. If the slot contains the assignment primitive, the argument of the message is stored in the corresponding data slot. Finally, if the slot contains a method, an activation is created and run as described in §2.4.2.

If the lookup fails, the lookup error is handled in an implementation-defined manner; typically, a message indicating the type of error is sent to the receiver of the original message.

The function send(rec, sel, smh, args) is defined as follows:

    Input:     rec,   the receiver of the message
               sel,   the message selector
               smh,   the sending method holder
               args,  the actual arguments
    Output:    res,   the result object
    Algorithm:
        if begins_with_underscore(sel)
        then invoke_primitive(rec, smh, sel, args)            primitive call
        else M ← lookup(rec, rec, sel, smh, ∅)                do the lookup
             case
                 | M | = 0: error: message not understood
                 | M | = 1: res ← eval(rec, M, args)          see §2
                 | M | > 1: error: ambiguous message send
             end
        end
        return res

5.2 The lookup algorithm


The lookup algorithm recursively traverses the inheritance graph, which can be an arbitrary graph (including cyclic graphs). No object is searched twice along any single path. The search begins in the object itself and then continues to search every parent group (every set of parent slots having the same priority) until the set of matching slots is non-empty or all parent slots have been exhausted. Parent slots are not evaluated during the lookup. That is, if a parent slot contains an object with code, the code will not be executed; the object will merely be searched for matching slots.

The function lookup(obj, rec, sel, smh, V) is defined as follows:

    Input:     obj,   the object being searched for matching slots
               rec,   the receiver of the message
               sel,   the message selector
               smh,   the sending method holder
               V,     the set of objects already visited along this path
    Output:    M,     the set of matching slots of highest priority
    Algorithm:
        if obj ∈ V                                                              cycle detection
        then M ← ∅
        else M ← {s ∈ obj | s.name = sel and isVisible(rec, smh, s)}            try local slots
             if M = ∅ then M ← parent_lookup(obj, rec, sel, smh, V) end         try parent slots
        end
        return M

Where parent_lookup(obj, rec, sel, smh, V) is defined as follows:

        M ← ∅
        prio ← 1
        V ← V ∪ {obj}
        while M = ∅ and {s ∈ obj | s.isParent and s.priority ≥ prio} ≠ ∅ do
            P ← {s ∈ obj | s.priority = prio}                                   parent group
            M ← ⋃(s ∈ P) lookup(s.contents, rec, sel, smh, V)                   recursively search parents
            prio ← prio + 1
        end
        return M
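The traversal described in this section can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation: `Slot`, `Obj`, `lookup`, and `parent_lookup` are illustrative names, every slot is treated as public (privacy is ignored), and a priority of 0 is used here to mark a non-parent slot. Instead of incrementing a priority counter, the sketch iterates over the distinct priorities that actually occur, which visits the same parent groups in the same order.

```python
class Slot:
    def __init__(self, holder, name, contents, priority=0):
        self.holder = holder      # object containing the slot
        self.name = name
        self.contents = contents
        self.priority = priority  # 0 = not a parent slot; 1 = highest parent priority


class Obj:
    def __init__(self, label):
        self.label = label
        self.slots = []

    def add(self, name, contents, priority=0):
        self.slots.append(Slot(self, name, contents, priority))
        return self


def lookup(obj, sel, visited=frozenset()):
    if obj in visited:  # cycle detection
        return set()
    matches = {s for s in obj.slots if s.name == sel}  # try local slots
    if matches:
        return matches
    return parent_lookup(obj, sel, visited | {obj})  # try parent slots


def parent_lookup(obj, sel, visited):
    parents = [s for s in obj.slots if s.priority > 0]
    for prio in sorted({s.priority for s in parents}):
        matches = set()
        for s in parents:  # search one parent group
            if s.priority == prio:
                matches |= lookup(s.contents, sel, visited)
        if matches:  # stop at the highest-priority group that matches
            return matches
    return set()


traits = Obj("traits").add("print", "a method")
left = Obj("left").add("foo", 1)
right = Obj("right").add("foo", 2)
child = Obj("child").add("p", traits, 1).add("l", left, 2).add("r", right, 2)
```

Here `lookup(child, "print")` finds the slot held by `traits`, while `lookup(child, "foo")` returns two slots from equal-priority parents, the situation that the send function reports as an ambiguous message send.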

5.3 Privacy
A private slot is accessible if both the sending method holder and the private slot holder are ancestors of the receiver. Inaccessible private slots are ignored during the lookup. Thus, the function isVisible(rec, smh, s) is defined as follows:

    Input:     rec,   the receiver of the message
               smh,   the sending method holder
               s,     a matching slot
    Output:    True iff s is visible.
    Algorithm:
        return s.isPublic or (rec ≤* smh and rec ≤* s.holder)
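The privacy rule reduces to two ancestor tests, which the following Python sketch illustrates. The representation is an assumption made for the example: objects are plain strings, `parents` is a hypothetical dictionary mapping each object to the objects in its parent slots, and `is_ancestor` and `is_visible` are illustrative names.

```python
def is_ancestor(obj, candidate, parents, seen=None):
    """True iff candidate is obj itself or reachable from obj via parent links."""
    seen = set() if seen is None else seen
    if obj == candidate:
        return True
    if obj in seen:  # already explored; also guards against cyclic graphs
        return False
    seen.add(obj)
    return any(is_ancestor(p, candidate, parents, seen) for p in parents.get(obj, ()))


def is_visible(rec, smh, slot_holder, slot_is_public, parents):
    # Public slots are always visible; a private slot is visible only if both
    # the sending method holder and the slot holder are ancestors of the receiver.
    return slot_is_public or (
        is_ancestor(rec, smh, parents) and is_ancestor(rec, slot_holder, parents)
    )


parents = {"aPoint": ["traits point"], "traits point": ["traits clonable"]}
```

With these parent links, a private slot in `traits clonable` is visible to a method held by `traits point` when the receiver is `aPoint`, but not to a method held by an unrelated object.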

5.4 Resend
A resend consists of redoing the lookup, beginning with the most recent non-resend lookup on the call stack, skipping over lookup matches up to and including the match containing the method performing the resend, and returning the next matching slot as the result of the resend. This complexity and history-sensitivity is caused by the interactions among resends, prioritized multiple inheritance, and dynamic inheritance. (The semantics of resend will be simplified in the future.)

The activation call stack is the ordered list of suspended activation objects, beginning with the current activation object and continuing with the current activation object's calling activation object, and so on. The resend call chain is the prefix of the activation call stack up to and including the first activation object that was not invoked by a resend or a directed resend; this non-resend activation object is called the base activation object.

A resend or directed resend begins by performing the same lookup that invoked the base activation object; thus, the sending method holder and the receiver of the resend are the same as the sending method holder and the receiver used to invoke the base activation object. The normal lookup function is extended to return a list of sets of matching slots, such that elements of one set override elements of successive sets.

The function resend(rec, sel, smh, args, isDir, del, A) is defined as follows:

    Input:     rec,    the receiver of the message
               sel,    the message selector
               isDir,  true if this is a directed resend
               del,    the parent to which the resend is directed (the delegatee)
               smh,    the sending method holder
               args,   the actual arguments
               A,      the list of current activation objects
    Output:    res,    the result object
    Algorithm:

        H ← resend_call_chain_prefix(A)                                          resend call chain
        base ← last_element(H)                                                   base activation object
        M ← resend_lookup(rec, rec, base.sel, sel, isDir, del, base.holder, ∅, H)    do the lookup
        m ← first_element(M)
        while | m | ≠ 0 do
            if | m | > 1 then error: ambiguous message send end
            if | m | = 1 and (m'.holder = smh and m'.name = sel)                 where m' = first_element(m)
            then f ← following(M, m)
                 case
                     | f | = 0: error: message not understood
                     | f | = 1: res ← eval(rec, f, args); return res             see §2
                     | f | > 1: error: ambiguous message send
                 end
            end
            m ← following(M, m)
        end
        error: resending method not found

The function resend_lookup(obj, rec, sel, selc, isDirc, delc, smh, V, H) is defined as follows:

    Input:     obj,    the object being searched for matching slots
               rec,    the receiver of the message
               sel,    the message selector
               selc,   the selector being resent
               isDirc, true if it is a directed resend
               delc,   the parent to which the resend is directed (the delegatee)
               smh,    the sending method holder (from the base activation object)
               V,      the set of all objects already visited along this path
               H,      the set of activation objects in the resend call chain
    Output:    M,      the list of sets of matching slots
    Algorithm:
        if obj ∈ V                                                               cycle detection
        then M ← ∅
        else Mlocal ← {s ∈ obj | s.name = sel and isVisible(rec, smh, s)}        local matches
             Mparent ← ∅
             prio ← 1
             V ← V ∪ {obj}
             while {s ∈ obj | s.isParent and s.priority ≥ prio} ≠ ∅ do
                 P ← {s ∈ obj | s.isParent and s.priority = prio}
                 if {h ∈ H | h.holder = obj and h.sel = sel} ≠ ∅                 a resend done before
                 then cur ← (h ∈ H | h.holder = obj and h.sel = sel)             a single activation object
                      if cur ≠ first_element(H)
                      then callee ← preceding(H, cur)                            callee is called by cur
                           sel ← callee.sel
                           if callee.invokedByDirectedResend
                           then P ← {s ∈ P | s = callee.delegatee} end           directed resend
                           if | P | = 0 then error: delegatee not found end
                      else sel ← selc
                           if isDirc then P ← {s ∈ P | s = delc} end             directed resend
                           if | P | = 0 then error: delegatee not found end
                      end
                 end
                 Mparent ← ⋃(s ∈ P) resend_lookup(s.contents, rec, sel, selc, isDirc, delc, smh, V, H)
                 prio ← prio + 1
             end
             if Mlocal = ∅ then M ← Mparent else M ← Mlocal || Mparent end       concatenation of set onto list of sets
        end
        return M

The union operator over lists of sets of matching slots is defined to return the list of the element sets. The algorithm is similar to the regular lookup algorithm, except it does not stop the lookup once the highest priority matching slots are found. Instead, it recursively searches each object's parents in priority order (beginning with the base activation object's method holder) to construct a list of sets of matching slots. This list represents the order of slots in ancestors of the receiver that match the message at the time of the sending of the resend that is being handled (the presence of dynamic inheritance means that this order may change from message send to message send). Slots in the same set are of equal order.

During this process, the algorithm prunes searches through parents that are not delegated to at places in the resend chain where directed resends were performed. In addition, if resends along the resend chain changed the selector name, the new selector name is searched for instead. In essence, the resend lookup algorithm mimics the chain of resend lookups as best it can. Resends that have been spliced out of the chain by inheritance graph changes are ignored. Objects with matching slots that have been inserted into the path of the resend chain are assumed to do a regular (non-directed, same selector) resend, and the lookup continues.

Once the list of sets is computed, the sets are searched in order for either a non-singleton set or the set containing the slot for the resending method. The following cases can occur:

• If a non-singleton set is found, then the resend results in an ambiguous message error. (This may occur if dynamic inheritance changes the inheritance graph to introduce ambiguity in the lookup of the message.)

• If no set contains a matching slot for the resending method, then the resend results in a resending method not found error. (This may occur if dynamic inheritance removes the resending method holder from all possible lookup paths.)

• Otherwise, the resending method is found, and the result of the resend is determined by the set following the one in which the resending method is found. There are three possible cases at this point:

    • If the following set is empty (the set containing the resending method is the last set in the sequence), then the resend results in a message not understood error.

    • If the following set is non-singleton, then the resend results in an ambiguous message error.

    • Otherwise, the resend is successful, and the slot contained in the following set is evaluated to compute the result of the resend.
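The case analysis above can be restated as a short function. The Python sketch below only models the final step, after the list of sets has been computed; `resolve_resend` is a hypothetical name, slots are represented by plain strings, and errors are returned as strings rather than raised so that the outcomes are easy to compare.

```python
def resolve_resend(sets, resending_slot):
    """Walk the ordered list of sets of matching slots and pick the resend target."""
    for index, group in enumerate(sets):
        if len(group) > 1:
            return "ambiguous message"
        if resending_slot in group:
            following = sets[index + 1] if index + 1 < len(sets) else set()
            if len(following) == 0:
                return "message not understood"
            if len(following) > 1:
                return "ambiguous message"
            return next(iter(following))  # the slot to evaluate
    return "resending method not found"


target = resolve_resend([{"child display"}, {"parent display"}], "child display")
```

In this example `target` is the slot from the set that follows the one containing the resending method.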


Lexical elements

6 Lexical elements
This chapter describes the lexical structure of SELF programs: how sequences of characters in SELF source code are grouped into lexical tokens. In contrast to syntactic elements described by productions in the rest of this document, the elements of lexical EBNF productions may not be separated by whitespace, i.e. there may not be whitespace within a lexical token. Tokens are formed from the longest sequence of characters possible. Whitespace may separate any two tokens and must separate tokens that would be treated as one token otherwise.

6.1 Character set


SELF programs are written using the following characters:

Letters. The fifty-two upper and lower case letters:

    ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

Digits. The ten numeric digits:

    0123456789

Whitespace. The formatting characters: space, horizontal tab (ASCII HT), newline (NL), carriage return (CR), vertical tab (VT), backspace (BS), and form feed (FF). (Comments are also treated as whitespace.)

Graphic characters. The 32 non-alphanumeric characters:

    !@#$%^&*()_-+=|\~`{}[]:;"'<>,.?/

6.2 Identiers
An identifier is a sequence of letters, digits, and underscores (_) beginning with a lowercase letter or an underscore. Case is significant: apoint is not the same as aPoint.

Productions:

    small-letter   →  a | b | ... | z
    cap-letter     →  A | B | ... | Z
    letter         →  small-letter | cap-letter
    identifier     →  (small-letter | _) {letter | digit | _}

Examples:

    i    _IntAdd    cloud9    m    a_point

The two identifiers self and resend are reserved. Identifiers beginning with underscores are reserved for primitives.

6.3 Keywords
Keywords are used as slot names and as message names. They consist of an identifier or a capitalized identifier followed by a colon (:).

Productions:

    small-keyword  →  identifier :
    cap-keyword    →  cap-letter {letter | digit | _} :

Examples:

    at:    Put:    _IntAdd:

6.4 Arguments
A colon followed by an identifier denotes an argument slot name.

Production:

    arg-name       →  : identifier

Example:

    :name
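The lexical productions for identifiers, keywords, and argument names translate directly into regular expressions. The following Python sketch is illustrative only: `classify` and `PATTERNS` are hypothetical names, and the reserved identifiers self and resend are not treated specially.

```python
import re

# One pattern per token class, tried in order against the whole token.
PATTERNS = [
    ("arg-name", re.compile(r":[a-z_][A-Za-z0-9_]*")),
    ("small-keyword", re.compile(r"[a-z_][A-Za-z0-9_]*:")),
    ("cap-keyword", re.compile(r"[A-Z][A-Za-z0-9_]*:")),
    ("identifier", re.compile(r"[a-z_][A-Za-z0-9_]*")),
]


def classify(token):
    """Return the token class name, or None if the token fits none of them."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return kind
    return None


kinds = [classify(t) for t in ["cloud9", "_IntAdd", "at:", "Put:", ":name"]]
```

A capitalized word without a trailing colon, such as `Point`, is rejected because identifiers must begin with a lowercase letter or an underscore.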

6.5 Operators
An operator consists of a sequence of one or more of the following characters:

    ! @ # $ % ^ & * - + = ~ / ? < > , ; | \

Two sequences are reserved and are not operators:

    |    ^

Productions:

    op-char        →  ! | @ | # | $ | % | ^ | & | * | - | + | = | ~ | / | ? | < | > | , | ; | '|' | \
    operator       →  op-char {op-char}

Examples:

    +    &&    ||    <->    %#@^
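A membership test is enough to recognize operator tokens. The Python sketch below is a hedged illustration: `is_operator`, `OP_CHARS`, and `RESERVED` are hypothetical names, and the sketch assumes that the two reserved sequences are the single characters | and ^.

```python
# The operator characters listed in this section.
OP_CHARS = set("!@#$%^&*-+=~/?<>,;|\\")
# Assumption of this sketch: | and ^ on their own are the reserved sequences.
RESERVED = {"|", "^"}


def is_operator(token):
    """True iff token is a non-empty run of operator characters and not reserved."""
    return bool(token) and set(token) <= OP_CHARS and token not in RESERVED


results = {tok: is_operator(tok) for tok in ["+", "&&", "||", "<->", "%#@^", "|", "^"]}
```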

6.6 Numbers
Integer literals are written as a sequence of digits, optionally prefixed with a minus sign and/or a base. No whitespace is allowed between a minus sign and the digit sequence. Real constants may be either written in fixed-point or exponential form.

Integers may be written using bases from 2 to 36. For bases greater than ten, the characters a through z (case insensitive) represent digit values 10 through 35. The default base is decimal. A non-decimal number is prefixed by its base value, specified as a decimal number followed by either r or R.

Real numbers may be written in decimal only. The exponent of a floating-point format number indicates multiplication of the mantissa by 10 raised to the exponent power; i.e., nnnnEddd = nnnn × 10^ddd.

In situations where parsing the minus sign as part of the number would cause a parse error (for example, in the expression a-1), the minus is interpreted as a binary message (a - 1).


A number with a digit that is not appropriate for the base will cause a lexical error, as will an integer constant that is too large to be represented. If the absolute value of a real constant is too large or too small to be represented, the value of the constant will be infinity or zero, respectively.

Productions:

    number         →  [ - ] (integer | real)
    integer        →  [base] general-digit {general-digit}
    real           →  fixed-point | float
    fixed-point    →  decimal . decimal
    float          →  decimal [ . decimal ] (e | E) [ + | - ] decimal
    general-digit  →  digit | letter
    decimal        →  digit {digit}
    base           →  decimal (r | R)

Examples:

    123    16r27fe    -1272.34e+15    1e10
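The integer rules (optional minus sign, optional base prefix, digits checked against the base) can be sketched as follows. This Python model is an illustration under stated assumptions: `parse_integer` is a hypothetical name, real constants are not handled, and the "too large to be represented" error is not modelled because Python integers are unbounded.

```python
import re

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"
INTEGER = re.compile(r"(-?)(?:([0-9]+)[rR])?([0-9A-Za-z]+)")


def parse_integer(text):
    """Return the value of an integer literal such as '123' or '16r27fe'."""
    match = INTEGER.fullmatch(text)
    if match is None:
        raise ValueError("not an integer literal")
    sign, base_text, digits = match.groups()
    base = int(base_text) if base_text else 10  # the default base is decimal
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    value = 0
    for ch in digits.lower():  # letters are case insensitive
        digit = DIGITS.index(ch)
        if digit >= base:
            raise ValueError("digit not appropriate for the base")
        value = value * base + digit
    return -value if sign else value


hex_value = parse_integer("16r27fe")
```

A literal such as `2r102` is rejected because the digit 2 is not valid in base 2, matching the lexical error described above.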

6.7 Strings
String constants are enclosed in single quotes ('). With the exception of single quotes and escape sequences introduced by a backslash (\), all characters (including formatting characters like newline and carriage return) lying between the delimiting single quotes are included in the string. To allow single quotes to appear in a string and to allow non-printing control characters in a string to be indicated more visibly, SELF provides C-like escape sequences:

    \t    tab              \b    backspace          \n    newline
    \f    form feed        \r    carriage return    \v    vertical tab
    \a    alert (bell)     \0    null character     \\    backslash
    \'    single quote     \"    double quote       \?    question mark

A backslash followed by an x, d, or o specifies the character with the corresponding numeric encoding in the ASCII character set:

    \xnn     hexadecimal escape
    \dnnn    decimal escape
    \onnn    octal escape

There must be exactly two hexadecimal digits for hexadecimal character escapes, and exactly three digits for decimal and octal character escapes. Illegal hexadecimal, decimal, and octal numbers, as well as character escapes specifying ASCII values greater than 255 will cause a lexical error. For example, the following characters all denote the carriage return character (ASCII code 13): \r \x0d \d013 \o015 A long string may be broken into multiple lines by preceding each newline with a backslash. Such escaped newlines are ignored during formation of the string constant.
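The escape rules can be exercised with a small decoder. The Python sketch below is a simplified model: `decode`, `SIMPLE`, and `NUMERIC` are hypothetical names, and the function assumes it receives only the characters between the delimiting single quotes.

```python
SIMPLE = {"t": "\t", "b": "\b", "n": "\n", "f": "\f", "r": "\r", "v": "\v",
          "a": "\a", "0": "\0", "\\": "\\", "'": "'", '"': '"', "?": "?"}
# Escape letter -> (required digit count, digit alphabet; its length is the base).
NUMERIC = {"x": (2, "0123456789abcdef"), "d": (3, "0123456789"), "o": (3, "01234567")}


def decode(body):
    """Decode the escape sequences in the body of a string constant."""
    out, i = [], 0
    while i < len(body):
        if body[i] != "\\":
            out.append(body[i])
            i += 1
            continue
        kind = body[i + 1:i + 2]  # character after the backslash ('' at the end)
        if kind == "\n":  # an escaped newline is ignored
            i += 2
        elif kind in SIMPLE:
            out.append(SIMPLE[kind])
            i += 2
        elif kind in NUMERIC:
            width, alphabet = NUMERIC[kind]
            digits = body[i + 2:i + 2 + width].lower()
            if len(digits) != width or any(d not in alphabet for d in digits):
                raise ValueError("illegal numeric escape")
            code = int(digits, len(alphabet))
            if code > 255:
                raise ValueError("character code above 255")
            out.append(chr(code))
            i += 2 + width
        else:
            raise ValueError("illegal escape sequence")
    return "".join(out)


carriage_returns = [decode(s) for s in [r"\r", r"\x0d", r"\d013", r"\o015"]]
```

All four spellings decode to the same carriage return character, as the example in the text states.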


A backslash followed by any other character than those listed above will cause a lexical error.

Productions:

    string          →  ' { normal-char | escape-char } '
    normal-char     →  any character except \ and '
    escape-char     →  \t | \b | \n | \f | \r | \v | \a | \0 | \\ | \' | \" | \? | numeric-escape
    numeric-escape  →  \x general-digit general-digit | ( \d | \o ) digit digit digit

6.8 Comments
Comments are delimited by double quotes ("). Double quotes may not themselves be embedded in the body of a comment. All characters (including formatting characters like newline and carriage return) are part of the body of a comment.

Productions:

    comment         →  " { comment-char } "
    comment-char    →  any character but "

Example:

    "this is a comment"


Glossary

Appendix A Glossary
A slot is a name-value pair. The value of a slot is often called its contents.

An object is composed of a (possibly empty) set of slots and, optionally, a series of expressions called code. The SELF implementation provides objects with indexable slots (vectors) via a set of primitives.

A data object is an object without code.

A data slot is a slot holding a data object. An assignment slot is a slot containing the assignment primitive. An assignable data slot is a data slot for which there is a corresponding assignment slot whose name consists of the data slot's name followed by a colon. When an assignment slot is evaluated its argument is stored in the corresponding data slot.

An ordinary method (or simply method) is an object with code and is stored as the contents of a slot. The method's name (also called its selector) is the name of the slot in which it is stored.

A block is an object representing a lexically-scoped closure (similar to a Smalltalk block).

A block method is the method that is executed when a block is evaluated by sending it value, value:, value:With:, etc. A block method is a special kind of method that is evaluated within the scope of its method and any lexically enclosing blocks.

An activation object records the state of an executing method or block method. It is a clone of the method prototype used to store the method's arguments and local slots during execution. There are two kinds of activation objects: ordinary method activation objects (or simply method activation objects) and block method activation objects.

A non-lifo block is a block that is evaluated after the activation for its lexically enclosing block or method has returned. This results in an error in the current implementation.

A non-local return is a return from a method activation resulting from performing a return (i.e., evaluating an expression preceded by the ^ operator) from within a lexically enclosed block. A non-local return forces returns from all activations between the method activation and activation of the block performing the return.

The method holder of a method is the object containing the slot holding that method. The sending method holder of a message is the method holder of the method that sent it.

A message is a request to an object to perform some operation. The object to which the request is sent is called the receiver. A message send is the action of sending a message to a receiver.

A primitive send is a message handled by invoking a primitive, a predefined function provided by the SELF implementation.

Messages that do not have an explicit receiver are known as implicit-receiver messages. The receiver is bound to self.

A unary message is a message consisting of a single identifier sent to a receiver. A binary message is a message consisting of an operator and a single argument sent to a receiver. A keyword message is a message consisting of one or more identifiers with trailing colons, each followed by an argument, sent to a receiver.

27

SELF Language Reference

Glossary

Unary, binary, and keyword slots are slots with selectors that match unary, binary, and keyword messages, respectively. An argument slot is a slot in a method filled in with a value when the method is invoked. Message lookup is the process by which objects determine how to respond to a message (which slot to evaluate), by searching objects for slots matching the message. Slot privacy determines if a slot is visible for a particular message lookup. Public slots are always visible; private slots are only visible if both the sending method holder and the private slot holder are ancestors of the receiver. Inheritance is the mechanism by which message lookup searches objects for slots when the receivers slots are exhausted. An objects parent slots contain objects that it inherits from. The parent priority of a parent slot indicates the order in which parents are searched during a message lookup. Prioritized inheritance implements both unordered multiple inheritance (equal priority parents) and ordered multiple inheritance (unequal priority parents). Dynamic inheritance is the modification of object behavior by setting an assignable parent slot. A resend allows a method to invoke the method that the first method (the one that invokes the resend) is overriding. A directed resend constrains the lookup to search a single parent slot. The activation call stack is the ordered list of suspended activation objects, with the head of the list being the current activation object. The resend call chain is the prefix of the activation call stack up to and including the first activation object that was not invoked by a resend or a directed resend; this non-resend activation object is called the base activation object. Cloning is the primitive operation returning an exact shallow copy (a clone) of an object, i.e. a new object containing exactly the same slots and code as the original object. A prototype is an object that is used as a template from which new objects are cloned. 
A traits object is a parent object containing shared behavior, playing a role somewhat similar to a class in a class-based system (see [UCC91]). Any SELF implementation is required to provide traits objects for integers, floats, strings, and blocks (i.e. one object which is the parent of all integers, another object for floats, etc.). The root context is the object that provides the context (i.e., set of bindings) in which slot initializers are evaluated. This object is known as the lobby. During slot initialization, self is bound to the lobby. The lobby is also the sending method holder for any sends in the initializing expression. Nil is the object used to initialize slots without explicit initializers. It is intended to indicate not a useful object. This object is provided by the SELF implementation.
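Several of the terms above (slot, assignable data slot, parent slot, message lookup, cloning, prototype, traits object) can be seen working together in a small model. The following is a Python sketch, not SELF code: the class SelfObject, the helper assignment, and the example objects are hypothetical, and the lookup is a plain depth-first search that ignores parent priorities, slot privacy, and ambiguity detection.

```python
_MISSING = object()  # sentinel: distinguishes "no such slot" from any slot contents


class SelfObject:
    """Toy object: a set of named slots, some of which are parent slots."""

    def __init__(self, slots=None, parent_names=()):
        self.slots = dict(slots or {})
        self.parent_names = tuple(parent_names)

    def clone(self):
        # Shallow copy: same slot names, same contents (contents are shared).
        return SelfObject(self.slots, self.parent_names)

    def lookup(self, name):
        # Search the receiver's own slots first, then its parents (simplified).
        if name in self.slots:
            return self.slots[name]
        for parent_name in self.parent_names:
            found = self.slots[parent_name].lookup(name)
            if found is not _MISSING:
                return found
        return _MISSING

    def send(self, name, *args):
        contents = self.lookup(name)
        if contents is _MISSING:
            raise LookupError(f"message not understood: {name}")
        if callable(contents):
            # A method: run its code with the receiver bound to self.
            return contents(self, *args)
        return contents  # a data slot simply evaluates to its contents


def assignment(data_slot_name):
    """Model of an assignment slot: stores its argument in the data slot."""
    def store(receiver, value):
        receiver.slots[data_slot_name] = value
        return receiver
    return store


# A traits object holds shared behavior; the prototype holds per-object state.
traits_point = SelfObject({"sum": lambda self: self.send("x") + self.send("y")})
point = SelfObject(
    {"parent": traits_point, "x": 0, "y": 0,
     "x:": assignment("x"), "y:": assignment("y")},
    parent_names=("parent",),
)
```

Cloning point and sending x: or y: to the clone changes only the clone's slots, while sum is still found in the shared traits object through the parent slot.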


Appendix B Lexical overview


small-letter    → 'a' | 'b' | ... | 'z'
cap-letter      → 'A' | 'B' | ... | 'Z'
letter          → small-letter | cap-letter
identifier      → (small-letter | '_') {letter | digit | '_'}
small-keyword   → identifier ':'
cap-keyword     → cap-letter {letter | digit | '_'} ':'
argument-name   → ':' identifier
op-char         → '!' | '@' | '#' | '$' | '%' | '^' | '&' | '*' | '-' | '+' | '=' | '~' | '/' | '?' | '<' | '>' | ',' | ';' | '|' | '`' | '\'
operator        → op-char {op-char}
number          → [ '-' ] (integer | real)
integer         → [base] general-digit {general-digit}
real            → fixed-point | float
fixed-point     → decimal '.' decimal
float           → decimal [ '.' decimal ] ('e' | 'E') [ '+' | '-' ] decimal
general-digit   → digit | letter
decimal         → digit {digit}
base            → decimal ('r' | 'R')
string          → "'" { normal-char | escape-char } "'"
normal-char     → any character except '\' and "'"
escape-char     → '\t' | '\b' | '\n' | '\f' | '\r' | '\v' | '\a' | '\0' | '\\' | "\'" | '\"' | '\?' | numeric-escape
numeric-escape  → '\x' general-digit general-digit | ( '\d' | '\o' ) digit digit digit
comment         → '"' { comment-char } '"'
comment-char    → any character but '"'
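The productions for number, integer, and base can be exercised with a small recognizer. This is a Python sketch rather than part of the SELF system; the function name parse_integer is hypothetical, and two details are assumptions not stated in the grammar: letters are taken to denote digit values 10 through 35 regardless of case, and the radix is limited to the range 2 to 36.

```python
import re

# [ '-' ] [ decimal ('r' | 'R') ] general-digit {general-digit}
_INTEGER = re.compile(r"(-)?(?:([0-9]+)[rR])?([0-9A-Za-z]+)")


def parse_integer(text):
    """Return the value of an integer literal such as '42', '16r1F' or '-2r101'."""
    match = _INTEGER.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer literal: {text!r}")
    sign, base, digits = match.groups()
    radix = int(base) if base is not None else 10
    if not 2 <= radix <= 36:  # assumed limit, see the note above
        raise ValueError(f"unsupported base: {radix}")
    value = int(digits, radix)  # raises ValueError if a digit is too large for the radix
    return -value if sign else value
```

For example, parse_integer('16r1F') yields 31 and parse_integer('-2r101') yields -5, while '2r102' is rejected because 2 is not a binary digit.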


Appendix C Syntax overview


expression         → constant | unary-message | binary-message | keyword-message | '(' expression ')'
constant           → 'self' | number | string | object
unary-message      → receiver unary-send | resend '.' unary-send
unary-send         → identifier
binary-message     → receiver binary-send | resend '.' binary-send
binary-send        → operator expression
keyword-message    → receiver keyword-send | resend '.' keyword-send
keyword-send       → small-keyword expression { cap-keyword expression }
receiver           → [ expression ]
resend             → 'resend' | identifier
object             → regular-object | block
regular-object     → '(' body ')'
block              → '[' body ']'
body               → [ slot-list ] [ code ]
slot-list          → '|' [ slot { '.' slot } [ '.' ] ] '|'
code               → { expression '.' } [ '^' ] expression [ '.' ]
slot               → [ privacy ] slot-type
slot-type          → arg-slot | data-slot | binary-slot | keyword-slot
privacy            → public | private | public-read-only | public-write-only
public             → '^'
private            → '_'
public-read-only   → '^_'
public-write-only  → '_^'
arg-slot           → argument-name
data-slot          → slot-name | slot-name '<-' expression | slot-name '=' expression
unary-slot         → slot-name '=' regular-object
binary-slot        → operator '=' regular-object | operator identifier '=' regular-object
keyword-slot       → small-keyword {cap-keyword} '=' regular-object | small-keyword identifier {cap-keyword identifier} '=' regular-object
slot-name          → identifier | parent-name
parent-name        → identifier '*' {'*'}

In order to simplify the presentation, this grammar is ambiguous; precedence and associativity rules are used to resolve the ambiguities.


Appendix D Built-in types


There are a small number of built-in types that are directly supported through primitives. Integers and floats are provided with primitives for performing arithmetic operations, comparisons, etc. (see Table 13 and Table 14 in Appendix H). Vectors and byte vectors are objects with indexable slots. A vector can store any object in its slots, whereas the slots in a byte vector are restricted to hold integers in the range 0-255 only. There are primitives to access the slots of vectors and byte vectors (see Table 12 in Appendix H). Strings have a byte vector part for storing the characters. Special string primitives are provided (see Table 15 in Appendix H). Mirrors are objects which present a view on other objects, providing reflection facilities in SELF (see Table 17 and Table 18 in Appendix H). Processes are objects which contain methods for controlling SELF processes, e.g. terminating them or printing their stack (see Table 20 in Appendix H).


The SELF World


The default SELF world is a set of useful objects, including objects that can be used in application programs (e.g., integers, strings, and collections), objects that support the programming environment (e.g., the debugger), and objects that simply are used to organize the other objects. This document describes how this world is organized, focusing primarily on those objects meant for use in SELF programs. It does not discuss the objects used to implement system facilities (for example, there is no discussion of the objects used to implement the graphical user interface), nor does it discuss how to use programming support objects such as the command history object; such tools are described in The SELF User's Manual. The reader is assumed to be acquainted with the SELF language, the use of multiple inheritance, the use of traits objects and prototype objects, and the organizing principles of the SELF world as discussed in [UCC91].


7 World Organization
7.1 The Lobby
The lobby object is thus named because it is where objects enter the SELF world: when a script that creates a new object is read into the system, all expressions in that script are evaluated in the context of the lobby. That is, the lobby is the receiver of all messages sent to self by expressions in the script. To refer to some existing object in a script, the object must be accessible by sending a message to the lobby. For example, the expression:
_AddSlots: ( | newObject = ( | entries <- list copy ... | ) | )

requires that the message list be understood by the lobby (the implicit receiver of the message) so that the entries slot of the new object can be initialized. The lobby slots prototypes, traits, oddballs, and mixins are the roots of the object namespaces accessible from the lobby. The organization of these namespaces is described in the next section. The slot lobby allows the lobby itself to be referred to by name.

[Figure: the lobby object, with slots prototypes*, traits, oddballs*, mixins, lobby, defaultBehavior*, printIt, doIt, shell, and comment.]

The lobby also has a number of other functions: it is the location of the default behavior inherited by most objects in the system (slot defaultBehavior); it is used to evaluate expressions typed to the interactive prompt (slots printIt, doIt, and shell); and, finally, it has a slot named comment that contains a short description of itself. (Many objects in the system have comment slots.)


7.2 Names and Paths


For convenience, the lobby's namespace is broken into four pieces, implemented as separate objects rooted at the lobby:

prototypes    concrete objects meant to be copied and used
traits        objects that encapsulate shared behavior. Typically, each prototype object has an associated traits object of the same name that describes the shared part of its behavior.
oddballs      well-known, unique objects such as true
mixins        small, parentless bundles of behavior designed to be mixed into some other object

Each of these namespace objects (or categories) is further divided and subdivided to aid navigation. For example, to find the prototype list object, one could start with the prototypes slot of the lobby, then get the collections slot of that object, then get the ordered slot of that object, and finally get the list slot of that object. The sequence of slot names, lobby prototypes collections ordered list, is called a path and constitutes the list prototype's full name.
To shorten object names, many of the slots in the namespace hierarchy are made parent slots. Parent slots can be omitted from an object's full name, since the slots in a parent are visible in the child via inheritance. A path with parent slots omitted forms the short name for an object. For example, the short name for the list prototype is simply list.
Non-parent slots are used when it is desirable to keep a part of the name space distinct. For example, the traits slot of the lobby is not a parent slot. This allows a convention that gives prototypes and their associated traits objects similar names: a prototype and its associated traits object have the same local name, but the prototype is placed in an ancestor of the prototypes slot of the lobby, whereas the traits object is placed in an ancestor of the traits slot. Since the traits slot of the lobby is not a parent slot, the name of the traits object must start with the prefix traits. The prototypes slot, on the other hand, is a parent slot, so the name of a prototype object needs no prefix. Thus, list refers to the prototype list while traits list refers to the traits object for lists.
As a matter of style, programs should refer to objects by the shortest possible name. This makes it easier to re-organize the global namespace as the system evolves. (If programs used full path names, then many more names would have to be updated to reflect changes to the namespace organization, a tedious chore.)
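The relationship between full names and short names can be modeled in a few lines. The following Python sketch is illustrative only: the class Category, the function resolve, and the placeholder objects are hypothetical, parent slots are marked with a trailing '*', and the ordered slot is assumed to be a parent slot (the text states that the short name list works, which requires it).

```python
class Category:
    """Toy namespace object; a slot name ending in '*' denotes a parent slot."""

    def __init__(self, slots):
        self.slots = {}
        self.parents = []
        for name, contents in slots.items():
            self.slots[name.rstrip("*")] = contents
            if name.endswith("*"):
                self.parents.append(contents)

    def lookup(self, name):
        # Own slots first, then slots inherited through parent slots.
        if name in self.slots:
            return self.slots[name]
        for parent in self.parents:
            try:
                return parent.lookup(name)
            except KeyError:
                pass
        raise KeyError(name)


def resolve(root, path):
    """Evaluate a path such as 'traits list' as a chain of unary messages."""
    obj = root
    for name in path.split():
        obj = obj.lookup(name)
    return obj


list_prototype = "the list prototype"   # placeholder for the real object
list_traits = "the list traits object"  # placeholder for the real object

prototypes = Category(
    {"collections*": Category({"ordered*": Category({"list": list_prototype})})})
traits = Category(
    {"collections*": Category({"ordered*": Category({"list": list_traits})})})
lobby = Category({"prototypes*": prototypes, "traits": traits})
lobby.slots["lobby"] = lobby  # the lobby can refer to itself by name
```

In this model, resolve(lobby, "list") and resolve(lobby, "lobby prototypes collections ordered list") reach the same object, while the traits object is reachable only with the prefix, as resolve(lobby, "traits list").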
The initial structure of the namespace accessible from the lobby (defined in the file init.self) is shown below. This structure can be extended by adding slots for additional categories and subcategories as necessary.

[Figure: initial structure of the namespace accessible from the lobby. The lobby has the slots prototypes*, traits, oddballs*, mixins, lobby, defaultBehavior*, printIt, doIt, shell, and comment. The slot lists of the four category objects appear in the figure as:
    userInterface* system* pixrects* io* graphics* collections* bench* applications* mirrors slots xlib
    userInterface* system* pixrects* io* graphics* collections* bench* applications* mirrors slots xlib
    userInterface* io* standard* system
    userInterface* system* graphics* bench* applications* preferences]

8 The Roots of Behavior


8.1 Default Behavior
Certain common behavior is shared by nearly all objects in the SELF world. This basic behavior is defined in the defaultBehavior slot of the lobby and includes:
- identity comparisons (== and !==)
- normal comparisons (<, <=, =, >=, >, !=)
- three-way comparison (compare:IfLess:Equal:Greater:)
- default behavior for printing
- mirror creation (reflect:)
- support for point, extent, and list construction
- ability to assign history numbers to objects to allow programmers to easily refer to them
- behavior that allows blocks to ignore extra arguments
- behavior that allows an object to behave like a block that evaluates to that object (this permits a non-block object to be passed to a method that expects a block)
- behavior that allows an object to be its own key in a collection
- default behavior for doubly-dispatched messages
- behavior for printing error messages and stack dumps

It is important to note that not all objects in the system inherit this default behavior. It is entirely permissible to construct objects that do not inherit from the lobby, and the SELF world contains quite a few such objects. For example, the objects used to break a namespace into separate categories typically do not inherit from the lobby. Any program intended to operate on arbitrary objects, such as a debugger, must therefore assume that the objects it manipulates do not understand even the messages in defaultBehavior.
Files: defaultBehavior.self, errorHandling.self

8.2 The Root Traits: Traits Clonable and Traits Oddball


Most concrete objects in the SELF world are descendants of one of two top-level traits objects: traits clonable and traits oddball. The distinction between the two is based on whether or not the object is unique. For example, true is a unique object. There is only one true object in the entire system, although there are many references to it. On the other hand, a list object is not unique. There may be many lists in the system, each containing different elements. A unique object responds to the message copy by returning itself and uses identity to test for equality. The general rule is:
- unique objects usually inherit from traits oddball
- non-unique objects usually inherit from traits clonable
File: rootTraits.self

8.3 Mixins
Like traits objects, mixin objects encapsulate a bundle of shared behavior. Unlike traits objects, however, mixin objects are generally parentless to allow their behavior to be added to an object without necessarily also adding unwanted behavior (such as access to the lobby namespace). Mixins are generally used in objects that also have other parents, and mixins are usually given the highest parent priority. Examples include mixins identity and mixins printing.

8.3.1 The Identity Mixin
Two objects are usually tested for equality based on whether they have the same value within a common domain. For example, 3.0 = 3 within the domain of numbers, even though they are not the same object or even the same kind of object. In some domains, however, two objects are equal if and only if they are the exact same object. For example, even two process objects with the same state are not considered equal unless they are identical. In such cases, identity comparison is used to implement equality tests, and mixins identity can be mixed in to get the desired behavior.
File: identity.self

8.3.2 The Printing Mixin
Objects that print should understand the four messages printString, printStringSize:, printStringDepth:, and printStringSize:Depth:. By default, an object implements only printString and inherits behavior for the other three messages (defined in terms of printString) from the lobby. However, objects that choose to implement printStringSize:Depth: can inherit from mixins printing to get behavior that defines the other three messages in terms of printStringSize:Depth:.
File: printing.self
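The division of labor in the printing mixin, where one general message is implemented and the other three are derived from it, can be modeled as follows. This is a Python sketch, not the contents of printing.self: the names PrintingMixin and Tree are hypothetical, the default size and depth are invented values, and treating size as a character limit and depth as a nesting limit is an assumption about what those arguments mean.

```python
class PrintingMixin:
    """Derives three printing messages from print_string_size_depth."""

    DEFAULT_SIZE = 80   # assumed default, not taken from the SELF sources
    DEFAULT_DEPTH = 3   # assumed default, not taken from the SELF sources

    def print_string(self):
        return self.print_string_size_depth(self.DEFAULT_SIZE, self.DEFAULT_DEPTH)

    def print_string_size(self, size):
        return self.print_string_size_depth(size, self.DEFAULT_DEPTH)

    def print_string_depth(self, depth):
        return self.print_string_size_depth(self.DEFAULT_SIZE, depth)


class Tree(PrintingMixin):
    """Example object that implements only the most general message."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def print_string_size_depth(self, size, depth):
        if depth <= 0:
            text = "..."  # nesting limit reached: elide the subtree
        else:
            inner = " ".join(
                child.print_string_size_depth(size, depth - 1)
                for child in self.children)
            text = f"({self.label} {inner})" if inner else self.label
        return text[:size]  # enforce the size limit
```

With t = Tree('a', [Tree('b', [Tree('c')])]), the call t.print_string() gives '(a (b c))' and t.print_string_depth(1) gives '(a ...)'.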

9 Blocks, Booleans, and Control Structures


A block is a special kind of object containing a sequence of statements. When a block is evaluated, its statements are executed in the context of the current activation of the method in which the block is declared. This allows the statements in the block to access variables local to the block's enclosing method and any enclosing blocks in that method. (This set of variables comprises the lexical scope of the block.) It also means that within the block, self refers to the receiver of the message that activated the method, not to the block object itself. A return statement in a block causes a return from the block's enclosing method. (See the SELF Language Reference for a more thorough discussion of block semantics.) A block can take an arbitrary number of arguments and can have its own local variables, as well as having access to the local variables of its enclosing method. The statements in the block are executed when the block is sent a message of the form value[:{With:}], where the number of colons in the message corresponds to the number of arguments the block takes. For example, the following block takes two arguments:
[| :arg1. :arg2 | arg1 + arg2 ]

and can be evaluated by sending it the message value:With: to produce the sum of its arguments. Blocks are used to implement all control structures in SELF and allow the programmer to easily extend the system with customized control structures. In fact, all control structures in SELF except message sends, return, and VM error handling are implemented using blocks.

9.1 Booleans and Conditionals


The fundamental control structure is the conditional. In SELF, the behavior of conditionals is defined by two unique boolean objects, true and false. Boolean objects respond to the messages ifTrue:, ifFalse:, ifTrue:False:, and ifFalse:True: by evaluating the appropriate argument block. For example, true implements ifTrue:False: as:


ifTrue: b1 False: b2 = ( b1 value )

That is, when true is sent ifTrue:False:, it evaluates the first block and ignores the second. For example, the following expression evaluates to the absolute value of x:
x < 0 ifTrue: [ x negate ] False: [ x ]

The booleans also define behavior for the logical operations AND (&&), OR (||), EXCLUSIVE-OR (^^), and NOT (not). File: boolean.self
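The idea that conditionals are ordinary messages to two boolean objects, with no built-in if statement involved, can be modeled outside SELF. In the Python sketch below, zero-argument lambdas stand in for blocks; the class and function names are hypothetical, and only the two-block messages are modeled.

```python
class TrueObject:
    def if_true_false(self, b1, b2):
        return b1()  # evaluate the first block, ignore the second

    def if_false_true(self, b1, b2):
        return b2()


class FalseObject:
    def if_true_false(self, b1, b2):
        return b2()  # evaluate the second block, ignore the first

    def if_false_true(self, b1, b2):
        return b1()


true = TrueObject()
false = FalseObject()


def less_than(x, y):
    """Comparison that answers one of the two boolean objects."""
    return (false, true)[x < y]


def absolute_value(x):
    # Mirrors: x < 0 ifTrue: [ x negate ] False: [ x ]
    return less_than(x, 0).if_true_false(lambda: -x, lambda: x)
```

Because the unselected block is never called, an expression such as true.if_true_false(lambda: 1, lambda: 1 / 0) evaluates to 1 without raising an error.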

9.2 Loops
The various idioms for constructing loops in SELF are best illustrated by example. Here is an endless loop:
[ ... ] loop

Here are two loops that test for their termination condition at the beginning of the loop:
[ proceed ] whileTrue: [ ... ]
[ quit ] whileFalse: [ ... ]

In each case, the block that receives the message repeatedly evaluates itself and, if the termination condition is not yet met, evaluates the argument block. The value returned by both loop expressions is nil. It is also possible to put the termination test at the end of the loop, ensuring that the loop body is executed at least once:
[ ... ] untilTrue: [ quit ]
[ ... ] untilFalse: [ proceed ]

Here is a loop that exits from the middle when quit becomes true:
[| :exit | ... quit ifTrue: [ exit value ] ... ] loopExit

For the incurably curious: the parameter to the user's block, supplied by the loopExit method, is simply a block that does a return from the loopExit method. Thus, the loop terminates when exit value is evaluated. The constructs loopExitValue, exit, and exitValue are implemented in a similar manner. If this implementation technique seems confusing, don't worry: you can safely use these constructs without understanding how they work.

The value returned by the overall [...] loopExit expression is nil. Here is a loop expression that exits and evaluates to a value determined by the programmer when quit becomes true:
[| :exit | ... quit ifTrue: [ exit value: expr ] ] loopExitValue

File: block.self
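The implementation technique behind loopExit and loopExitValue, an exit block that returns from the looping method itself, can be imitated in a language without non-local return. The Python sketch below uses an exception as a stand-in for the non-local return; the function names are hypothetical and this is a model of the technique, not the code in block.self.

```python
def loop_exit_value(body):
    """Call body(exit) forever; exit(value) ends the loop and yields value."""

    class Exit(Exception):
        # A fresh class per call, so an exit block only ends its own loop.
        def __init__(self, value):
            super().__init__()
            self.value = value

    def exit_block(value=None):
        raise Exit(value)  # stands in for the non-local return

    try:
        while True:
            body(exit_block)
    except Exit as signal:
        return signal.value


def first_power_of_two_above(limit):
    state = {"n": 1}

    def body(exit):
        if state["n"] > limit:
            exit(state["n"])  # like: quit ifTrue: [ exit value: expr ]
        state["n"] *= 2

    return loop_exit_value(body)
```

Calling the exit block without an argument makes the loop evaluate to None, which corresponds to the nil result of loopExit.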


9.3 Block Exits


It is sometimes convenient to exit a block early, without executing its remaining statements. The following constructs support this behavior:
[| :exit | ... quit ifTrue: [ exit value ] ... ] exit
[| :exit | ... quit ifTrue: [ exit value: expr ] ... ] exitValue

The first expression evaluates to nil if the block exits early; the second allows the programmer to define the expression's value when the block exits early. Be careful! It is easy to confuse these constructs with their looping counterparts loopExit and loopExitValue. File: block.self

9.4 Other Block Behavior


Blocks have some other useful behavior:
- The time in milliseconds required to execute a block can be measured in several ways, using the messages userTime, systemTime, cpuTime, and realTime.
- Various aspects of executing the code in a block can be explored using the messages flatProfile, profile, profile:, vmProfile, compilerProfile, trace, and countSends.
Any object that inherits from the lobby can be passed to a method that expects a block; behavior in defaultBehavior makes the object behave like a block that evaluates to that object.
File: block.self

10 Numbers and Time


The SELF number traits form the hierarchy shown below. (In this and subsequent hierarchy descriptions, indentation indicates that one traits object is a child of another. The prefix traits is omitted since these hierarchy descriptions always describe the interrelationship between traits objects. In most cases, leaf traits are concrete and have an associated prototype with the same name.)
oddball
    number
        float
        integer
            smallInt
            bigInt

traits number defines behavior common to all numbers, such as successor, succ, predecessor, pred, absoluteValue, negate, double, half, max:, and min:. traits number inherits from traits oddball, so sending copy or clone to a number returns the number itself. traits integer defines behavior common to all integers such as even, odd, and factorial. There are four division operators for integers that allow the programmer to control how the result is truncated or rounded. Integers also include behavior for iterating through a subrange, including:
to:Do: to:By:Do: to:ByNegative:Do: upTo:Do: upTo:By:Do: downTo:Do: downTo:By:Do:

More interestingly, traits integer inherits from traits indexable, allowing a positive integer to behave like an ordered collection of integers in the interval [0...n-1]. This allows one to write elegant expressions such as:
10 mapBy: [| :x | x * x] "the first nine squares"

Relevant oddballs:
    infinity        IEEE floating-point infinity
    minSmallInt     smallest smallInt in this implementation
    maxSmallInt     biggest smallInt in this implementation
Files: number.self, float.self, integer.self, smallInt.self, bigInt.self

10.1 Random Numbers


clonable
    random
        randomLC        (prototype: random)

Traits random defines the abstract behavior of random number generators. A random number generator can be used to generate random booleans, integers, floats, characters or strings. traits randomLC defines a concrete specialization based on a simple linear congruence algorithm. For convenience, the prototype for randomLC is random, not randomLC. Files: random.self
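A linear congruence generator computes each new state from the previous one as (a * state + c) mod m, and derives booleans, integers, and floats from that state. The Python sketch below illustrates the technique only: the class and method names are hypothetical, and the multiplier, increment, and modulus are the widely published Numerical Recipes constants, not values taken from random.self.

```python
class RandomLC:
    """Minimal linear congruential generator (illustrative constants)."""

    MULTIPLIER = 1664525     # assumed constants; the SELF implementation
    INCREMENT = 1013904223   # may use different ones
    MODULUS = 2 ** 32

    def __init__(self, seed=1):
        self.state = seed % self.MODULUS

    def next_state(self):
        self.state = (self.MULTIPLIER * self.state + self.INCREMENT) % self.MODULUS
        return self.state

    def random_float(self):
        """A float in the half-open interval [0, 1)."""
        return self.next_state() / self.MODULUS

    def random_integer(self, bound):
        """An integer in the range 0 .. bound-1, taken from the high-order bits."""
        return self.next_state() * bound // self.MODULUS

    def random_boolean(self):
        return self.random_integer(2) == 1
```

Two generators created with the same seed produce the same sequence, which is convenient for reproducible tests.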

10.2 Time
clonable
    time

A time object represents a date and time (to the nearest millisecond) since midnight GMT on January 1, 1970. The current time can be queried, returning a new time object. One time can be subtracted from another to produce a value in milliseconds. An offset in milliseconds can be added to or subtracted from a time object to produce a new time object. However, it is an error to add two time objects together. Files: time.self
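The arithmetic rules for time objects (time minus time gives milliseconds, time plus or minus an offset gives a time, time plus time is an error) can be summarized in a small model. This is a Python sketch with a hypothetical Time class, not the protocol defined in time.self.

```python
import time as _clock


class Time:
    """A point in time, stored as milliseconds since midnight GMT, January 1, 1970."""

    def __init__(self, milliseconds):
        self.milliseconds = int(milliseconds)

    @classmethod
    def current(cls):
        return cls(_clock.time() * 1000)

    def __sub__(self, other):
        if isinstance(other, Time):
            return self.milliseconds - other.milliseconds  # a duration in ms
        return Time(self.milliseconds - other)             # time minus an offset

    def __add__(self, offset):
        if isinstance(offset, Time):
            raise TypeError("cannot add two time objects")
        return Time(self.milliseconds + offset)            # time plus an offset
```

For example, (Time(1000) + 250) - Time(1000) evaluates to the integer 250, while Time(1000) + Time(2000) raises an error.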

11 Collections
clonable
    collection
        ... collection hierarchy ...

Collections are containers that hold zero or more other objects. In SELF, collections behave as if they have a key associated with each value in the collection. Collections without an obvious key, such as lists, use each element as both key and value. Iterations over collections always pass both the value and the key of each element (in that order) to the iteration block. Since SELF blocks ignore extra arguments, this allows applications that don't care about keys to simply provide a block that takes only one argument.
Collections have a rich protocol. Additions are made with at:Put:, or with add: or addAll: for implicitly keyed collections. Iteration can be done with do: or with variations that allow the programmer to specify special handling of the first and/or last element. with:Do: allows pairwise iteration through two collections. The includes:, occurrencesOf:, and findFirst:IfPresent:IfAbsent: messages test for the presence of particular values in the collection. filterBy:Into: creates a new collection including only those elements that satisfy a predicate block, while mapBy:Into: creates a new collection whose elements are the result of applying the argument block to each element of the original collection.
Abstract collection behavior is defined in traits collection. Only a small handful of operations need be implemented to create a new type of collection; the rest can be inherited from traits collection. (See the descendantResponsibility slot of traits collection.) The following sections discuss various kinds of collection in more detail.
Files: collection.self (abstract collection behavior)
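The iteration convention described above, where every block is offered the value and then the key and may ignore trailing arguments, can be modeled as follows. This is a Python sketch: the function names are hypothetical, Python dictionaries and lists stand in for keyed and implicitly keyed collections, and the argument dropping is emulated by inspecting the block's declared parameters.

```python
import inspect


def call_block(block, *args):
    """Call block with only as many leading arguments as it declares."""
    wanted = len(inspect.signature(block).parameters)
    return block(*args[:wanted])


def pairs(collection):
    """Yield (value, key) pairs; unkeyed collections use each element as its own key."""
    if isinstance(collection, dict):
        for key, value in collection.items():
            yield value, key
    else:
        for element in collection:
            yield element, element


def do(collection, block):
    for value, key in pairs(collection):
        call_block(block, value, key)


def map_by(collection, block):
    return [call_block(block, value, key) for value, key in pairs(collection)]


def filter_by(collection, block):
    return [value for value, key in pairs(collection) if call_block(block, value, key)]
```

A one-argument block such as lambda v: v * v and a two-argument block such as lambda v, k: (k, v) can both be passed to the same iterator.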

11.1 Indexable Collections


collection
    indexable (mixes in abstractString)
        integer
        mutableIndexable
            vector
            byteVector
                ... string hierarchy ...


Indexable collections allow random access to their elements via keys that are integers. All strings and vectors are indexable. The message at: is used to retrieve the ith element of an indexable collection while at:Put: is used to update the ith element of a mutableIndexable collection.
traits integer inherits from traits indexable, allowing a positive integer N to behave like a collection of integers in the range [0..N-1] where each element is its own key. Behavior for string operations is mixed into traits indexable to make string operations available on any indexable collection of characters. Files: indexable.self, abstractString.self, vector.self

11.2 Strings, Characters, and Paragraphs


collection ... byteVector string mutableString immutableString canonicalString character paragraph

A string is a vector whose elements are character objects. There are four kinds of concrete string: immutable strings, mutable strings, canonical strings, and characters. traits string defines the behavior shared by all strings.
Mutable strings can be changed using the message at:Put:, which takes a character argument, or at:PutByte:, which takes an integer argument. An immutable string cannot be modified, but sending it the copyMutable message returns a mutable string containing the same characters.
Canonical strings are registered in a table inside the virtual machine, like Symbol objects in Smalltalk or atoms in LISP. The VM guarantees that there is at most one canonical string for any given sequence of bytes, so two canonical strings are equal (have the same contents) if and only if they are identical (are the same object). This allows efficient equality checks between canonical strings. All message selectors and string literals are canonical strings, and some primitives require canonical strings as arguments. Sending canonicalize to any string returns the corresponding canonical string.
Character objects behave like immutable strings of length 1. There are 256 well-known character objects in the SELF universe. They are stored in a 256-element vector named ascii, with each character stored at the location corresponding to its ASCII value. Characters respond to the message asInteger by returning their ASCII value (that is, their index in ascii). The inverse of asInteger, asCharacter, can be sent to an integer between 0 and 255 to obtain the corresponding character object. Character objects are printed surrounded by backquotes (`) to distinguish them from strings, which use regular quotes (').


A paragraph is a collection of strings, one for each line of the paragraph. Paragraphs are used by inspect: and the graphical user interface to format text for output to the user. Files: string.self, character.self, paragraph.self
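The guarantee behind canonical strings, at most one object per byte sequence so that equality reduces to identity, is the classic interning technique. The Python sketch below models it with a hypothetical CanonicalTable; it says nothing about how the SELF virtual machine actually organizes its table.

```python
class CanonicalString:
    """Wrapper with no custom equality: two instances are equal only if identical."""

    def __init__(self, data):
        self.data = data


class CanonicalTable:
    """Maps each byte sequence to its single canonical string object."""

    def __init__(self):
        self._strings = {}

    def canonicalize(self, text):
        # Accept str (encoded one byte per character) or any bytes-like value.
        key = text.encode("latin-1") if isinstance(text, str) else bytes(text)
        if key not in self._strings:
            self._strings[key] = CanonicalString(key)
        return self._strings[key]
```

Once two strings have been canonicalized, comparing them costs a single identity check, regardless of their length.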

11.3 Unordered Sets and Dictionaries


collection
    setOrDictionary
        set
        dictionary

There are two implementations of sets and dictionaries in the system. The one described in this section is based on hash tables. The one discussed in the following section is based on sorted binary trees. The hash table implementation has better performance over a wide range of conditions. (An unfortunate ordering of element additions can cause the unbalanced trees used in the tree version to degenerate into ordered lists, resulting in linear access times.) A set behaves like a mathematical set. It contains elements without duplication in no particular order. A dictionary implements a mapping from keys to values, where both keys and values are arbitrary objects. Dictionaries implement the usual collection behavior plus keyed access using at: and at:Put: and the dictionary-specific operations includesKey: and removeKey:. In order to store an object in a set or use it as a dictionary key, the object must understand the messages hash and =. This is because sets and dictionaries are implemented as hash tables. Files: setAndDictionary.self
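The requirement that set elements and dictionary keys understand both hash and = is common to hash-table collections in other languages. As an analogy only (this is Python's protocol, not SELF's), the sketch below defines a hypothetical Point class whose equal instances also hash equally, which is what lets a hash table find them.

```python
class Point:
    """Value object usable as a set element or dictionary key."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal points must produce equal hashes, or lookups will miss.
        return hash((self.x, self.y))


visited = {Point(1, 2), Point(1, 2), Point(3, 4)}  # duplicates collapse
names = {Point(0, 0): "origin"}
```

Here visited holds two elements, and names can be indexed with any point equal to Point(0, 0), not just the original key object.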

11.4 Tree-Based Sets and Dictionaries


collection trees abstract trees set treeSet trees bag treeBag treeNodes abstract treeNodes set treeSetNode treeNodes bag treeBagNode

treeSet and treeBag implement sorted collections using binary trees. The set variant ignores duplicates, while the bag variant does not. Tree sets and bags allow both explicit and implicit keys (that is, adding elements can be done with either at:Put: or add:), where a tree set that uses explicit keys behaves like a dictionary. Sorting is done on explicit keys if present, values otherwise, and the objects sorted must be mutually comparable.


The implementation of trees uses dynamic inheritance to distinguish the differing behavior of empty and non-empty subtrees. The prototype treeSet represents an empty (sub)tree; when an element is added to it, its parent is switched from traits treeSet, which holds behavior for empty (sub)trees, to a new copy of treeSetNode, which represents a tree node holding an element. Thus, the treeSet object now behaves as a treeSetNode object, with right and left subtrees (initially copies of the empty subtree treeSet). Dynamic inheritance allows one object to behave modally without using clumsy if-tests throughout every method. One caveat: since these trees are not balanced, they can degenerate into lists if their elements are added in sorted order. However, a more complex tree data structure might obscure the main point of this implementation: to provide a canonical example of the use of dynamic inheritance. Files: tree.self
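The modal behavior obtained through dynamic inheritance can be modeled by giving each tree object a replaceable parent to which it forwards its messages. The Python sketch below is a hypothetical model of the idea, not the code in tree.self: explicit forwarding stands in for inheritance through an assignable parent slot, and only the set variant with implicit keys is shown.

```python
class EmptyBehavior:
    """Behavior of an empty (sub)tree."""

    def add(self, tree, value):
        tree.parent = Node(value)  # the "parent switch": the tree becomes a node

    def includes(self, tree, value):
        return False

    def values(self, tree):
        return []


class Node:
    """Behavior and state of a non-empty (sub)tree."""

    def __init__(self, value):
        self.value = value
        self.left = TreeSet()   # both subtrees start out empty
        self.right = TreeSet()

    def add(self, tree, value):
        if value < self.value:
            self.left.add(value)
        elif self.value < value:
            self.right.add(value)
        # equal values are ignored: this is the set variant

    def includes(self, tree, value):
        if value < self.value:
            return self.left.includes(value)
        if self.value < value:
            return self.right.includes(value)
        return True

    def values(self, tree):
        return self.left.values() + [self.value] + self.right.values()


EMPTY = EmptyBehavior()


class TreeSet:
    """Forwards every message to its current parent; no if-tests on emptiness."""

    def __init__(self):
        self.parent = EMPTY

    def add(self, value):
        self.parent.add(self, value)

    def includes(self, value):
        return self.parent.includes(self, value)

    def values(self):
        return self.parent.values(self)
```

No method of TreeSet tests whether the tree is empty; the distinction is carried entirely by which object currently occupies the parent attribute.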

11.5 Lists and PriorityQueues


collection
    list
    priorityQueue

A list is an unkeyed, circular, doubly-linked list of objects. Additions and removals at either end are efficient, but removing an object in the middle is less so. A priorityQueue is an unkeyed, unordered collection with the property that the element with the highest priority is always at the front of the queue. Priority queues are useful for sorting (heapsort) and scheduling. Files: list.self, priorityQueue.self
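Sorting with a priority queue (heapsort) amounts to inserting every element and then repeatedly removing the front of the queue. The sketch below uses Python's standard heapq module as the queue; the function name is hypothetical, and because heapq keeps the smallest element at the front, values are negated so that the highest priority comes out first. It therefore works for numbers only.

```python
import heapq


def heapsort_descending(numbers):
    """Return the numbers sorted from highest to lowest priority."""
    heap = [-n for n in numbers]  # negate: heapq is a min-heap
    heapq.heapify(heap)
    return [-heapq.heappop(heap) for _ in range(len(heap))]
```

For example, heapsort_descending([3, 1, 4, 1, 5]) returns [5, 4, 3, 1, 1].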

11.6 Constructing and Concatenating Collections


clonable listMaker collection indexable consVector

Two kinds of objects play supporting roles for collections. A listMaker object is created using the & operator (inherited from defaultBehavior), and represents a collection under construction. The & operator provides a concise syntax for constructing small collections. For example:
(1 & 'abc' & x) asList

constructs a list containing an integer, a string, and the object x. A listMaker object is not itself a collection; it is converted into one using a conversion message such as asList, asVector, or asString.


A consVector represents the concatenation of two collections. It is usually created using the , operator, which collections inherit from traits collection. For example:
'this' , 'is' , 'that'

concatenates three strings together. The actual concatenation is done lazily for efficiency. This optimization is mostly transparent, but in rare cases (e.g., in primitive wrappers) it may be necessary to explicitly convert the consVector into a normal collection.

Files: listMaker.self, consVector.self
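Lazy concatenation can be illustrated with a small model. The Python sketch below is an analogy only and does not reproduce consVector.self: the class LazyConcat and its methods concat and flatten are hypothetical names. It shows the two properties described above: concatenating copies nothing, and an explicit conversion produces a normal collection when one is required.

```python
class LazyConcat:
    """Represents the concatenation of two collections without copying them."""

    def __init__(self, left, right):
        self.left, self.right = left, right

    def __iter__(self):
        # Elements are produced on demand, first from left, then from right.
        for part in (self.left, self.right):
            yield from part

    def __len__(self):
        return len(self.left) + len(self.right)

    def concat(self, other):
        # Concatenating again just wraps; still no elements are copied.
        return LazyConcat(self, other)

    def flatten(self):
        # Explicit conversion into a normal collection.
        return list(self)
```

For example, LazyConcat('this', 'is').concat('that') holds references to the three strings and only walks their characters when it is iterated or flattened.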

12 Pairs
[Diagram: traits hierarchy relating pair, point, extent, and rectangle]

Traits pair describes the general behavior for pairs of arithmetic quantities. A point is a pair of numbers representing a location on the Cartesian plane. An extent is a pair of numbers representing a width and height. A rectangle is a pair of points representing the opposing corners of a rectangle whose sides are parallel to the x and y axes.

Files: pairs.self, point.self, extent.self, rectangle.self

13 Mirrors
[Diagram: traits hierarchy for mirrors, relating indexable and mirror to the mirror traits smallInt, float, vectorish, vector, byteVector, canonicalString, mirror, block, method, blockMethod, activation, deadActivation, methodActivation, blockMethodActivation, process, assignment, and slots]


Mirrors allow programs to examine and manipulate objects. (Mirrors get their name from the fact that a program can use a mirror to examine, that is, reflect upon, itself.) A mirror on an object x is obtained by sending the message reflect: x to the lobby. The object x is called the mirror's reflectee. A mirror behaves like a keyed collection whose keys are slot names and whose values are mirrors on the contents of slots of the reflectee. A mirror can be queried to discover the number and names of the slots in its reflectee, which slots are parent slots, and what priority a parent slot has. A mirror can be used to add and remove slots of its reflectee. Iterating through a mirror enumerates objects representing slots of the reflected object. A special iterator, fakeSlotsDo:, iterates through facets of the object that are not represented by actual slots. (Such facets are called fake slots.) For example, a method mirror includes fake slots for the method's byte code and literal vectors and elements of vectors and byteVectors.

There are fourteen kinds of mirrors, one for each kind of object known to the virtual machine: small integers, floats, canonical strings, object and byte vectors, mirrors, blocks, ordinary and block methods, ordinary and block method activations, processes, the assignment primitive, and ordinary objects (called slots because an ordinary object is just a set of slots). The prototypes for these mirrors are part of the initial SELF world that exists before reading in any script files. The file init.self moves these prototypes to the mirrors subcategory of the prototypes category of the lobby namespace. Because mirrors is not a parent slot, the names of the mirror prototypes always include the mirrors prefix.

Files: mirror.self, slot.self, init.self, visibility.self

14 Messages
SELF allows messages to be manipulated as objects when convenient. For example, if an object fails to understand a message, the object is notified of the problem via a message whose arguments include the selector of the message that was not understood. While most objects inherit default behavior for handling this situation (by halting with an error), it is sometimes convenient for an object to handle the situation itself, perhaps by resending the message to some other object. Objects that do this are called transparent forwarders. An example is given in interceptor.self.

A string has the basic ability to use itself as a message selector using the messages sendTo: (normal message sends), resendTo: (resends), or sendTo:DelegatingTo: (delegated sends). Each of these messages has a number of variations based on the number of arguments the message has. For example, one would use sendTo:With:With: to send a message with at:Put: as the selector and two arguments. (Note: primitives such as _Print cannot be sent in the current system.)

A selector, receiver, delegatee, methodHolder, and arguments can be bundled together in a message object. The message gets sent when the message object receives the send message. Message objects are used to describe delayed actions, such as the actions that should occur just before or after a snapshot is read. They are also used as an argument to new process creation.

Files: sending.self, message.self, selector.self, interceptor.self
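The two ideas in this section, message objects and transparent forwarders, have rough analogues in other dynamic languages. The Python sketch below is such an analogy and makes no claim about the contents of message.self or interceptor.self: Message and Forwarder are hypothetical classes, getattr plays the role of sending a selector held in a string, and the __getattr__ hook plays the role of the "message not understood" notification.

```python
class Message:
    """Bundles a receiver, a selector (a string), and arguments for a delayed send."""

    def __init__(self, receiver, selector, *arguments):
        self.receiver = receiver
        self.selector = selector
        self.arguments = arguments

    def send(self):
        # Nothing happens until the message object itself is told to send.
        return getattr(self.receiver, self.selector)(*self.arguments)


class Forwarder:
    """Handles messages it does not understand by passing them to a target."""

    def __init__(self, target):
        self._target = target
        self.log = []

    def __getattr__(self, selector):
        # Python calls this hook only when normal lookup fails, which is
        # the analogue of being notified that a message was not understood.
        self.log.append(selector)
        return getattr(self._target, selector)
```

A Message can be created early and sent later, which is the sense in which message objects describe delayed actions. The Forwarder records each selector it did not understand before forwarding it, so clients can use it in place of the target.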


15 Processes and the Prompt


SELF processes are managed by a simple preemptive round-robin scheduler. Processes can be stepped, suspended, resumed, terminated, or put to sleep for a specified amount of time. Also, the stack of a suspended process can be examined and the CPU use of a process can be determined. The prompt object takes input from stdin and spawns a process to evaluate the message. Input to the prompt is kept in a history list so that past input can be replayed, similar to the history mechanism in the Unix C-shell.

Files: process.self, scheduler.self, semaphore.self, prompt.self, history.self

16 Foreign Objects
[Diagram: traits hierarchy relating clonable, abstractProxy, proxy, fctProxy, foreignFct, and foreignCode]

The low-level aspects of interfacing with code written in other languages (via C or C++ glue code) are described in the VM Reference manual. A number of objects in the SELF world are used to interface to foreign data objects and functions. These objects are found in the name spaces traits system foreign, prototypes system foreign, and oddballs system foreign.

One difficulty in interfacing between SELF and external data and functions is that references to foreign data and functions from within SELF can become obsolete when the SELF world is saved as a snapshot and then read in later, possibly on some other workstation. Using an obsolete reference (i.e., memory address) would be disastrous. Thus, SELF encapsulates such references within the special objects proxy (for data references) and fctProxy (for function references). Such objects are known collectively as proxies. A proxy object bundles some extra information along with the memory address of the referenced object and uses this extra information to detect (with high probability) any attempt to use an obsolete proxy. An obsolete proxy is called a dead proxy.

To make it possible to rapidly develop foreign code, the virtual machine supports dynamic linking of this code. This makes it unnecessary to rebuild the virtual machine each time a small change is made to the foreign code. Dynamic linking facilities vary from platform to platform, but the SELF interface to the linking facilities is largely system independent. The SunOS dynamic link interface is defined in the sunLinker object. However, clients should always refer to the dynamic linking facilities by the name linker, which will be initialized to point to the dynamic linker interface appropriate for the current platform. The linker, proxy and fctProxy objects are rather low-level and have only limited functionality. For example, a fctProxy does not know which code file it is dependent on.
The objects foreignFct and foreignCode establish a higher-level and easier-to-use interface. A foreignCode object represents an object file (a file with executable code). It defines methods for loading and unloading the object file it represents. A foreignFct object represents a foreign routine. It understands messages for calling the foreign routine and has associated with it a foreignCode object. The foreignFct and foreignCode objects cooperate with the linker to ensure that object files are transparently loaded when necessary and that fctProxies depending on an object file are killed when the object file is unloaded.

The foreignCodeDB object ensures that foreignCode objects are unique, given a path. It also allows for specifying initializers and finalizers on foreignCode objects. An initializer is a foreign routine that is called whenever the object file is loaded. Initializers take no arguments and do not return values. Typically they initialize global data structures. Finalizers are called when an object file is unloaded.

Normal use of a foreign routine simply involves cloning a foreignFct object to represent the foreign routine. When cloning it, the name of the function and the path of the object file are specified. It is then not necessary to worry about the proxy, fctProxy and linker objects. In fact, it is recommended not to send messages directly to these objects, since this may break the higher-level invariants that foreignFct objects rely on.

Relevant oddballs:

    linker            dynamic linker for current platform
    sunLinker         dynamic linker implementation for SunOS
    foreignCodeDB     registry for foreignCode objects

Files: foreign.self
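The dead-proxy check can be pictured with a small model. The manual does not say what the "extra information" in a proxy actually is, so the Python sketch below simply assumes a per-session identifier: every name in it (Session, Proxy, DeadProxyError, deref) is hypothetical, and the real virtual machine may use an entirely different scheme.

```python
import itertools


class DeadProxyError(Exception):
    """Raised on an attempt to use an obsolete (dead) proxy."""


class Session:
    """One run of the system; reading a snapshot starts a new session."""

    _ids = itertools.count(1)

    def __init__(self):
        self.id = next(Session._ids)


class Proxy:
    """A foreign address bundled with the session in which it was obtained."""

    def __init__(self, address, session):
        self.address = address
        self.session_id = session.id  # the assumed "extra information"

    def is_live(self, session):
        return self.session_id == session.id

    def deref(self, session):
        # Refuse to hand out an address that belongs to an earlier session.
        if not self.is_live(session):
            raise DeadProxyError("address %#x is obsolete" % self.address)
        return self.address
```

In this model a proxy saved in a snapshot still carries the identifier of the session that created it, so any use after the snapshot is read back in fails cleanly instead of dereferencing a stale address.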

17 I/O and Unix


[Diagram: traits hierarchy relating oddball, unix, clonable, abstractProxy, proxy, and unixFile]

The oddball object unix provides access to selected Unix system calls. The most common calls are the file operations: create, open, close, read, write, lseek, unlink. openTCPHost:Port:IfFail: opens a TCP connection. The select call and the indirect system call are also supported (taking a variable number of integer, float or byte vector arguments, the latter being passed as C pointers). unixFile provides a higher-level interface to the Unix file operations. The oddball object tty implements terminal control facilities such as cursor positioning and highlighting.

Relevant oddballs:

    stdin, stdout, stderr     standard Unix streams
    tty                       console terminal capabilities

Files: unix.self, stdin.self, tty.self, termcap.self



18 Other Objects
Here are some interesting oddball objects not discussed elsewhere:

    codeCache                   code cache control
    compilerProfiling           compiler profiling
    history                     last n commands typed at prompt and their results
    memory                      garbage collection/scavenging
    monitor                     system monitor (spy) control
    nil                         indicates an uninitialized value
    nullChar                    ASCII null ('\0')
    pathCache                   maps objects to paths; used for naming
    platforms                   possible hardware platforms
    preferences                 user configuration preferences
    prompt                      interactive read-eval-print loop
    profiling, flatProfiling    controls SELF code profiling
    publicSlot, privateSlot, undeclaredSlot
                                objects that model slot characteristics for mirrors
    scheduler                   SELF process scheduler
    snapshotAction              actions to do before/after a snapshot
    thisHost                    describes the current host platform
    times                       reports user, system, cpu, or real time
    typeSizes                   bit/byte sizes for primitive types
    ui                          experimental graphical user interface
    vmProfiling                 virtual machine profiling
    zombies                     recently dead processes, used for debugging

There are many other objects in the base SELF world, including:

    benchmarks                  the Stanford benchmarks, Richards benchmark, and others
    tests                       a test suite that exercises much of the system
    programming support         support for global enumerations (e.g., finding all senders
                                of a message), debugging, command interpretation, and profiling
    pixrects                    access to the Sun pixrect library for low-level graphics


Appendix E

Glossary of Useful Selectors

This glossary lists some useful selectors. It is by no means exhaustive.

Copying
    clone                       shallow copy (for use within an object; clients should use copy)
    copy                        copy the receiver, possibly with embedded copies or initialization

Comparing
  Equality
    =                           equal
    !=                          not equal
    hash                        hash value
    ==                          identical (the same object; this is reflective and should be avoided)
    !==                         not identical
  Ordered
    <                           less than
    >                           greater than
    <=                          less than or equal
    >=                          greater than or equal
    compare:IfLess:Equal:Greater:
                                three way comparison
    compare:IfLess:Equal:Greater:Incomparable:
                                three way comparison with failure

Numeric operations
    +                           add
    -                           subtract
    *                           multiply
    /                           divide
    /=                          divide exactly (returns float)
    /~                          divide and round to integer (tends to round up)
    /+                          divide and round up to integer
    /-                          divide and round down to integer
    %                           mod
    absoluteValue               absolute value
    inverse                     multiplicative inverse
    negate                      additive inverse
    ceil                        round towards positive infinity
    floor                       round towards negative infinity
    truncate                    truncate towards zero
    round                       round
    asFloat                     coerce to float
    asInteger                   coerce to integer
    double                      multiply by two
    quadruple                   multiply by four

    half                        divide by two
    quarter                     divide by four
    min:                        minimum of receiver and argument
    max:                        maximum of receiver and argument
    mean:                       mean of receiver and argument
    pred                        predecessor
    predecessor                 predecessor
    succ                        successor
    successor                   successor
    power:                      raise receiver to integer power
    log:                        logarithm of argument base receiver, rounded down to integer
    square                      square
    squareRoot                  square root
    factorial                   factorial
    fibonacci                   fibonacci
    sign                        signum (-1, 0, 1)
    even                        true if receiver is even
    odd                         true if receiver is odd

Bitwise operations (integers)
    &&                          and
    ||                          or
    ^^                          xor
    complement                  bitwise complement
    <<                          logical left shift
    >>                          logical right shift
    <+                          arithmetic left shift
    +>                          arithmetic right shift

Logical operations (booleans)
    &&                          and
    ||                          or
    ^^                          xor
    not                         logical complement

Constructing
    @                           point construction (receiver and argument are integers)
    @@                          extent (size) construction (receiver and argument are integers)
    #                           rectangle construction (receiver and argument are points)
    ##                          rectangle construction (receiver is a point, argument is an extent)
    &                           collection construction (result can be converted into collection)
    ,                           concatenation

Printing
    print                       print object on stdout
    printLine                   print object on stdout with trailing newline


    printString                 return a string label
    printStringDepth:           return a string label with depth limitation request
    printStringSize:            return a string label with number of characters limitation request
    printStringSize:Depth:      return a string label with depth and size limitation request

Control
  Block evaluation
    value[:{With:}]             evaluate a block, passing arguments
  Selection
    ifTrue:                     evaluate argument if receiver is true
    ifFalse:                    evaluate argument if receiver is false
    ifTrue:False:               evaluate first arg if true, second arg if false
    ifFalse:True:               evaluate first arg if false, second arg if true
  Local exiting
    exit                        exit block and return nil if block's argument is evaluated
    exitValue                   exit block and return a value if block's argument is evaluated
  Basic looping
    loop                        repeat the block forever
    loopExit                    repeat the block until argument is evaluated; then exit and return nil
    loopExitValue               repeat the block until argument is evaluated; then exit and return a value
  Pre-test looping
    whileTrue                   repeat the receiver as long as it evaluates to true
    whileFalse                  repeat the receiver as long as it evaluates to false
    whileTrue:                  repeat the receiver and argument as long as receiver evaluates to true
    whileFalse:                 repeat the receiver and argument as long as receiver evaluates to false
  Post-test looping
    untilTrue:                  repeat the receiver and argument until argument evaluates to true
    untilFalse:                 repeat the receiver and argument until argument evaluates to false
  Iterators
    do:                         iterate, passing each element to the argument block
    to:By:Do:                   iterate, with stepping
    to:Do:                      iterate forward
    upTo:By:Do:                 iterate forward, without last element, with stepping
    upTo:Do:                    iterate forward, without last element
    downTo:By:Do:               reverse iterate, with stepping
    downTo:Do:                  reverse iterate


Collections
  Sizing
    isEmpty                     test if collection is empty
    size                        return number of elements in collection
  Adding
    add:                        add argument element to collection receiver
    addAll:                     add all elements of argument to receiver
    at:Put:                     add key-value pair
    at:Put:IfAbsent:            add key-value pair, evaluating block if key is absent
    addFirst:                   add element to head of list
    addLast:                    add element to tail of list
    copyAddAll:                 return a copy containing the elements of both receiver and argument
    copyContaining:             return a copy containing only the elements of the argument
  Removing
    remove:                     remove the given element
    remove:IfAbsent:            remove the given element, evaluating block if absent
    removeAll                   remove all elements
    removeFirst                 remove first element from list
    removeLast                  remove last element from list
    removeAllOccurences:        remove all occurrences of this element from list
    removeKey:                  remove element at the given key
    removeKey:IfAbsent:         remove element at the given key, evaluating block if absent
    copyRemoveAll               return an empty copy
  Accessing
    first                       return the first element
    last                        return the last element
    includes:                   test if element is member of the collection
    occurrencesOf:              return number of occurrences of element in collection
    findFirst:IfPresent:IfAbsent:
                                evaluate present block on first element found satisfying
                                criteria, absent block if no such element
    at:                         return element at the given key
    at:IfAbsent:                return element at the given key, evaluating block if absent
    includesKey:                test if collection contains a given key
  Iterating
    do:                         iterate, passing each element to argument block
    doFirst:Middle:Last:IfEmpty:
                                iterate, with special behavior for first and last
    doFirst:MiddleLast:IfEmpty: iterate, with special behavior for first
    doFirstLast:Middle:IfEmpty: iterate, with special behavior for ends
    doFirstMiddle:Last:IfEmpty: iterate, with special behavior for last
    reverseDo:                  iterate backwards through list
    with:Do:                    co-iterate, passing corresponding elements to block


  Reducing
    max                         return maximum element
    mean                        return mean of elements
    min                         return minimum element
    sum                         return sum of elements
    product                     return product of elements
    reduceWith:                 evaluate reduction block with elements
    reduceWith:IfEmpty:         evaluate reduction block with elements, evaluating block if empty
  Transforming
    asByteVector                return a byte vector with same elements
    asString                    return a string with same elements
    asVector                    return a vector with same elements
    asList                      return a list with the same elements
    filterBy:Into:              add elements that satisfy filter block to a collection
    mapBy:                      add result of evaluating map block with each element to this collection
    mapBy:Into:                 add result of evaluating map block with each element to a collection
  Sorting
    sort                        sort receiver in place
    copySorted                  copy sorted in ascending order
    copyReverseSorted           copy sorted in descending order
    copySortedBy:               copy sorted by custom sort criteria
    sortedDo:                   iterate in ascending order
    reverseSortedDo:            iterate in descending order
    sortedBy:Do:                iterate in order of custom sort criteria
  Indexable-specific
    firstKey                    return the first key
    lastKey                     return the last key
    loopFrom:Do:                circularly iterate, starting from element n
    copyAddFirst:               return a copy of this collection with element added to beginning
    copyAddLast:                return a copy of this collection with element added to end
    copyFrom:                   return a copy of this collection from element n
    copyFrom:UpTo:              return a copy of this collection from element n up to element m
    copyWithoutLast             return a copy of this collection without the last element
    copySize:                   copy with size n
    copySize:FillingWith:       copy with size n, filling in any extra elements with second arg

Timing
    realTime                    elapsed real time to execute a block
    cpuTime                     cpu time to execute a block
    userTime                    cpu time in user process to execute a block
    systemTime                  cpu time in system kernel to execute a block
    totalTime                   system + user time to execute a block


Message Sending
  Sending (like Smalltalk perform; receiver is a string)
    sendTo:{With:}              send receiver string as a message
    sendTo:WithArguments:       indirect send with arguments in a vector
    sendTo:DelegatingTo:{With:} indirect delegated send
    sendTo:DelegatingTo:WithArguments:
                                indirect delegated send with arg vector
    resendTo:{With:}            indirect resend
    resendTo:WithArguments:     indirect resend with arguments in a vector
  Message object protocol
    send                        perform the send described by a message object
    receiver:                   set receiver
    selector:                   set selector
    methodHolder:               set method holder
    delegatee:                  set delegatee of the message object
    arguments:                  set arguments (packaged in a vector)
    receiver:Selector:          set receiver and selector
    receiver:Selector:Arguments:
                                set receiver, selector, and arguments
    receiver:Selector:Type:Delegatee:MethodHolder:Arguments:
                                set all components

Reflection (mirrors)
    reflect:                    returns a mirror on the argument
    reflectee                   returns the object the mirror receiver reflects
    nameAt:                     returns the name (a string) of slot n
    contentsAt:                 returns a mirror on the contents of slot n
    isAssignableAt:             tests if slot n is an assignable slot
    isParentAt:                 tests if slot n is a parent slot
    isArgumentAt:               tests if slot n is an argument slot
    parentPriorityAt:           returns the parent priority of slot n
    slotAt:                     returns a slot object representing slot n
    visibilityAt:               returns a visibility object representing visibility of slot n

System-wide Enumerations (messages sent to the oddball object browse)
    all[Limit:]                 returns a vector of mirrors on all objects in the system
                                (up to the limit)
    referencesOf:[Limit:]       returns a vector of mirrors on all objects referring to arg
                                (up to the limit)
    referencesOfReflectee:[Limit:]
                                returns a vector of mirrors on all objects referring to
                                argument's reflectee (up to the limit); allows one to find
                                references to a method
    childrenOf:[Limit:]         returns a vector of mirrors on all objects with a parent slot
                                referring to the given object (up to the limit)
    implementorsOf:[Limit:]     returns a vector of mirrors on objects with slots whose names
                                match the given selector (up to the limit)
    sendersOf:[Limit:]          returns a vector of mirrors on methods whose selectors match
                                the given selector (up to the limit)

Debugging
    halt                        halt the current process
    halt:                       halt and print a message string
    error:                      halt, print an error message, and display the stack
    warning:                    beep, print a warning message, and continue

Memory
    garbageCollect              force a full garbage collection
    scavenge                    force a scavenge

Virtual Machine-Generated Errors
    undefinedSelector:Type:Delegatee:MethodHolder:Arguments:
                                lookup found no matching slot
    noPublicSelector:Type:Delegatee:MethodHolder:Arguments:
                                lookup found a matching private slot, but no public one
    ambiguousSelector:Type:Delegatee:MethodHolder:Arguments:
                                lookup found more than one matching slot
    missingParentSelector:Type:Delegatee:MethodHolder:Arguments:
                                parent slot through which resend was delegated was not found
    performTypeErrorSelector:Type:Delegatee:MethodHolder:Arguments:
                                first argument to the _Perform primitive was not a canonical string
    mismatchedArgumentCountSelector:Type:Delegatee:MethodHolder:Arguments:
                                number of args supplied to _Perform primitive does not match selector
    primitiveFailedError:Name:  the named primitive failed with given error string

Other system-triggered messages
    preRead                     slot to evaluate before reading a snapshot
    postRead                    slot to evaluate after reading a snapshot
    preWrite                    slot to evaluate before writing a snapshot
    postWrite                   slot to evaluate after writing a snapshot


A Guide to Programming Style


This section discusses some programming idioms and stylistic conventions that have evolved in the SELF group. Rather than simply presenting a set of rules, an attempt has been made to explain the reasons for each stylistic convention. While these conventions have proven useful to the SELF group, they should be taken as guidelines, not commandments. SELF is still a young language, and it is likely that its users will continue to discover new and better ways to use it effectively.


SELF-Styled Programming


19 Behavioralism versus Reflection


One of the central principles of SELF is that an object is completely defined by its behavior: that is, how it responds to messages. This idea, which is sometimes called behavioralism, allows one object to be substituted for another without ill effect, provided, of course, that the new object's behavior is similar enough to the old object's behavior. For example, a program that plots points in a plane should not care whether the points being plotted are represented internally in Cartesian or polar coordinates as long as their external behavior is the same. Another example arises in program animation. One way to animate a sorting algorithm is to replace the collection being sorted with an object that behaves like the original collection but, as a side effect, updates a picture of itself on the screen each time two elements are swapped. Behavioralism makes it easier to extend and reuse programs, perhaps even in ways that were not anticipated by the program's author.

It is possible, however, to write non-behavioral programs in SELF. For example, a program that examines and manipulates the slots of an object directly, rather than via messages, is not behavioral since it is sensitive to the internal representation of the object. Such programs are called reflective, because they are reflecting on the objects and using them as data, rather than using the objects to represent something else in the world. Reflection is used to talk about an object rather than talking to it. In SELF, this is done with objects called mirrors.

There are times when reflection is unavoidable. For example, the SELF programming environment is reflective, since its purpose is to let the programmer examine the structure of objects, an inherently reflective activity. Except when it is unavoidable, however, reflective techniques should be avoided as a matter of style, since a reflective program may fail if the internal structure of its objects is changed. This places constraints on the situations in which the reflective program can be reused, limiting opportunities for reuse and making program evolution more difficult.

Programs that depend on object identity are also reflective, although this may not be entirely obvious. For example, a program that tests to see if an object is identical to the object true may not behave as expected if the system is later extended to include fuzzy logic objects. Thus, as with reflection, it is best to avoid using object identity. One exception to this guideline is worth mentioning. When testing to see if two collections are equal, observing that the collections are actually the same object can save a tedious element-by-element comparison. This trick is used in several places in the SELF world. Note, however, that object identity is used only as a hint; the correct result will still be computed, albeit more slowly, if the collections are equal but not identical.

Sometimes the implementation of a program requires reflection. Suppose one wanted to write a program to count the number of unique objects in an arbitrary collection. The collection could, in general, contain objects of different, possibly incomparable, types. In Smalltalk, one would use an IdentitySet to ensure that each object was counted exactly once. IdentitySets are reflective, since they use identity comparisons. In SELF, the preferred way to solve this problem is to make the reflection explicit by using mirrors. Rather than adding objects to an IdentitySet, mirrors on the objects would be added to an ordinary set. This substitution works because two mirrors are equal if and only if their reflectees are identical.

In short, to maximize the opportunities for code reuse, the programmer should:

  - avoid reflection when possible,
  - avoid depending on object identity except as a hint, and
  - use mirrors to make reflection explicit when it is necessary.
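The unique-object count described above can be modeled in a few lines. The Python sketch below is an analogy rather than SELF code: Mirror and count_unique are hypothetical names, and the only property borrowed from the text is that two mirrors are equal if and only if their reflectees are identical. With that property, an ordinary set of mirrors does the work of an identity set, even for elements that cannot be compared or hashed themselves.

```python
class Mirror:
    """Wraps an object; equality of mirrors means identity of reflectees."""

    def __init__(self, reflectee):
        self.reflectee = reflectee

    def __eq__(self, other):
        return isinstance(other, Mirror) and self.reflectee is other.reflectee

    def __hash__(self):
        # id() is stable here because the mirror keeps its reflectee alive.
        return id(self.reflectee)


def count_unique(objects):
    """Count distinct objects by adding mirrors on them to an ordinary set."""
    return len({Mirror(obj) for obj in objects})
```

Two lists with equal contents count as two objects, while the same list appearing twice counts once. The reflection is explicit: the set itself uses only ordinary equality and hashing, and the identity comparison is confined to the mirror.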

20 Objects Have Many Roles


Objects in SELF have many roles. Primarily, of course, they are the elements of data and behavior in programs. But objects are also used to factor out shared behavior, to represent unique objects, to organize objects and behavior, and to implement elegant control structures. Each of these uses is described below.

20.1 Shared Behavior


Sometimes a set of objects should have the same behavior for a set of messages. The slots defining this shared behavior could be replicated in each object, but this makes it difficult to ensure that the objects continue to share the behavior as the program evolves, since the programmer must remember to apply the same changes to all the objects sharing the behavior. Factoring out the shared behavior into a separate object allows the programmer to change the behavior of the entire set of objects simply by changing the one object that implements the shared behavior. The objects that share the behavior inherit it via parent slots containing (references to) the shared behavior object. By convention, two kinds of objects are used to hold shared behavior: traits and mixins. A traits object typically has a chain of ancestors rooted in the lobby. A mixin object typically has no parents, and is meant to be used as an additional parent for some object that already inherits from the lobby.

20.2 One-of-a-kind Objects (Oddballs)


Some objects, such as the object true, are unique; it is only necessary to have one of them in the system. (It may even be important that the system contain exactly one of some kind of object.) Objects playing the role of unique objects are called oddballs. Because there is no need to share the behavior of an oddball among many instances, there is no need for an oddball to have separate traits and prototype objects. Many oddballs inherit a copy method from traits oddball that returns the object itself rather than a new copy, and most oddballs inherit the global namespace and default behavior from the lobby.

20.3 Using Objects for Organization


It is sometimes desirable to organize a large object by grouping related slots together. This is accomplished by moving each set of related slots into a separate object and having the original object inherit from this separate object (a category object). The names of the parent slots give each set of slots a suggestive category name. This technique is used in two ways: to organize large traits objects and to organize the set of objects visible from the lobby. (See the SELF World section of the manual for more about lobby namespace organization.)


20.4 Inline Objects


An inline object is an object that is nested in the code of a method object. The inline object is usually intended for localized use within a program. For example, in a finite state machine implementation, the state of the machine might be encoded in a selector that would be sent to an inline object to select the behavior for the next state transition:

    state sendTo: (|
        inComment = ( c = '"' ifTrue: [state: 'inCode']. self ).
        inCode    = ( c = '"' ifTrue: [state: 'inComment'] False: ... )
    |) With: nextChar

In this case, the inline object is playing the role of a case statement. Another use of inline objects is to return multiple values from a method, as discussed in section 3.5. Yet another use of inline objects is to parameterize the behavior of some other object. For example, the predicate used to order objects in a priorityQueue can be specified using an inline object:

    queue: priorityQueue copyRemoveAll.
    queue sorter: (| element: e1 Precedes: e2 = ( e1 > e2 ) |).

(A block cannot be used here because the current implementation of SELF does not support non-LIFO blocks, and the sorter object may outlive the method that creates it.) There are undoubtedly other uses of inline objects. Inline objects do not generally inherit from the lobby.

21 Avoiding Ambiguous Message Errors


Multiple inheritance introduces a class of potential errors that are not present in languages with single inheritance: ambiguous message errors. An ambiguous message error occurs when there are two or more equally good interpretations for a message. More precisely, message M sent to object O is ambiguous if the SELF lookup algorithm maps M's selector to more than one slot. This can only happen when O does not itself have a slot named M, several of O's parents understand M, and two or more of those parents have the same priority. Thus, one way to ensure that no message sent to an object could ever be ambiguous would be to give the object only one parent, but doing that would sacrifice the flexibility and power of multiple inheritance!

In SELF, prioritizing parents can eliminate ambiguous message errors. One could give each parent in an object a different priority, guaranteeing that ambiguous message errors will not be raised, but the actual practice used in the SELF world is to make more judicious use of prioritization. Namespace objects and larger traits objects are typically broken into a number of separate objects (category objects) that are pointed to by parent slots in the original object. Parent slots used in this way all share the highest possible priority (one asterisk). The remaining parents are then given lower priorities, different from each other if they need to be disambiguated.
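The conditions for ambiguity can be made concrete with a simplified model of prioritized lookup. The Python sketch below is an assumption-laden illustration, not the SELF lookup algorithm: Obj, find, and send are hypothetical names, priority 1 stands for one asterisk (the highest priority), a slot reached by two different paths is assumed to count only once, and details such as cycle detection, resends, and slot visibility are omitted.

```python
class AmbiguousMessage(Exception):
    """More than one equally good slot matched the selector."""


class UndefinedSelector(Exception):
    """No slot matched the selector."""


class Obj:
    def __init__(self, slots=None, parents=None):
        self.slots = dict(slots or {})
        # parents: list of (priority, Obj) pairs; 1 is the highest priority.
        self.parents = list(parents or [])


def find(obj, selector):
    """Return the objects holding the matching slot(s) for this lookup."""
    if selector in obj.slots:
        return [obj]  # the object's own slot always wins
    for priority in sorted({p for p, _ in obj.parents}):
        matches = []
        for p, parent in obj.parents:
            if p != priority:
                continue
            for holder in find(parent, selector):
                # The same slot reached by two paths is one match, not two.
                if not any(holder is m for m in matches):
                    matches.append(holder)
        if matches:
            return matches  # lower-priority parents are never consulted
    return []


def send(obj, selector):
    matches = find(obj, selector)
    if not matches:
        raise UndefinedSelector(selector)
    if len(matches) > 1:
        raise AmbiguousMessage(selector)
    return matches[0].slots[selector]
```

In this model, two equal-priority parents that both define a slot make the message ambiguous, while giving one of them a lower priority, or defining the slot in the receiver itself, resolves it.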


22 Naming and Printing


When debugging or exploring in the SELF world, one often wants to answer the question: "what is that object?" The SELF environment provides two ways to answer that question. First, many objects respond to the printString message with a textual description of themselves. This string is called the object's printString. An object's printString can be quite detailed; standard protocol allows the desired amount of detail to be specified by the requestor. For example, the printString for a collection might include the printStrings of all elements or just the first few. Not all objects have printStrings, only those that satisfy the criteria discussed in section 3.4.2 below.

The second way to describe an object is to give its path name. A path name is a sequence of unary selectors that describes a path from the lobby to the object. For example, the full path name of the prototype list is prototypes collections ordered list. A path name is also an expression that can be evaluated (in the context of the lobby) to produce the object. Because prototypes, collections, and ordered are all parent slots, they can be omitted from this path name expression. Doing this yields the short path name list. Not all objects have path names, only those that can be reached from the lobby. Such objects are called well-known.

22.1 How objects are printed


When an expression is typed at the prompt, it is evaluated to produce a result object. The prompt then creates a mirror on this result object and asks the mirror to produce a name for the object. (A mirror is used because naming is reflective.) If an object has only a printString, then that is its name. If it has only a path name, then a printable representation of its short path name is its name. If it has neither, an attempt is made to identify its dominant parent to produce a nameString of the form: <a child of traits sk>. Finally, if the object has both a printString and a path name, the two pieces of information are merged. For example, the nameString for the prototype list is The list{}, where list{} is the printString of an empty list and The indicates that this particular empty list is the prototype list, as discovered from its path name. See the naming category in mirror traits for the details of this process.

22.2 How to make an object print


The distinction between objects that hold shared behavior (traits and mixin objects) and concrete objects (prototypes, copies of prototypes, and oddballs) is purely a matter of convention; the SELF language makes no such distinction. While this property (not having special kinds of objects) gives SELF great flexibility and expressive power, it leads to an interesting problem: the inability to distinguish behavior that is ready for immediate use from that which is defined only for the benefit of descendant objects. Put another way: SELF cannot distinguish those objects playing the role of classes from those playing the role of instances. The most prominent manifestation of this problem crops up in object printing. Suppose one wishes to provide the following printString method for all point objects:
    printString = ( x printString, '@', y printString )


Like other behavior that applies to all points, the method should be put in point traits. But what happens if printString is sent to the object traits point? The printString method is found, but it fails when it attempts to send x and y to itself because these slots are only defined in point objects (not the traits point object). Of course there are many other messages defined in traits point that would also fail if they were sent to traits point rather than to a point object. The reason printing is a bigger problem is that it is useful to have a general object printing facility to be used during debugging and system exploration. To be as robust as possible, this printing facility should not send printString when it will fail. Unfortunately, it is difficult to tell when printString is likely to fail. Using reflection, the facility can avoid sending printString to objects that do not define printString. But that is not the case with traits point.

The solution taken in this version of the system is to mark printable objects with a special slot name. The printing facility sends printString to the object only if the object contains a slot named thisObjectPrints. The slot must be in the object itself, not merely inherited. The existence of a thisObjectPrints slot in an object means that the object is prepared to print itself. The object agrees to provide behavior for four messages: printString, printStringSize:, printStringDepth:, and printStringSize:Depth:. The size and depth parameters allow the sender to specify the maximum length or recursive printing depth (e.g. for collections). An object that inherits from the lobby needs to implement only printString, since the lobby defines behavior for the other three messages in terms of printString. As an alternative, an object can implement only printStringSize:Depth: and inherit from mixins printing.
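The marker-slot convention can be mimicked in other object systems. The following Python sketch is an analogy only: the attribute name this_object_prints and the function safe_print_string are hypothetical, and a Traits instance stands in for the traits point object. The marker must be stored on the object itself, so the check looks only at the object's own attributes.

```python
class Traits:
    """Shared behavior; an instance has no x or y of its own."""

    def print_string(self):
        return f"{self.x}@{self.y}"


class Point(Traits):
    def __init__(self, x, y):
        self.x, self.y = x, y
        # Marker stored in the object itself, not inherited.
        self.this_object_prints = True


def safe_print_string(obj):
    # vars() lists only attributes held by the object itself,
    # so an inherited marker would not count.
    if "this_object_prints" in vars(obj):
        return obj.print_string()
    return f"<an object of {type(obj).__name__}>"


point_text = safe_print_string(Point(3, 4))
traits_text = safe_print_string(Traits())
```

The traits stand-in inherits print_string but is never asked to run it, so the facility does not fail on the missing x and y.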

23 How to Return Multiple Values


Sometimes it is natural to think of a method as returning several values, even though SELF only allows a method to return a single object. There are two ways to simulate methods that return multiple values. The first way is to use an inlined object. For example, the object:
(| p* = lobby. lines. words. characters |)

could be used to package the results of a text processing method into a single result object:
    count = ( | r = (| p* = lobby. lines. words. characters |) ... |
        ...
        r: r copy.
        r lines: lCount. r words: wCount. r characters: cCount.
        r )
Note that the inline object prototype inherits copy from the lobby. If one omitted its parent slot p, one would have to send it the _Clone primitive to copy it. It is considered bad style, however, to send a primitive directly, rather than calling the primitive's wrapper method.

The sender can extract the various return values from the result object by name. The second way is to pass in one block for each value to be returned. For example:


countLines:[| :n | lines: n ] Words:[| :n | words: n ] Characters:[| :n | characters: n ]

Each block simply stores its argument into a local variable for later use. The countLines:Words:Characters: method would evaluate each block with the appropriate value to be returned:
    countLines: lb Words: wb Characters: cb = (
        ...
        lb value: lineCount.
        wb value: wordCount.
        cb value: charCount.
        ... )
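Both techniques carry over to other languages. As a rough Python analogue (the function names count and count_with_callbacks are hypothetical, and callables stand in for SELF blocks), the first function packages its results in one result object and the second hands each value to a caller-supplied callable:

```python
from types import SimpleNamespace


def count(text):
    """First way: package the results in a single result object."""
    return SimpleNamespace(
        lines=len(text.splitlines()),
        words=len(text.split()),
        characters=len(text),
    )


def count_with_callbacks(text, on_lines, on_words, on_characters):
    """Second way: pass each value to a caller-supplied callable."""
    on_lines(len(text.splitlines()))
    on_words(len(text.split()))
    on_characters(len(text))


text = "one two\nthree\n"

r = count(text)  # the sender extracts values by name: r.lines, r.words

seen = {}
count_with_callbacks(
    text,
    lambda n: seen.update(lines=n),
    lambda n: seen.update(words=n),
    lambda n: seen.update(characters=n),
)
```

The result-object form keeps the values together under meaningful names; the callback form avoids creating an intermediate object but requires the caller to provide a place to store each value.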

24 Substituting Values for Blocks


The lobby includes behavior for the block evaluation messages. Thus, any object that inherits from the lobby can be passed as a parameter to a method that expects a block; the object behaves like a block that evaluates to that object. For example, one may write:
x >= 0 ifTrue: x False: x negate

rather than:
x >= 0 ifTrue: [ x ] False: [ x negate ]

Note, however, that SELF evaluates all arguments before sending a message. Thus, in the first case x negate will be evaluated regardless of the value of x, even though that argument will not be used if x is nonnegative. In this case, it doesn't matter, but if x negate had side effects, or if it were very expensive, it would be better to use the second form.

In a similar vein, blocks inherit default behavior that allows one to provide a block taking fewer arguments than expected. For example, the collection iteration message do: expects a block taking two arguments: a collection element and the key at which that element is stored. If one is only interested in the elements, not the keys, one can provide a block taking only one argument and the second block argument will simply be ignored. That is, you can write:
myCollection do: [| :el | el printLine]

instead of:
myCollection do: [| :el. :key | el printLine]
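The difference between passing a value and passing a block is the usual difference between eager and deferred evaluation. A small Python sketch, with zero-argument callables standing in for blocks (if_true_false and negate are hypothetical names chosen for this illustration), makes the cost of the eager form visible:

```python
calls = []


def negate(x):
    calls.append(x)  # record that the argument expression was evaluated
    return -x


def if_true_false(cond, when_true, when_false):
    """Accept either plain values or zero-argument callables."""
    chosen = when_true if cond else when_false
    return chosen() if callable(chosen) else chosen


x = 5

# Value form: negate(x) runs before the call, although its result is unused.
eager = if_true_false(x >= 0, x, negate(x))
calls_after_eager = len(calls)

# Block form: the unused branch is never evaluated.
lazy = if_true_false(x >= 0, lambda: x, lambda: negate(x))
calls_after_lazy = len(calls)
```

Both forms yield the same answer, but only the value form evaluated negate, which is why the block form is preferable when the argument is expensive or has side effects.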


25 nil Considered Naughty


As in Lisp, SELF has an object called nil, which denotes an undefined value. The virtual machine initializes any uninitialized slots to this value. In Lisp, many programs test for nil to find the end of a list, or an empty slot in a hash table, or any other undefined value. There is a better way in SELF. Instead of testing an object's identity against nil, define a new object with the appropriate behavior and simply send messages to this object; SELF's dynamic binding will do the rest. For example, in a graphical user interface, the following object might be used instead of nil:
    nullGlyph = (|
        display = ( self ).
        boundingBox = (0@0) # (0@0).
        mouseSensitive = false.
    |)

To make it easier to avoid nil, the methods that create new vectors allow you to supply an alternative to nil as the initial value for the new vector's elements (e.g., copySize:FillingWith:).
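The same idea is widely known as the null object pattern. The following Python sketch is an analogue under stated assumptions: Glyph, NullGlyph, and the tuple used for the bounding box are hypothetical stand-ins, and filling a list with the null object plays the role of copySize:FillingWith:.

```python
class Glyph:
    mouse_sensitive = True

    def __init__(self, name, box):
        self.name, self.box = name, box

    def display(self):
        return f"draw {self.name}"

    def bounding_box(self):
        return self.box


class NullGlyph:
    """Understands the same messages as Glyph but does nothing."""
    mouse_sensitive = False

    def display(self):
        return ""  # nothing to draw

    def bounding_box(self):
        return (0, 0, 0, 0)  # empty box


NULL_GLYPH = NullGlyph()

# Fill the new vector with the null object rather than None.
cells = [NULL_GLYPH] * 3
cells[1] = Glyph("icon", (0, 0, 16, 16))

# No identity test against None is needed; dynamic binding does the rest.
drawn = [cell.display() for cell in cells]
```

Every element answers display and bounding_box, so client code needs no special case for empty positions.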

26 Hash and =
Sets and dictionaries are implemented using hash tables. In order for an object to be eligible for inclusion in a set or used as a key in a dictionary, it must implement both = and hash. (hash maps an object to a smallInt.) Further, hash must be implemented in such a way that for objects a and b, (a = b) implies (a hash = b hash). The behavior that sets disallow duplicates and dictionaries disallow multiple entries with the same key depends on the correct implementation of hash for their elements and keys. Finally, the implementation of sets (and dictionaries) will only work if the hash values of the objects in the set do not change while the objects are in the set (dictionary). This may complicate managing sets of mutable objects: if the hash value depends on the mutable state, the objects cannot be allowed to mutate while in the set. Of course, a trivial hash function would simply return a constant regardless of the contents of the object. However, for good hash table performance, the hash function should map different objects to different values, ideally distributing possible object values as uniformly as possible across the range of small integers.
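The same contract applies to hash tables in most languages. As an illustration only, the Python sketch below uses a hypothetical Point class whose __eq__ and __hash__ play the roles of = and hash; it shows both the required implication and the hazard of mutating an element while it is in a set.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Equal points hash equally because both are derived from (x, y).
        return hash((self.x, self.y))


a, b = Point(1, 2), Point(1, 2)
s = {a, b}  # a and b are equal, so the set keeps only one of them

p = Point(3, 4)
t = {p}
p.x = 99  # mutating p changes its hash while it is stored in t
found_after_mutation = p in t
```

After the mutation the set can no longer find p, because it was stored under the hash computed from its old state.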

27 Equality, Identity, and Indistinguishability


Equality, identity, and indistinguishability are three related concepts that are often confused. Two objects are equal if they "mean the same thing". For example, 3 = 3.0 even though they are different objects and have different representations. Two objects are identical if and only if they are the same object. (Or, more precisely, two references are identical if they refer to the same object.) The primitive _Eq: tests for identity. Finally, two objects are indistinguishable if they have exactly the same behavior for every possible sequence of non-reflective messages. The binary operator == tests for indistinguishability. Identity implies indistinguishability, which implies equality.


It is actually not possible to guarantee that two different objects are indistinguishable, since reflection can be used to change one of the objects so that it behaves differently. Thus, == is defined to mean identity by default. Mirrors, however, override this default behavior; (m1 == m2) if (m1 reflectee _Eq: m2 reflectee). This makes it appear that there is at most one mirror object for each object in the system. This illusion would break down, however, if one added mutable state to mirror objects.
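A loose Python analogue may help keep the three notions apart. In this sketch Python's == stands in for SELF's =, Python's is stands in for _Eq:, and a hypothetical Mirror class overrides comparison the way SELF mirrors override ==. The mapping is an assumption made for illustration, not a statement about how SELF is implemented.

```python
three_int = 3
three_float = 3.0

equal = (three_int == three_float)     # equal: same meaning
identical = (three_int is three_float)  # not identical: different objects


class Mirror:
    """Compares equal to any mirror reflecting the same object."""

    def __init__(self, reflectee):
        self.reflectee = reflectee

    def __eq__(self, other):
        return isinstance(other, Mirror) and self.reflectee is other.reflectee

    def __hash__(self):
        return id(self.reflectee)


target = object()
m1, m2 = Mirror(target), Mirror(target)
mirrors_match = (m1 == m2)     # same reflectee, so they appear to be one mirror
mirrors_same = (m1 is m2)      # yet they are two distinct objects
```

Because Mirror holds no mutable state of its own, nothing a client can observe through ordinary use separates m1 from m2, which is the illusion described above.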


Virtual Machine Reference


28 System-triggered messages
Certain events cause the system to automatically send a message to the lobby. The message printIt is sent to the lobby after a full expression has been entered at the prompt. Reading and writing snapshots also trigger messages to be sent to the lobby. Before reading a snapshot, the message snapshotAction preRead is sent; after reading the snapshot, the message snapshotAction postRead is sent. Similarly, the messages snapshotAction preWrite and snapshotAction postWrite are sent surrounding the writing of a snapshot. These messages allow the SELF world to clean up and reinitialize itself, for example, to open or close files. The following table summarizes the system-triggered messages described above. There are other situations in which the system sends messages; see section 29.
Table 1 Some system-triggered messages
Message                     Trigger event
printIt                     After a full expression has been parsed at the prompt and inserted into the doIt slot in the lobby.
snapshotAction preRead      Upon invocation of _ReadSnapshot, before the snapshot is read.
snapshotAction postRead     Upon invocation of _ReadSnapshot, after the snapshot is read.
snapshotAction preWrite     Upon invocation of _WriteSnapshot, before the snapshot is written.
snapshotAction postWrite    Upon invocation of _WriteSnapshot, after the snapshot is written.


29 Run-time message lookup errors


If an error occurs during a message send, the system sends a message to the receiver of the message. Any object can handle these errors by defining (or inheriting) a slot with the corresponding selector. All messages sent by the system in response to a message lookup error have the same arguments. The first argument is the offending message's selector; the additional arguments specify the message send type (one of normal, implicitSelf, undirectedResend, directedResend, or delegated), the directed resend parent name or the delegatee (0 if not applicable), the sending method holder, and an object vector containing the arguments to the message, if any.

undefinedSelector:Type:Delegatee:MethodHolder:Arguments:
    The receiver does not understand the message: no slot matching the selector can be found in the receiver or its ancestors.
noPublicSelector:Type:Delegatee:MethodHolder:Arguments:
    Same as above, except that a private slot would have matched the message had it been public.
ambiguousSelector:Type:Delegatee:MethodHolder:Arguments:
    There is more than one slot matching the selector, and neither the parent slot priorities nor the sender path tiebreaker are sufficient to disambiguate the lookup.
missingParentSelector:Type:Delegatee:MethodHolder:Arguments:
    The parent slot through which the resend should have been directed was not found in the sending method holder.
mismatchedArgumentCountSelector:Type:Delegatee:MethodHolder:Arguments:
    The number of arguments supplied to the _Perform primitive does not match the number of arguments required by the selector.
performTypeErrorSelector:Type:Delegatee:MethodHolder:Arguments:
    The first argument to the _Perform primitive (the selector) wasn't a canonical string.

These error messages are just like any other message. Therefore, it is possible that the object causing the error (which is being sent the appropriate error message) does not understand the error message either.
More generally, if the system experiences any of these run-time errors in the midst of sending one, it will report the error, print the stack, and abort the SELF program to avoid entering an infinite loop of error messages. For example, sending the message boo to an empty object would cause the system to print:
    lookup failure: couldn't send undefinedSelector:Type:Delegatee:MethodHolder:Arguments: to <0>, because the empty object did not understand boo and then did not understand undefinedSelector:....
A VM stack dump would follow.

The system will also abort your program and display a stack trace if it runs out of stack space (too much recursion) or if a block is invoked whose lexically-enclosing scope has already returned. Since these errors are non-recoverable, they cannot be caught by the same SELF process.
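The two-level behavior described in this section (first give the receiver a chance to handle the failure, then abort if the error message itself is not understood) can be modelled in a few lines. The Python sketch below is a simplified illustration, not the VM's mechanism: the function send, the method name undefined_selector, and FatalLookupError are hypothetical, and only the undefined-selector case is covered.

```python
class FatalLookupError(Exception):
    """Stands in for the VM reporting the error and aborting the program."""


def send(receiver, selector, *args):
    method = getattr(receiver, selector, None)
    if callable(method):
        return method(*args)
    # Lookup failed: send the error message to the receiver itself.
    handler = getattr(receiver, "undefined_selector", None)
    if callable(handler):
        return handler(selector, "normal", args)
    # The error message was not understood either: give up rather than loop.
    raise FatalLookupError(
        f"couldn't send undefined_selector to {receiver!r}: "
        f"it did not understand {selector!r} either"
    )


class Forgiving:
    def undefined_selector(self, selector, send_type, args):
        return f"ignored {selector}"


class Empty:
    pass


handled = send(Forgiving(), "boo")
```

Sending boo to an Empty instance raises FatalLookupError, mirroring the lookup failure report shown above.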


30 The initial SELF world


The diagram below shows all objects in the bare SELF world. In addition, literals like integers, floats, and strings are conceptually part of the initial SELF world; block and object literals are created by the programmer as needed. All the objects in the system are created by adding slots to these objects or by cloning them. Table 2 lists all the initial objects and provides a short description for each. Reading in the world rearranges the structure of the bare SELF world ref[world].
[Figure 3 The initial SELF world (part 1). The diagram shows the lobby (slots snapshotAction, doIt, printIt, shell, parent*, systemObjects*, shortcuts*), the systemObjects object with its slots for nil, true, false, the prototypical vectors, proxies, and mirrors, the shell, and the snapshotAction object with its preRead, postRead, preWrite, and postWrite slots.]


[Figure 4 The initial SELF world (part 2). The diagram shows the literals (integers, floats, strings, and blocks), each with a parent* slot pointing to its shared parent object, which in turn has a parent* slot pointing to the lobby; blocks also have a value[:{With:}] slot holding the block method.]


Table 2 Objects in the initial SELF world
Object                        Description
lobby                         The center of the SELF object hierarchy, and the context in which expressions typed in at the VM prompt, read in via _RunScript, or used as the initializers of slots, are evaluated.

Objects in the lobby
doIt                          A method synthesized from the expressions typed at the VM prompt.
printIt                       The method invoked by the system after the user's input has been successfully parsed. Initially, contains the method ( doIt _Print ).
shell                         After reading in the world, shell is the context in which expressions typed in at the prompt are evaluated.
snapshotAction                An object with slots for the four snapshot actions (see section 28). These slots are: preRead, preWrite, postRead, and postWrite. These slots initially contain nil.
systemObjects                 This object contains slots containing the general system objects, including nil, true, false, and the prototypical vectors and mirrors.

Objects in systemObjects
nil                           The initializer for slots that are not explicitly initialized. Indicates "not a useful object".
true                          Boolean true. Argument to and returned by some primitives.
false                         Boolean false. Argument to and returned by some primitives.
objVector                     The prototype for object vectors.
byteVector                    The prototype for byte vectors.
proxy                         The prototype for proxy objects.
fctProxy                      The prototype for fctProxy objects.
objVector parent              The object that objVector inherits from. Since all object vectors will inherit from this object (because they are cloned from objVector), this object will be the repository for shared behavior (a traits object) for object vectors.
byteVector parent             Similar to objVector parent: the byteVector traits object.
mirrors                       See below.

Literals and their parents
integers                      Integers have one slot, a parent slot called parent. All integers have the same parent: see 0 parent, below.
0 parent                      All integers share this parent, the integer traits object.
floats                        Floats have one slot, a parent slot called parent. All floats have the same parent: see 0.0 parent, below.
0.0 parent                    All floats share this parent, the float traits object.
canonical strings             In addition to a byte vector part, a canonical string has one slot, parent, a parent slot containing the same object for all canonical strings (see '' parent, below).
'' parent                     All canonical strings share this parent, the string traits object.
blocks                        Blocks have two slots: parent, a parent slot containing the same object for all blocks (see [ ] parent, below), and value (or value:, or value:With:, etc., depending on the number of arguments the block takes), which contains the block's deferred method.
[ ] parent                    All blocks share this parent, the block traits object.

Prototypical mirrors          All of the prototypical mirrors consist of one slot, a parent slot named parent. Each of these parent slots points to an empty object (denoted in Figure 5 by ( )). See Table 16 for a discussion of mirrors and their reflectees.
smiMirror                     Prototypical mirror on a small integer; the reflectee is 0.
floatMirror                   Prototypical mirror on a float; the reflectee is 0.0.
stringMirror                  Prototypical mirror on a canonical string; the reflectee is the empty canonical string ('').
processMirror                 Prototypical mirror on a process; the reflectee is the initial process.
byteVectorMirror              Prototypical mirror on a byte vector; the reflectee is the prototypical byte vector.
objVectorMirror               Prototypical mirror on object vectors; the reflectee is the prototypical object vector.
assignmentMirror              Mirror on the assignment primitive; the actual reflectee is an empty object.
mirrorMirror                  Prototypical mirror on a mirror; the reflectee is slotsMirror.
slotsMirror                   Prototypical mirror on a plain object without code; the reflectee is an empty object.
blockMirror                   Prototypical mirror on a block.
methodMirror                  Prototypical mirror on a normal method.
blockMethodMirror             Prototypical mirror on a block method.
methodActivationMirror        Prototypical mirror on a method activation.
blockMethodActivationMirror   Prototypical mirror on a block activation.


31 Option primitives
Option primitives control various aspects of the SELF system and its inner workings. Many of them are used to debug or instrument the SELF system and are probably of little interest to users. The options most useful for users are listed in Table 3; other option primitives can be found in Appendix H, and a list of all option primitives and their current settings can be printed with the primitive _PrintOptionPrimitives.
Table 3 Some useful option primitives
Name                    Description
_PrintPeriod[:]         Print a period when reading a script file with _RunScript. Default: false.
_PrintScriptName[:]     Print the file name when reading a script file. Default: false.
_PrintScavenge[:]       Print a message at every scavenge. Default: false.
_PrintGC[:]             Print a message at every full garbage collection. Default: false.
_Spy[:]                 Start the system monitor (see Appendix G for details). Default: false.
_StackPrintLimit[:]     Controls the number of stack frames printed by _PrintProcessStack. Default: 20.
_VectorPrintLimit[:]    Controls the number of vector elements printed by _Print. Default: 20.
_NumObjectIDs[:]        Controls the number of object references remembered by the system. Default: 1000.
_SourceDir[:]           The default directory for script files.
_SnapshotCode[:]        Save compiled code in snapshot files (this significantly increases the size of snapshots). Default: false.
_FirstCompiler[:]       Specifies which compiler to use to create the initial compiled version of a method. The argument is a string, currently either nic (the default) or new.
_Recompiler[:]          Specifies which compiler to use to recompile methods. Currently, valid arguments are nic, new (the default), and none. See also Table 22 on page 107.

Each option primitive controls a variable within the virtual machine containing a boolean, integer, or string (in fact, the option primitives can be thought of as primitive variables). Invoking the version of the primitive that doesn't take an argument returns the current setting; invoking it with an argument sets the variable to the new value and returns the old value. If you are at the console of your workstation, try running the system monitor with _Spy: true. The system monitor will continuously display various information about the system's activities and your memory usage.
Caution: the system monitor always writes to /dev/fb, the default frame buffer, which usually is the workstation's console. If you run SELF on another machine (e.g. via rlogin), the system monitor will write to the screen of the remote machine, not your own screen.
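The read and write convention of option primitives (no argument reads the setting; one argument writes it and returns the old value) is easy to model. The Python sketch below is only an illustration of that convention; the Option class and the variable stack_print_limit are hypothetical and do not correspond to anything inside the virtual machine.

```python
class Option:
    """One VM-style variable: read with no argument, write with one."""

    _UNSET = object()  # sentinel meaning "no argument supplied"

    def __init__(self, value):
        self._value = value

    def __call__(self, new=_UNSET):
        old = self._value
        if new is not Option._UNSET:
            self._value = new
        return old


stack_print_limit = Option(20)

before = stack_print_limit()      # read the current setting
previous = stack_print_limit(50)  # write; the old value is returned
after = stack_print_limit()       # read the new setting
```

Returning the old value on a write makes it convenient to change a setting temporarily and restore it afterwards.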

The bracketed colon indicates that the argument is optional (i.e., there are two versions of the primitive, one taking an argument and one not taking an argument). The bracket is not part of the primitive name. See text for details.
Formerly called _HirsoryNumber[:].


32 Interfacing with other languages


This chapter describes how to access objects and call routines that are written in languages other than SELF. We will refer to such entities as foreign objects and foreign routines. A typical use would be to make a function found in a C library accessible in SELF. Three steps are necessary to accomplish this:
1. Write and compile a piece of glue code that specifies argument and result types for the foreign routine and how to convert between these types and SELF objects.
2. Link the resulting object code to the SELF virtual machine.
3. Create a function proxy object (actually a foreignFct object) that represents the routine in the SELF world.
Each of these steps is described in detail in the following sections.

32.1 Proxy and fctProxy objects


A foreign object is represented by a proxy object in the SELF world. A proxy object is an object that encapsulates a pointer to the foreign object it represents. In addition to the pointer to the foreign object, the proxy object contains a type seal. A type seal is an immutable value that is assigned to the proxy object when it is created. The type seal is intended to capture type information about the pointer encapsulated in the proxy. For example, proxies representing window objects should have a different type seal than proxies representing event objects. By checking the type seal against an expected value whenever a proxy is opened, many type errors can be caught.

The last property of proxy objects is that they can be dead or live. If an attempt is made to use the pointer in a dead proxy object, an error results (deadProxyError). Proxy objects may be explicitly killed by sending the primitive message _Kill to them. Furthermore, they are automatically killed after reading in a snapshot. This way, problems with dangling references to foreign objects that were not included in the snapshot are avoided.

FctProxy objects are similar to proxy objects: they have a type seal and are either live or dead. However, they represent a foreign routine, rather than a foreign object. A foreign routine can be invoked by sending the primitive messages _Call, _Call:{With:}, _CallAndConvert{With:And:} to the fctProxy representing it. Note that fctProxy objects are low-level. Most, if not all, uses of foreign routines should use the interface provided by foreignFct objects.

Proxies (and fctProxies) can be freely cloned; however, the copy will be dead. Dead proxies are revived when foreign functions returning pointers (for example) are called. The return value of the foreign function together with a type seal is stored into the dead proxy, which is then revived and returned as the result of the foreign routine call.

The motivation for this somewhat complicated approach to returning a proxy is that there will be several different kinds of proxies in a typical SELF system. Different kinds of proxies may have different slots added, so rather than having the foreign routine figure out which kind of proxy to clone for the result, the SELF code calling the foreign routine must construct and pass down an empty (dead) proxy to hold the result. This proxy is called a result proxy and it is the last argument supplied to the foreign function.
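The life cycle of a proxy (dead when created or cloned, revived when a foreign routine fills it, checked against a type seal when opened) can be summarized in a small model. The following Python sketch is an illustration under stated assumptions: the Proxy class, its method names, the make_window routine, and the pointer value are all hypothetical, and a plain string serves as the type seal.

```python
class DeadProxyError(Exception):
    pass


class TypeSealError(Exception):
    pass


class Proxy:
    def __init__(self):
        self._pointer = None
        self._seal = None
        self.live = False  # newly created proxies start out dead

    def revive(self, pointer, seal):
        self._pointer, self._seal, self.live = pointer, seal, True
        return self

    def kill(self):
        self.live = False

    def clone(self):
        return type(self)()  # a copy keeps its kind but is always dead

    def open(self, expected_seal):
        if not self.live:
            raise DeadProxyError("proxy is dead")
        if self._seal != expected_seal:
            raise TypeSealError(f"expected {expected_seal}, got {self._seal}")
        return self._pointer


def make_window(result_proxy):
    """Hypothetical foreign routine: stores its result in the result proxy."""
    return result_proxy.revive(pointer=0x1000, seal="window")


window = make_window(Proxy())  # the caller supplies the dead result proxy
address = window.open("window")
```

Because the caller constructs the result proxy, the foreign routine never needs to know which kind of proxy the SELF side wants back.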


32.2 Glue code


Glue code is responsible for the transition from SELF to foreign routines. It forms wrappers around foreign routines. There is one wrapper per foreign routine. A wrapper takes a number of arguments of type oop, and returns an oop (oop is the C++ type for "reference to SELF object"). When a wrapper is executed, it performs the following steps:
1. Check that the arguments supplied have the correct types.
2. Convert the arguments from SELF representation to the representation that the foreign routine needs.
3. Invoke the foreign routine on the converted arguments.
4. Convert the return value of the foreign routine to a SELF object and return this as the SELF level result.
To make it easier to write glue code, a special purpose language has been designed for this. The result is that glue for a foreign routine will often consist of only a single line. The glue language is implemented as a set of C++ preprocessor macros. Therefore, glue code is just a (rather peculiar) kind of C++. Glue code can be in a file of its own, or, if it is glue for calling C++ routines, it can be in the same file as the foreign routines, and compiled with them. To make the definition of the glue language available, the file containing glue code must contain:
# include "_glueDefs.c.incl"

The file _glueDefs.c.incl includes a bunch of C++ header files that contain all the definitions necessary for the glue. Of the included files, glueDefs.h is probably the most interesting in this context. It defines the glue language and also contains some comments explaining it. Since different foreign languages have different type systems and calling conventions, the glue language is actually not a single language, but one for each supported foreign language. Presently C and C++ are supported. Section 32.5 describes C glue and section 32.6 describes C++ glue.
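The four steps a wrapper performs (check types, convert arguments, invoke, convert the result) are independent of the glue macros themselves. As a language-neutral illustration, the Python sketch below builds such a wrapper by hand. It does not use or imitate the real glue language: make_wrapper, shift_bytes, and the conversion functions are hypothetical names chosen for this example.

```python
def shift_bytes(data, key):
    """Stand-in for a foreign routine that works on raw bytes."""
    return bytes((b + key) % 256 for b in data)


def make_wrapper(routine, arg_types, to_foreign, from_foreign):
    def wrapper(*args):
        # 1. Check that the arguments supplied have the correct types.
        if len(args) != len(arg_types):
            raise TypeError("wrong number of arguments")
        for arg, expected in zip(args, arg_types):
            if not isinstance(arg, expected):
                raise TypeError(
                    f"expected {expected.__name__}, got {type(arg).__name__}"
                )
        # 2. Convert the arguments to the foreign representation.
        converted = [convert(arg) for convert, arg in zip(to_foreign, args)]
        # 3. Invoke the foreign routine on the converted arguments.
        result = routine(*converted)
        # 4. Convert the return value back and return it.
        return from_foreign(result)

    return wrapper


shift_glue = make_wrapper(
    shift_bytes,
    arg_types=[str, int],
    to_foreign=[lambda s: s.encode("ascii"), int],
    from_foreign=lambda b: b.decode("latin-1"),
)

shifted = shift_glue("abc", 1)
```

Keeping the type checks and conversions in the wrapper means the foreign routine itself needs no knowledge of the caller's object representation.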

32.3 Compiling and linking glue code


Since glue code is a special form of C++ code, a C++ compiler is needed to translate it. The way this is done may depend on the computer system and the available C++ compiler. The following description applies to Sun SPARCstations using the GNU g++ compiler. A specific example of how to compile glue code can be found in the directory containing the toself demo (see section 32.8 for further details). The makefile in that directory describes how to translate a .c file containing glue into something that can be invoked from SELF. This is a two stage process: first the .c file is compiled into a .o file, which is then linked (perhaps with other .o files and libraries that the glue code depends on) into a .so file (a so-called dynamic library). While the compilation is straightforward, a couple of issues concerning the linking need to be explained and motivated.

Note that many libraries are already included in the SELF virtual machine (e.g. libc.a) and hence should not be added to the dynamic library.

Linking. Before a foreign routine can be called it must be linked to the SELF virtual machine. The linking can be done either statically, i.e. before SELF is started, or dynamically, i.e. while SELF is running. The SELF system employs dynamic linking. The choice between dynamic and static linking involves a trade-off between safety and flexibility, as outlined in the following.

Dynamic linking has the advantage that it is done on demand, so only foreign routines that are actually used in a particular session will be loaded and take up space. Debugging foreign routines is also easier, especially if the dynamic linker supports unlinking. The main disadvantage of dynamic linking is that more things can go wrong at run time. For example, if an object file containing a foreign routine cannot be found, a run time error occurs. The Sun OS dynamic linker, ld.so, only handles dynamic libraries, which explains why the second stage of glue translation is necessary.

Static linking, the alternative that was not chosen for SELF, has the advantage that it needs to be done only once. The statically linked-in files will then be available forever after. The main disadvantages are that the linked-in files will always take up space whether used or not in a given SELF session, that the VM must be completely relinked every time new code is added, and that debugging is harder because there is no way to unlink code with bugs in it. For these reasons the following examples all use dynamic linking.

32.4 A simple glue example: calling a C++ function


Suppose we have a C++ function that encrypts text strings in some fancy way. It takes two arguments, a string to encrypt and a key, and returns a string which is the result of the encryption. To use this function from SELF, we write a line of C++ glue. Here is the entire file, encrypt.c, containing both the encryption function and the glue:

/* Make glue available by including it. */
# include "...some_path.../_glueDefs.c.incl"

/* Naive encryption function. */
char *encrypt(char *str, int key) {
  static char res[1000];
  int i = 0;
  /* Stop at the end of the string or when res is full. */
  while (str[i] && i < 999) {
    res[i] = str[i] + key;
    i++;
  }
  res[i] = '\0';
  return res;
}

/* Make glue expand to full functions, not just prototypes. */
# define WHAT_GLUE FUNCTIONS
C_func_2(string,, encrypt, encrypt_glue, string,, int,)
# undef WHAT_GLUE

If you try this example, be sure to type in all the "double" commas - they are necessary because of technical details with C++ macros.

A few words of explanation: the last three lines of this file contain the glue code. First, defining WHAT_GLUE to be FUNCTIONS makes the following line expand into a full wrapper function (defining WHAT_GLUE to be PROTOTYPES instead will cause the C_func_2 line to produce a function prototype only). The line containing the macro C_func_2 is the actual wrapper for encrypt. The 2 designates that encrypt takes 2 arguments. The meaning of the arguments, from left to right, is:

- string,: specifies that encrypt returns a string.
- encrypt: name of the function we are constructing a wrapper for.
- encrypt_glue: name that we want the wrapper function to have.
- string,: specifies that the first argument to encrypt is a string.
- int,: specifies that the second argument to encrypt is an int.

Having written this file, we now compile it with the Unix command gluecc encrypt.c. The resulting file, encrypt.o, can be transformed into a shared object with the command glueld encrypt.so encrypt.o (no other files are needed in this simple example). Finally, to try it out, we can type these commands at the SELF prompt:
> _AddSlotsIfAbsent: ( | encrypt | )
lobby
> encrypt: ( foreignFct copyName: 'encrypt_glue' Language: 'C++' Path: 'encrypt.so' )
lobby
> encrypt
<C++ function(encrypt_glue)>
> encrypt value: 'Hello Self' With: 3
'Khoor#Vhoi'
> encrypt value: 'Khoor#Vhoi' With: -3
'Hello Self'

Comparing the signature for the function encrypt with the arguments to the C_func_2 macro, it is clear that there is a straightforward mapping between the two. One day we hope to find the time to write a SELF program that can parse a C or C++ header file and generate glue code corresponding to the definitions in it. In the meantime, glue code must be hand-written.
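To make the mapping more tangible, the following stand-alone C++ sketch shows roughly the kind of work a wrapper such as encrypt_glue has to do: check the incoming SELF-level values, convert them to C values, call the function, and convert the result back. It is not the code that C_func_2 expands into. The SelfValue type, the makeInteger and makeString helpers and encrypt_wrapper_sketch are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a SELF-level value: an integer or a string.
struct SelfValue {
  enum Kind { Integer, String };
  Kind kind;
  long integer;
  std::string text;
};

SelfValue makeInteger(long i) {
  SelfValue v;
  v.kind = SelfValue::Integer;
  v.integer = i;
  return v;
}

SelfValue makeString(const std::string &s) {
  SelfValue v;
  v.kind = SelfValue::String;
  v.integer = 0;
  v.text = s;
  return v;
}

// The function being wrapped (same behaviour as encrypt in the text).
const char *encrypt(const char *str, int key) {
  static char res[1000];
  std::size_t i = 0;
  while (str[i] != '\0' && i < sizeof(res) - 1) {
    res[i] = static_cast<char>(str[i] + key);
    ++i;
  }
  res[i] = '\0';
  return res;
}

// Hypothetical wrapper: returns false (think badTypeError) if an argument
// has the wrong kind, otherwise stores the converted result in *result.
bool encrypt_wrapper_sketch(const SelfValue &text, const SelfValue &key,
                            SelfValue *result) {
  if (text.kind != SelfValue::String) return false;  // first argument: string,
  if (key.kind != SelfValue::Integer) return false;  // second argument: int,
  const char *c_result = encrypt(text.text.c_str(), static_cast<int>(key.integer));
  *result = makeString(c_result);                    // result: string,
  return true;
}
```

Calling encrypt_wrapper_sketch with the string Hello Self and the integer 3 yields Khoor#Vhoi, matching the session shown above; passing an integer where the string is expected makes the wrapper refuse the call.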

32.5 C glue
C glue supports accessing C functions and data from SELF. There are three main parts of C glue:

- Calling functions.
- Reading/assigning global variables.
- Reading/assigning a component in a struct that is represented by a proxy object in SELF.


In addition, C++ glue for creating objects can be used to create C structs (see section 32.6). The following sections describe each of these parts of C glue.

32.5.1 Calling C functions

The macro C_func_N, where N is 0, 1, 2, ..., is used to glue in a C function. The number N denotes the number of arguments that should be given at the SELF level when calling the function. This number may be different from the number of arguments that the C function takes since, e.g., some argument conversions (see below) produce two C arguments from one SELF object. Here is the general syntax for C_func_N:
C_func_N(res_cnv,res_aux, fexp, gfname, fail_opt, c0,a0, ... cN,aN)

Compare this with the glue that was used in the encrypt example in section 32.4:
C_func_2(string,, encrypt, encrypt_glue,, string,, int,)

The meaning of each argument to C_func_N is as follows:

- res_cnv,res_aux: these two arguments form a conversion pair that specifies how the result that the function returns is converted to a SELF object. In the encrypt example, where the function returns a null terminated string, res_cnv has the value string, and res_aux is empty. Table 5 lists all the possible values for the res_cnv,res_aux pair.
- fexp: a C expression which evaluates to the function that is being glued in. In the simplest case, such as in the encrypt example, the expression is the name of a function, but in general it may be any C expression, involving function pointers etc., which in a global context evaluates to a function.
- gfname: the name of the function which the C_func_N macro expands into. In the encrypt example, the convention of appending _glue to the C function's name was used. When accessing a glued-in function from SELF, the value of gfname is the name that must be used.
- fail_opt: there are two possible values for this argument. It can be empty (as in the example) or it can be fail. In the latter case, the C function being called is passed an additional argument that will be the last argument and have type void *. Using this argument, the C function may abort its execution and raise an exception. The result is that the IfFail block in SELF will be invoked.
- ci,ai: each of these pairs describes how to convert a SELF level argument to one or more C level arguments (the any conversion is the lone exception: it takes two SELF objects and produces one C argument). For example, in the glue for encrypt, c0,a0 specifies that the first argument to encrypt is a string. Likewise c1,a1 specifies that the second argument is an integer. Note that in both these cases, the a-part of the conversion is empty. Table 4 lists all the possible values for the ci,ai pair.

Handling failures. Here is a slight modification of the encryption example to illustrate how the C function can raise an exception that causes the IfFail block to be invoked at the SELF level:


/* Make glue available by including it. */
# include "...some_path.../_glueDefs.c.incl"

/* Naive encryption function. */
char *encrypt(char *str, int key, void *FH) {
  static char res[1000];
  int i = 0;
  if (key == 0) {
    failure(FH, "key == 0 is identity map");
    return NULL;
  }
  /* Stop at the end of the string or when res is full. */
  while (str[i] && i < 999) {
    res[i] = str[i] + key;
    i++;
  }
  res[i] = '\0';
  return res;
}

/* Make glue expand to full functions, not just prototypes. */
# define WHAT_GLUE FUNCTIONS
C_func_2(string,, encrypt, encrypt_glue, fail, string,, int,)
# undef WHAT_GLUE

Observe that the fail_opt argument now has the value fail and that the encrypt function raises an exception, using failure, if the key is 0. There are two ways to raise exceptions:
extern "C" void failure(void *FH, char *msg);
extern "C" void unix_failure(void *FH, int err = -1);

In both cases, the FH argument is the failure handle that was passed by the C_func_N macro. The second argument to failure is a string. It will be passed to the IfFail block in SELF. unix_failure takes an optional integer as its second argument. If this integer has the value -1, or is missing, the value of errno is used instead. The integer is interpreted as a Unix error number, from which a corresponding string is constructed. The string is then, as for failure, passed to the IfFail block at the call site in SELF. A word of warning: after calling failure or unix_failure, the function must still return normally. The value returned (NULL in the example) is ignored.

32.5.2 Reading and assigning global variables

Reading the value of a global variable is done using the C_get_var macro. Assigning a value to a global variable is done using C_set_var. Both macros expand into a C++ function that converts between SELF and C representation, and reads or assigns the variable. Here is the general syntax:

C_get_var(cnvt_res,aux_res, expr, gfname)
C_set_var(var, expr_c0,expr_a0, gfname)

A concrete example is reading the value of the variable errno, which can be done using:
C_get_var(int,, errno, get_errno_glue)
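As a rough picture of "a macro that expands into a function", the sketch below defines two much simplified macros in plain C++. They are hypothetical stand-ins, not the real C_get_var and C_set_var: they take a C type instead of a conversion pair and do no SELF-level conversion at all. They only show that the expression being read may be more than a variable name and that the assignment target may be any l-value.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical, simplified stand-ins for C_get_var / C_set_var:
// each use expands into an ordinary function named gfname.
#define SKETCH_GET_VAR(ctype, expr, gfname) \
  ctype gfname() { return (expr); }
#define SKETCH_SET_VAR(var, ctype, gfname) \
  void gfname(ctype value) { (var) = value; }

static int counters[3] = {0, 0, 0};

// Reading a plain global variable.
SKETCH_GET_VAR(int, errno, get_errno_sketch)

// Reading any expression that is valid in a global context.
SKETCH_GET_VAR(int, counters[0] + counters[1], get_sum_sketch)

// Assigning to any expression that evaluates to an l-value.
SKETCH_SET_VAR(counters[1], int, set_second_sketch)
```

After set_second_sketch(7), get_sum_sketch() returns 7, and get_errno_sketch() returns whatever errno currently holds.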

The meaning of each argument is:

- cnvt_res,aux_res: how to convert the value of the global variable that is being read to a SELF object. In the errno example, cnvt_res is int and aux_res is empty, since the type of errno is int. The cnvt_res,aux_res pair can be any one of the result conversions found in Table 5.
- expr: the variable whose value is being read. In the errno example, it is simply errno, but in general it may be any expression that is valid in a global context, even an expression involving function calls.
- gfname: the name of the C++ function that C_get_var or C_set_var expands into.
- var: the name of a global variable that a value is assigned to. In general, var may be any expression that in a global context evaluates to an l-value.
- expr_c0,expr_a0: when assigning to a variable, the value it is assigned is obtained by converting a SELF object to a C value. The expr_c0,expr_a0 pair, which can be any one of the argument conversions listed in Table 4, specifies how to do this conversion.

32.5.3 Reading and assigning struct components

Reading the value of a struct component or assigning a value to it is similar to doing the same operations on a global variable. The difference is that the struct must somehow be specified. This is taken care of by the macros C_get_comp and C_set_comp. The general syntax is:

C_get_comp(cnvt_res,aux_res, cnvt_strc,aux_strc, comp, gfname)
C_set_comp(cnvt_strc,aux_strc, comp, expr_c0,expr_a0, gfname)

Here is an example, assigning to the sin_port field of a struct sockaddr_in (this struct is defined in /usr/include/netinet/in.h). The struct is represented by a proxy object:

char *socks = "type seal for sockaddr_in proxies";
C_set_comp(proxy,(sockaddr_in *,socks), .sin_port, short,, set_sin_port_glue)

The sockaddr_in example defines a function, set_sin_port_glue, which can be called from SELF. The function takes two arguments, the first being a proxy representing a sockaddr_in struct, the second being an integer. After converting types, set_sin_port_glue performs the assignment
(*first_converted_arg).sin_port = second_converted_arg.
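The effect of set_sin_port_glue can be pictured with the following stand-alone sketch. It models a proxy as a pointer plus a type seal and performs the checks that the proxy argument conversion is documented to report (badTypeSealError, deadProxyError, nullPointerError) before doing the assignment. Everything here is an assumption made for illustration: ProxySketch, SockAddrSketch and set_sin_port_sketch are hypothetical names, the order of the checks is a guess, and seals are compared by pointer identity only because that is the simplest choice.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for struct sockaddr_in, reduced to one field.
struct SockAddrSketch {
  short sin_port;
};

// Hypothetical model of a proxy: a C pointer, a type seal and a liveness flag.
struct ProxySketch {
  void *pointer;
  const char *seal;
  bool live;
};

enum ConversionError { noError, badTypeSealError, deadProxyError, nullPointerError };

const char *socks = "type seal for sockaddr_in proxies";

// Sketch of set_sin_port_glue: convert both arguments, then assign.
ConversionError set_sin_port_sketch(const ProxySketch &proxy, short port) {
  if (!proxy.live) return deadProxyError;
  if (proxy.seal != socks) return badTypeSealError;  // assumed: identity comparison
  if (proxy.pointer == NULL) return nullPointerError;
  SockAddrSketch *first_converted_arg =
      static_cast<SockAddrSketch *>(proxy.pointer);
  (*first_converted_arg).sin_port = port;
  return noError;
}
```

A proxy carrying a different seal is rejected before the pointer is ever dereferenced, which is the point of type seals: they keep a pointer to one kind of struct from being used as another.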

In general, the meaning of the C_get_comp and C_set_comp arguments is:

- cnvt_res,aux_res: how to convert the value of the component that is being read to a SELF object. Any of the result conversions found in Table 5 may be applied.
- cnvt_strc,aux_strc: the conversion that is applied to produce a struct upon which the operation is performed. In the sin_port example, this conversion is a proxy conversion, implying that in SELF, the struct whose sin_port component is assigned is represented by a proxy object. In general, any of the argument conversions from Table 4 that results in a pointer may be used.
- comp: the name of the component to be read or assigned. In the sin_port example, this name is .sin_port. Note that it includes a leading period. This, e.g., allows handling pointers to ints by pretending that it is a pointer to a struct and operating on a component with an empty name.
- gfname: the name of the C++ function that C_get_comp or C_set_comp expands into.
- expr_c0,expr_a0: when assigning to a component, the value it is assigned is obtained by converting a SELF object to a C value. The expr_c0,expr_a0 pair, which can be any one of the argument conversions listed in Table 4, specifies how to do this conversion.
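The remark about the leading period and the empty component name is easiest to see with a pair of toy macros. The sketch below is a simplified assumption about how a component name can be pasted after a dereferenced pointer; SKETCH_GET_COMP and SKETCH_SET_COMP are hypothetical and do none of the SELF-level conversion that the real macros perform.

```cpp
#include <cassert>

// Hypothetical, simplified component accessors: comp is pasted directly
// after the dereferenced pointer, so it must carry its own leading '.'.
#define SKETCH_GET_COMP(ptr, comp) ((*(ptr))comp)
#define SKETCH_SET_COMP(ptr, comp, value) ((*(ptr))comp = (value))

struct PointSketch {
  int x;
  int y;
};

// Component access on a struct: comp is ".y".
int read_y(PointSketch *p) { return SKETCH_GET_COMP(p, .y); }
void write_y(PointSketch *p, int v) { SKETCH_SET_COMP(p, .y, v); }

// With an empty component name the same macros operate on whatever the
// pointer refers to, for example a plain int.
int read_int(int *p) { return SKETCH_GET_COMP(p, ); }
void write_int(int *p, int v) { SKETCH_SET_COMP(p, , v); }
```

Because the period belongs to the argument rather than to the macro, one macro covers both struct components and plain pointed-to values.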

32.6 C++ glue


Since C++ is a superset of C, all of C glue can be used with C++. In addition, C++ glue provides support for:

- Constructing objects using the new operator.
- Deleting objects using the delete operator.
- Calling member functions on objects.

Each of these parts will be explained in the following sections.

32.6.1 Constructing objects

In C++, objects are constructed using the new operator. Constructors may take arguments. The macros CC_new_N, where N is a small integer, support calling constructors with or without arguments. Calling a constructor is similar to calling a function, so for additional explanation, please refer to section 32.5.1. Here is the general syntax for constructing objects using C++ glue:
CC_new_N(cnvt_res,aux_res, class, gfname, c0,a0, c1,a1, ... cN,aN)

For example, to construct a sockaddr_in object, the following glue statement could be used:
CC_new_0(proxy,(sockaddr_in *,socks), sockaddr_in, new_sockaddr_in)

The meaning of the CC_new_N arguments is as follows:

- cnvt_res,aux_res: the result of calling the constructor is an object pointer. The result conversion pair cnvt_res,aux_res (cf. Table 5) specifies how this pointer is converted to a SELF object before being returned. In the sockaddr example, the proxy result conversion is used.
- class: the name of the class (or struct) that is being instantiated. (sockaddr_in is actually not a C++ class, but a C struct. However, C++ treats structs and classes the same.)
- gfname: the name of the C++ function that the CC_new_N macro expands into.
- ci,ai: if the constructor takes arguments, these arguments must be converted from SELF representation to C++ representation. The argument conversion pairs ci,ai specify how each argument is converted. See Table 4 for a description of all argument conversions. In the sockaddr example, there are no arguments.

32.6.2 Deleting objects

C++ objects can have destructors that are executed when the objects are deleted. To ensure that the destructor is called properly, the delete operator must know the type of the object being deleted. This is ensured by using the CC_delete macro, which has the following form:
CC_delete(cnvt_obj,aux_obj, gfname)

For example, to delete sockaddr_in objects (constructed as in the previous section), the CC_delete macro should be used in this manner:
CC_delete(proxy,(sockaddr_in *,socks), delete_sockaddr_in)
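Why the pointer type has to be spelled out can be seen in plain C++. In the sketch below, which assumes that the pointer reaches the glue in untyped form, the destructor runs only because the pointer is cast back to its real type before delete is applied. delete_sketch and ConnectionSketch are hypothetical names used for illustration; this is not what CC_delete expands into.

```cpp
#include <cassert>

static int destructorCalls = 0;

// A class with an observable destructor.
struct ConnectionSketch {
  ~ConnectionSketch() { ++destructorCalls; }
};

// Hypothetical glue-like helper: the pointer type is fixed by the template
// argument, so delete sees the static type and runs the destructor.
template <class T>
void delete_sketch(void *untyped) {
  delete static_cast<T *>(untyped);
}
```

Calling delete_sketch<ConnectionSketch>(p) on a pointer obtained from new ConnectionSketch increments destructorCalls exactly once.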

In general, the meaning of the arguments given to CC_delete is:

- cnvt_obj,aux_obj: this pair can be any of the argument conversions found in Table 4 that produces a pointer, namely a pointer to the object that will be deleted.
- gfname: the name of the C++ function that this invocation of CC_delete expands into.

32.6.3 Calling member functions

Calling member functions is similar to calling plain functions, so please also refer to section 32.5.1. The difference is that an additional object must be specified: the object upon which the member function is invoked (the receiver in SELF terms). Calling a member function is accomplished using one of the macros
CC_mber_N(cnvt_res,aux_res, cnvt_rec,aux_rec, mname, gfname, fail_opt, c0,a0, c1,a1, ..., cN,aN)

For example, here is how to call the member function zock on a sockaddr_in object given by a proxy:
CC_mber_0(bool,, proxy,(sockaddr_in *,socks), zock, zock_glue,)

The arguments to CC_mber_N are:

- cnvt_res,aux_res: this pair, which can be any of the result conversions from Table 5, specifies how to convert the result of the member function before returning it to SELF. For example, the zock member function returns a boolean. (In fact, there is no such member function defined on sockaddr_in objects; a better example is needed.)
- cnvt_rec,aux_rec: the object on which the member function is invoked is obtained using this argument conversion. Often this will be a proxy conversion as in the zock example.
- mname: the name of the member function. In general, it may be any expression such that receiver->mname evaluates to a function.
- gfname: the name of the C++ function that the CC_mber_N macro expands into.
- fail_opt: whether or not to pass a failure handle to the member function (refer to section 32.5.1 for details).
- ci,ai: these are argument conversion pairs specifying how to obtain the arguments for the member function. Any conversion pair found in Table 4 may be used.
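The role of mname, and the receiver->mname form mentioned above, can be illustrated with a toy macro. SKETCH_MBER_0 below is a hypothetical, heavily simplified stand-in for CC_mber_0: it takes C++ types instead of conversion pairs, has no fail_opt, and performs no SELF-level conversion. SwitchSketch is likewise an invented class.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for CC_mber_0: expands into a function
// named gfname that takes the receiver and invokes receiver->mname().
#define SKETCH_MBER_0(rtype, ctype, mname, gfname) \
  rtype gfname(ctype *receiver) { return receiver->mname(); }

// An invented class with a member function that returns a boolean.
struct SwitchSketch {
  bool on;
  bool zock() {
    on = !on;
    return on;
  }
};

// Produces: bool zock_sketch(SwitchSketch *receiver)
SKETCH_MBER_0(bool, SwitchSketch, zock, zock_sketch)
```

Each call to zock_sketch toggles the receiver and returns its new state, just as a direct call to the member function would.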

32.7 Conversion pairs


A major function of glue code is to convert between SELF objects and C/C++ values. This conversion is guarded by so-called conversion pairs. A conversion pair is a pair of arguments given to a glue macro. It handles converting one or at most a few types of objects/values. There are different conversion pairs for converting from SELF objects to C/C++ values (called argument conversion pairs) and for converting from C/C++ values to SELF objects (called result conversion pairs).

32.7.1 Argument conversions - from SELF to C/C++

An argument conversion is given a SELF object (the any conversion is the only conversion that has more than one incoming object) and performs these actions to produce a corresponding C or C++ value:

- Check that the SELF object it has been given is among the allowed types. If not, report badTypeError (i.e. invoke the failure block, if present, with the argument badTypeError).
- Check that the object can be converted to a C/C++ value without overflow or any other error. If not, report the relevant error.
- Do the conversion, i.e. construct the C/C++ value corresponding to the given SELF object.

Table 4 lists all the available argument conversions. Each row represents one conversion, with the first two columns designating the conversion pair. The third column lists the types of SELF objects that the conversion pair accepts. The fourth column lists the C types that it produces. The fifth column lists the kind of errors that can occur during the conversion. Finally, the sixth column contains references to numbered notes. The notes are found in the paragraphs following the table.
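The three steps can be sketched for a single conversion, here one that produces an unsigned char. The error names are those used in Table 4; the SelfObjectSketch type, the function name and the exact order of the range checks are assumptions made for this illustration, not a description of the virtual machine's code.

```cpp
#include <cassert>
#include <climits>

enum ArgError { argOk, badTypeError, badSignError, overflowError };

// Hypothetical stand-in for an incoming SELF object.
struct SelfObjectSketch {
  bool isInteger;
  long value;
};

// Sketch of an argument conversion that produces an unsigned char.
ArgError to_unsigned_char_sketch(const SelfObjectSketch &obj, unsigned char *out) {
  // Step 1: is the object among the allowed types?
  if (!obj.isInteger) return badTypeError;
  // Step 2: can the value be represented without error?
  if (obj.value < 0) return badSignError;
  if (obj.value > UCHAR_MAX) return overflowError;
  // Step 3: construct the C value.
  *out = static_cast<unsigned char>(obj.value);
  return argOk;
}
```

Only when both checks pass is the output written, so a failed conversion never hands a truncated value to the foreign routine.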


Table 4: Argument conversions - from SELF to C/C++

| Conversion | Second part | SELF type | C/C++ type | Errors | Notes |
|---|---|---|---|---|---|
| bool | | boolean | int, 0 or 1 | badTypeError | |
| char | | integer | char | badTypeError, overflowError | 1 |
| signed_char | | integer | signed char | badTypeError, overflowError | |
| unsigned_char | | integer | unsigned char | badTypeError, overflowError, badSignError | |
| short | | integer | short | badTypeError, overflowError | |
| signed_short | | integer | signed short | badTypeError, overflowError | |
| unsigned_short | | integer | unsigned short | badTypeError, overflowError, badSignError | |
| int | | integer | int | badTypeError | |
| signed_int | | integer | signed int | badTypeError | |
| unsigned_int | | integer | unsigned int | badTypeError, badSignError | |
| long | | integer | long | badTypeError | |
| signed_long | | integer | signed long | badTypeError | |
| unsigned_long | | integer | unsigned long | badTypeError, badSignError | |
| smi | | integer | smi | badTypeError | 2 |
| unsigned_smi | | integer | smi | badTypeError, badSignError | 2 |
| float | | float | float | badTypeError | 3 |
| double | | float | double | badTypeError | 3 |
| long_double | | float | long double | badTypeError | 3 |
| bv | ptr_type | byte vector | ptr_type | badTypeError | 4 |
| bv_len | ptr_type | byte vector | ptr_type, int | badTypeError, badSizeError | 4, 5 |
| bv_null | ptr_type | byte vector/0 | ptr_type | badTypeError | 4, 6 |
| bv_len_null | ptr_type | byte vector/0 | ptr_type, int | badTypeError, badSizeError | 4, 5, 6 |
| cbv | ptr_type | byte vector | ptr_type | badTypeError | 7 |
| cbv_len | ptr_type | byte vector | ptr_type, int | badTypeError, badSizeError | 7 |
| cbv_null | ptr_type | byte vector/0 | ptr_type | badTypeError | 7 |
| cbv_len_null | ptr_type | byte vector/0 | ptr_type, int | badTypeError, badSizeError | 7 |
| string | | byte vector | char * | badTypeError, nullCharError | 8 |
| string_len | | byte vector | char *, int | badTypeError, nullCharError | 5, 8 |
| string_null | | byte vector/0 | char * | badTypeError, nullCharError | 6, 8 |
| string_len_null | | byte vector/0 | char *, int | badTypeError, nullCharError | 5, 6, 8 |
| proxy | (ptr_type, type_seal) | proxy | ptr_type, != NULL | badTypeError, badTypeSealError, deadProxyError, nullPointerError | 9 |
| proxy_null | (ptr_type, type_seal) | proxy | ptr_type | badTypeError, badTypeSealError, deadProxyError | |
| any_oop | | any object | oop | | 10 |
| oop | oop subtype | corr. object | oop (subtype) | badTypeError | 11 |
| any | C/C++ type | int/float/proxy/byte-vector, int | int/float/ptr/ptr | badTypeError, badIndexError, deadProxyError | 12 |

1. The C type char has a system dependent range: either 0..255 or -128..127.
2. The type smi is used internally in the virtual machine (a 30 bit integer).
3. Precision may be lost in the conversion.
4. The second part of the conversion is a C pointer type. The address of the first byte in the byte vector, cast to this pointer type, is passed to the foreign routine. It is the responsibility of the foreign routine not to go past the end of the byte vector. The foreign routine should not retain pointers into the byte vector after the call has terminated. Note: canonical strings cannot be passed through a bv conversion (badTypeError will result). This guarantees that they are not accidentally modified by a foreign function.
5. This conversion passes two values to the foreign routine: a pointer to the first byte in the byte vector, and an integer which is the length of the byte vector divided by sizeof(*ptr_type). If the size of the byte vector is not a multiple of sizeof(*ptr_type), badSizeError results.
6. In addition to accepting a byte vector, this conversion accepts the integer 0, in which case a NULL pointer is passed to the foreign routine.
7. The cbv conversions are like the bv conversions except that canonical strings are allowed as actual arguments. A cbv conversion should only be used if it is guaranteed that the foreign routine does not modify the bytes it gets a pointer to.
8. All the string conversions take an incoming byte vector, copy the bytes part, add a trailing null char, and pass a pointer to this copy to the foreign routine. After the call has terminated, the copy is discarded. If the byte vector contains a null char, nullCharError results.
9. The type_seal is an int or char * expression that is tested against the type seal value in the proxy. If the two are different, badTypeSealError results. The special value ANY_SEAL will match the type seal in any proxy. Note that the proxy conversion will fail with nullPointerError if the proxy object it is given encapsulates a NULL pointer.
10. The any_oop conversion is an escape: it passes the SELF object unchanged to the foreign routine.
11. The oop conversion is mainly intended for internal use. The second argument is the name of an oop subtype. After checking that the incoming argument points to an instance of the subtype, the pointer is cast to the subtype.
12. The any conversion is different from all other conversions in that it expects two incoming SELF objects. The actions of the conversion depend on the type of the first object in the following way:
   - If the first object is an integer, the second argument must also be an integer; the two integers are converted to C ints, the second is shifted 16 bits to the left, and they are OR-ed together to produce the result.
   - If the first object is a float, it is converted to a C float and the second object is ignored.
   - If the first object is a proxy, the result is the pointer represented by the proxy, and the second argument is ignored.
   - If the first object is a byte vector, the second object must be an integer which is interpreted as an index into the byte vector; the result is a pointer to the indexed byte.

32.7.2 Result conversions - from C/C++ to SELF

A result conversion is given a C or C++ value of a certain type and performs these actions to produce a corresponding SELF object:

- Check that the C/C++ value can be converted to a SELF object with no overflow or other error occurring. If not, report the error.
- Do the conversion, i.e. construct the SELF object corresponding to the given C/C++ value.

Table 5 lists all the available result conversions. Each row represents one conversion, with the first two columns designating the conversion pair. The third column lists the type of C or C++ value that the conversion pair accepts. The fourth column lists the type of SELF object the conversion produces. The fifth column lists the kind of errors that can occur during the conversion. Finally, the sixth column contains references to numbered notes. The notes are found in the paragraphs following the table.
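The overflow check in the first step of a result conversion can be sketched as follows. The sketch assumes that a SELF integer is a 30 bit signed two's complement value, so that its range is -2^29 to 2^29 - 1; the text only states that smi is a 30 bit integer, so the exact bounds are an assumption. The names from_int_sketch, smiMin and smiMax are hypothetical.

```cpp
#include <cassert>

enum ResultError { resultOk, resultOverflowError };

// Assumed range of a 30 bit signed integer.
const long smiMin = -(1L << 29);
const long smiMax = (1L << 29) - 1;

// Sketch of a result conversion from a C int: fail rather than truncate.
ResultError from_int_sketch(int c_value, long *self_integer) {
  // Step 1: would the value overflow the SELF-level integer?
  if (c_value < smiMin || c_value > smiMax) return resultOverflowError;
  // Step 2: construct the SELF-level value.
  *self_integer = c_value;
  return resultOk;
}
```

Under these assumptions a value such as 120 converts unchanged, while 2^29 is rejected with an overflow error instead of being silently wrapped.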


Table 5: Result conversions - from C/C++ to SELF

| Conversion | Second part | C/C++ type | SELF type | Errors | Notes |
|---|---|---|---|---|---|
| void | | void | integer, 0 | | |
| bool | | int | boolean | | |
| char | | char | integer | | |
| signed_char | | signed char | integer | | |
| unsigned_char | | unsigned char | integer | | |
| short | | short | integer | | |
| signed_short | | signed short | integer | | |
| unsigned_short | | unsigned short | integer | | |
| int | | int | integer | overflowError | |
| signed_int | | signed int | integer | overflowError | |
| unsigned_int | | unsigned int | integer | overflowError | |
| long | | long | integer | overflowError | |
| signed_long | | signed long | integer | overflowError | |
| unsigned_long | | unsigned long | integer | overflowError | |
| smi | | smi | integer | | |
| int_or_errno | n | int | int | overflowError, a unix error | 13 |
| float | | float | float | | 14 |
| double | | double | float | | 14 |
| long_double | | long double | float | | 14 |
| string | | char * | byte vector | nullPointerError | 15 |
| proxy | (ptr_type, type_seal) | ptr_type | proxy | nullPointerError | 15, 16, 20 |
| proxy_null | (ptr_type, type_seal) | ptr_type | proxy | | 16, 20 |
| proxy_or_errno | (ptr_type, type_seal, n) | ptr_type | proxy | a unix error | 16, 17, 20 |
| fct_proxy | (ptr_type, type_seal, arg_count) | ptr_type | fctProxy | nullPointerError | 15, 18, 20 |
| fct_proxy_null | (ptr_type, type_seal, arg_count) | ptr_type | fctProxy | | 18, 20 |
| oop | | oop | corr. object | | 19, 20 |

13. This conversion returns an integer value, unless the integer has the value n (the second part of the conversion; often -1). If the integer is n, the conversion interprets the return value as a Unix error indicator. It then constructs a string describing the error (by looking at errno) and invokes the IfFail block with this string.
14. Precision may be lost.
15. This conversion fails with nullPointerError if attempting to convert a NULL pointer.
16. The ptr_type is the C/C++ type of the pointer. The type_seal is an expression of type int or char *. The conversion constructs a new proxy object, stores the C/C++ pointer in it and sets its type seal to be the value of type_seal.
17. If the pointer is n (often n is NULL), the conversion fails with a Unix error, similar to the way int_or_errno may fail.
18. The fct_proxy, fct_proxy_null and fct_proxy_or_errno conversions are similar to the corresponding proxy conversions. The difference is that they produce a fctProxy object rather than a proxy object. Also, their second part is a triple rather than a pair. The extra component specifies how many arguments the function takes, if called. The special keyword unknownNoOfArgs or any non-negative integer expression can be used here.
19. This conversion is an escape: it passes the C value unchanged to SELF. It is an error to use it if the C value is not an oop.
20. The proxy (fctProxy) object that is returned by these conversions is not created by the glue code. Rather, a proxy (fctProxy) must be passed down from the SELF level. This proxy (fctProxy), a result proxy, will then be side-effected by the glue: the value that the foreign function returns will be stored in the result proxy together with the requested type seal. It is required that the result proxy is dead when passed down (else a liveProxyError results). After being side-effected and returned, the result proxy is live. The result proxy is the last argument of the function that the glue macro expands to.
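Notes 15, 16 and 20 together describe a small protocol for result proxies, which the following sketch tries to capture. ProxySketch and store_result_sketch are hypothetical names, and the order in which the two error conditions are tested is an assumption; the notes do not say which is checked first.

```cpp
#include <cassert>
#include <cstddef>

enum ProxyError { proxyOk, liveProxyError, nullPointerError };

// Hypothetical model of a proxy object.
struct ProxySketch {
  void *pointer;
  const char *seal;
  bool live;
};

// Sketch of a proxy result conversion: the result proxy is passed down
// from the SELF level and filled in here, not created here.
ProxyError store_result_sketch(void *c_result, const char *type_seal,
                               ProxySketch *resultProxy) {
  if (resultProxy->live) return liveProxyError;   // note 20: must be dead
  if (c_result == NULL) return nullPointerError;  // note 15
  resultProxy->pointer = c_result;                // note 16: pointer and seal
  resultProxy->seal = type_seal;
  resultProxy->live = true;                       // note 20: now live
  return proxyOk;
}
```

Passing the same proxy down a second time fails with liveProxyError, which protects a pointer that is still in use from being overwritten.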

32.8 A complete application using foreign functions


This section gives a description of a complete application which uses foreign functions. The application relies on foreignFct objects (a higher level interface to foreign functions than that provided by fctProxy objects); the text that introduced them no longer exists. The aim of this section is to present a realistic and complete example of how foreign functions may be used. The complete source for the example is found in the directory serverDemo which is included in the SELF distribution.


The example used is an application that allows SELF expressions to be easily evaluated by non-SELF processes. With this in place, it becomes possible to start SELF processes from a UNIX prompt (shell) or to specify pipe lines in which some of the processes are SELF processes. For example in
proto% cat someFile | tokenize | sort -r | capitalize | tee lst

it may be the case that the filters tokenize and capitalize perform most of their work in SELF. Likewise, the command
proto% mail

may invoke some fancy mail reader written in SELF rather than /usr/ucb/mail. To see how the above can be accomplished, please refer to Figure 5 below. The left side of the figure shows the external view of a typical UNIX process. It has two files: stdin and stdout (for simplicity we ignore stderr). Stdin is often connected to the keyboard so that characters typed here can be read from the file stdin. Likewise, stdout is typically connected to the console so that the process can display output by writing it to the file stdout. Stdin and stdout can also be connected to regular files, if the process was started with redirection. The right side of Figure 5 shows a two-stage pipe line. Here stdout of the first process is connected to stdin of the second process.

[Figure 5. A single UNIX process and a pipe line. Processes shown: wc (left); ls and wc (right), each with stdin and stdout.]

Figure 6 illustrates a simple trick that in many situations allows SELF processes to behave as if they are full-fledged UNIX processes. A SELF process is represented by a real UNIX process which transparently communicates with the SELF process over a pair of connected sockets. The communication is bi-directional: input to the UNIX process is relayed to the SELF process over the socket connection, and output produced by the SELF process is sent over the same socket connection to the UNIX process which relays it to stdout. The right part of Figure 6 shows how the UNIX/SELF process pair can fit seamlessly into a pipe line.


[Figure 6. A SELF process and how it fits into a pipe line. Processes shown: capitalize (left); ls and capitalize (right), each with stdin and stdout. Each capitalize process is connected to a Self VM evaluating capitalize: stdio.]

Source code that facilitates setting up such UNIX/SELF process pairs is included in the SELF distribution. The source consists of two parts: one being a SELF program (called server), the other being a C++ program (called toself). When the server is started, it creates a socket, binds a name to it and then listens for connections on it. toself establishes connections to the server program. The first line that is transmitted when a connection has been set up goes from toself to the server. The line contains a SELF expression. Upon receiving it, the server forks a new process to evaluate the expression in the context of the lobby augmented with a slot, stdio, that contains a unixFile-like object that represents the socket connection. When the forked process terminates, the socket connection is shut down. The toself UNIX process then terminates. The SELF expression that forms the SELF process is specified on the command line when toself is started. For example, if the server has been started, the following can be typed at the UNIX prompt:
proto% toself stdio writeLine: 5 factorial printString
120
proto% echo something | toself capitalize: stdio
SOMETHING
proto% toself capitalize: stdio
Write some text that goes to stdin of the toself program
WRITE SOME TEXT THAT GOES TO STDIN OF THE TOSELF PROGRAM
More text
MORE TEXT
^D
proto%

If you want to try out these examples, locate the files server.self, socks.so and toself. The path name of the file socks.so is hardwired in the file server.self, so please make sure that it has been set correctly for your system. Then file in the world and type server start & at the SELF prompt. Now you can go back to the UNIX prompt and try out the examples shown above.


32.8.1 Outline of toself

toself is a small C++ program found in the file toself.c. It operates in the three phases outlined above:

1. Try to connect to a well-known port number on a given machine (the function establishConnection does this).
2. Send the command line arguments over the connection established in 1 (the safeWrite call in main does this).
3. While there is more input and the SELF process has not shut down the socket connection, relay from stdin to the socket connection and from the socket connection to stdout (the function relay does this).

32.8.2 Outline of server

The server is a SELF program. It is found in the file server.self. When the server is started, the following happens:

1. Create a socket, bind a name to it and start listening.
2. Loop: accept a connection and fork a new process (both step 1 and 2 are performed by the method server start).

The forked process executes the method server handleRequest which:

a. Reads a line from the connection.
b. Sets up a context with a slot stdio referring to the connection.
c. Evaluates the line read in step a in this context.
d. Closes the connection.

32.8.3 Foreign functions and glue needed to implement server

The server program needs to do a number of UNIX calls to create sockets, bind names to them, etc. The calls needed are socket, bind, listen, accept and shutdown. The first three of these are only called in a fixed sequence, so to make things easier, a small C++ function, socket_bind_listen, that bundles them up in the right sequence has been written. The accept function is more general than what is needed for this application, so a wrapper function, simple_accept, has been written. The result is that the server needs to call only three foreign functions: socket_bind_listen, simple_accept and shutdown. Glue for these three functions and the source for the first two are found in the file socks.c. This file is compiled using gluecc and linked using glueld (see Makefile). The result is a shared object file, socks.so.
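The outlines above can be sketched end to end in a few lines. The following is a minimal Python model, not the code in toself.c, server.self or socks.c: the function names socket_bind_listen and simple_accept mirror the ones described above, while handle_request and toself are hypothetical stand-ins. It assumes a loopback TCP socket, handles a single connection in a thread rather than forking a process per connection, and replaces evaluation of a SELF expression with a hard-wired "capitalize" behaviour.

```python
import socket
import threading

def socket_bind_listen(port, host="127.0.0.1"):
    """Bundle socket(), bind() and listen() in the fixed order the server uses."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    return listener

def simple_accept(listener):
    """accept() reduced to what this application needs: the connection only."""
    conn, _peer_address = listener.accept()
    return conn

def handle_request(conn):
    """Stand-in for the request handler: read a line, act on it, shut down."""
    with conn, conn.makefile("rb") as stdin:
        expression = stdin.readline().decode().strip()   # step a
        if expression == "capitalize: stdio":            # toy stand-in for steps b and c
            for line in stdin:
                conn.sendall(line.upper())
        try:
            conn.shutdown(socket.SHUT_RDWR)              # step d
        except OSError:
            pass  # the peer may already have gone away

def toself(port, expression, input_lines, host="127.0.0.1"):
    """Client side: connect, send the expression, relay input, collect output."""
    with socket.create_connection((host, port)) as conn:   # phase 1
        conn.sendall((expression + "\n").encode())         # phase 2
        for line in input_lines:                           # phase 3, input direction
            conn.sendall((line + "\n").encode())
        conn.shutdown(socket.SHUT_WR)                      # signal end of input
        chunks = []
        while True:                                        # phase 3, output direction
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()

listener = socket_bind_listen(0)          # port 0: let the OS pick a free port
port = listener.getsockname()[1]
worker = threading.Thread(target=lambda: handle_request(simple_accept(listener)))
worker.start()
reply = toself(port, "capitalize: stdio", ["something"])
worker.join()
listener.close()
print(reply, end="")
```

Unlike this sketch, the real toself relays in both directions concurrently, so it also works for interactive input.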
32.8.4 Use of foreign functions in server.self

The server program is implemented using foreignFct objects. There are only a few lines of code directly involved in setting this up. First the foreignFct prototype is cloned to obtain a local prototype, called socksFct, which contains the path for the socks.so file. socksFct is then cloned each time a foreignFct object for a function defined in socks.so is needed. For example, in traits socket, the following method is found:
    ^ copyPort: portNumber = (
        "Create a socket, do bind, then listen."
        | sbl = socksFct copyName: 'socket_bind_listen_glue'. |
        sbl value: portNumber With: deadCopy.
    ).

This method copies a socket object and returns the copy. The local slot sbl is initialized to a foreignFct object. The body of the method simply sends value:With: to the foreignFct object. The first argument is the port number to request for the socket; the second argument is a deadCopy of self (socket objects are proxies and socket_bind_listen returns a proxy, so it must be passed a dead proxy to revive and store the result in, cf. section 32.1).

There are only three uses of foreignFct objects in the server, and in all three cases the foreignFct object is encapsulated in a method as illustrated above. In general, the design of foreignFct objects has been aimed at making their use lightweight. When cloning them, it is only necessary to specify the minimal information: the name of the foreign function. They can be encapsulated in a method, thus localizing the impact of redesigns. The complications of dynamic loading and linking are handled automatically, as is the recovery from dead fctProxies.

Appendix F VM configuration

The SELF system uses UNIX environment variables to configure itself when it starts up. The variables and their meanings are listed in the table below; all sizes are in bytes. If you do not define one of these environment variables, its default value is used.
Table 6 Configuration variables

Name (default value)
    Description

SELFDIR (default: current directory)
    Directory containing the SELF sources, e.g. /usr/name/self/self. Initializes the option primitive _SelfDir.

OLDSIZE (default: 7,000,000)
    Size of old space, the area of memory where long-lived SELF objects are stored (see [Ung84], [Ung86]). This is the primary parameter to adjust depending on the size of your SELF world. It is generally safe to overestimate the size since the unused portions will not occupy real memory. However, this may result in excessive paging or long garbage collection pauses if too much memory is allocated for old space, because a lot of garbage may accumulate in old space, causing internal fragmentation and eventually a very long garbage collection. Too small an OLDSIZE may cause the system to run out of space (and SELF will terminate). A typical warning symptom for this situation is frequent garbage collections that do not free much space. As a rule of thumb, old space should only be about 50-75% full after a garbage collection.

EDENSIZE (default: 400,000)
    Size of eden space, the area containing the most recently allocated SELF objects.

SURVSIZE (default: 300,000)
    Size of each of the survivor spaces (young objects). This size should be at least as big as EDENSIZE to avoid premature tenuring.

CODESIZE (default: 4,000,000)
    Amount of memory reserved to cache compiled code (machine instructions). By varying this parameter (and the next three) you can trade space for (re)compilation overhead. Set CODESIZE to the desired value and then adjust PICSIZE, DEPSIZE, and DEBUGSIZE accordingly so that none of the latter memory areas overflows before the code cache is full. (The exact ratio between CODESIZE, PICSIZE, DEPSIZE, and DEBUGSIZE varies from application to application.)

PICSIZE (default: 720,000)
    Amount of memory reserved for polymorphic inline caches, a special form of inline caches designed to speed up polymorphic sends.

DEPSIZE (default: 6,100,000)
    Amount of memory used to record dependencies between source code and compiled code.

DEBUGSIZE (default: 3,100,000)
    Amount of memory used to hold debugging information for the compiled code.

As a rule of thumb, your machine should have at least 3 MB + CODESIZE + PICSIZE + OLDSIZE/2 + EDENSIZE + SURVSIZE of real memory available to SELF for good performance (i.e. roughly 8 MB for the default values shown above). Running under the new compiler may require a few more megabytes. More is better, of course. Turn on the system monitor while using the system to determine the correct sizes and to detect problems (see Appendix G for details).

The amount of virtual memory used by SELF is at least 4 MB + CODESIZE + PICSIZE + DEPSIZE + DEBUGSIZE + OLDSIZE + EDENSIZE + 2*SURVSIZE; be sure to configure your system accordingly. (Many UNIX systems are preconfigured with relatively little swap space; see section 1.4 in the User Manual.)
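The two memory formulas can be evaluated mechanically. The sketch below is an illustration, not part of the SELF system: the helper names load_sizes, real_memory_estimate and virtual_memory_minimum are hypothetical, the defaults are copied from the table of configuration variables, 1 MB is assumed to mean 1,000,000 bytes, and the environment values are assumed to be plain decimal integers.

```python
import os

# Defaults copied from the configuration table (all sizes in bytes).
DEFAULTS = {
    "OLDSIZE": 7_000_000,
    "EDENSIZE": 400_000,
    "SURVSIZE": 300_000,
    "CODESIZE": 4_000_000,
    "PICSIZE": 720_000,
    "DEPSIZE": 6_100_000,
    "DEBUGSIZE": 3_100_000,
}
MB = 1_000_000  # assumption: decimal megabytes

def load_sizes(environ=os.environ):
    """Use the environment variable if defined, otherwise the default."""
    return {name: int(environ.get(name, default)) for name, default in DEFAULTS.items()}

def real_memory_estimate(s):
    """3 MB + CODESIZE + PICSIZE + OLDSIZE/2 + EDENSIZE + SURVSIZE."""
    return (3 * MB + s["CODESIZE"] + s["PICSIZE"] + s["OLDSIZE"] // 2
            + s["EDENSIZE"] + s["SURVSIZE"])

def virtual_memory_minimum(s):
    """4 MB + CODESIZE + PICSIZE + DEPSIZE + DEBUGSIZE + OLDSIZE + EDENSIZE + 2*SURVSIZE."""
    return (4 * MB + s["CODESIZE"] + s["PICSIZE"] + s["DEPSIZE"] + s["DEBUGSIZE"]
            + s["OLDSIZE"] + s["EDENSIZE"] + 2 * s["SURVSIZE"])

sizes = load_sizes({})                      # empty environment: defaults only
real_bytes = real_memory_estimate(sizes)
virtual_bytes = virtual_memory_minimum(sizes)
print(real_bytes, virtual_bytes)
```

With the table's defaults the first formula comes out somewhat above the "roughly 8 MB" quoted in the text, so both numbers are best read as rough guidance; passing a dictionary such as {"OLDSIZE": "12000000"} to load_sizes shows the effect of a larger old space.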

Appendix G The system monitor


The SELF system contains a system monitor to display information about the internal workings of the system such as memory management and compilation. It is invoked with _Spy: true. When it is active, the system monitor takes over the bottom portion of your screen:

[Figure: the system monitor display, with labelled regions for the indicators, the VM memory display, object memory, and the code cache.]

Caution: the system monitor always writes to /dev/fb, the console's frame buffer. If you run SELF on another machine (e.g. via rlogin), the system monitor will write to the screen of the remote machine, not your own screen.

The indicators in the left part of the display correspond to various internal activities and events. On the very left are the CPU bars which show how much CPU is used in various parts of the system. The following table lists the individual indicators:
Table 7 The system monitor display: indicators

CPU Bar
    What It Means

VM
    CPU time spent executing in the VM, i.e. for primitives, garbage collection etc.
Lkup
    CPU time used by compile-time and run-time lookups.
Comp
    CPU time spent by the SELF compilers. The black part stands for time consumed by the NIC, the gray part for the new compiler.
Self
    CPU time spent executing compiled SELF code. The black part stands for time consumed by unoptimized code, the gray part for optimized code.
CPU
    This bar displays the percentage of the CPU that the SELF process is getting (a completely filled bar equals 100% CPU utilization by SELF). Black stands for user time, gray for system time.
Dot
    Below the CPU bar is a small dot which moves whenever a process switch takes place.

Indicator
    What It Means

X-compiling YYY
    The X compiler (where X is either new or nic) is compiling the method named YYY into machine code.
scavenge
    The SELF object memory is being scavenged. A scavenge is a fast, partial garbage collection (see [Ung84], [Ung86], [Lee88]).
GC
    The SELF object memory is being fully garbage-collected.
flushing
    SELF is flushing the code cache.
compacting
    SELF is compacting the code cache.
reclaiming
    SELF is reclaiming space in the code cache to make room for a new method.
sec reclaim
    SELF is flushing some methods in the code cache because there is not enough room in one of the secondary caches (the caches holding the debugging and dependency information).
ic flush
    SELF is flushing all inline caches.
LRU sweep
    SELF is examining methods in the code cache to determine whether they have been used recently.
page N
    N page faults occurred during the last time interval (N is not displayed if N=1). The time interval currently is 1/25 of a second.
read
    SELF is blocked reading from a slow device, e.g. the keyboard or mouse.
write
    SELF is blocked writing to a slow device, e.g. the screen.
disk in/out
    SELF is doing disk I/O.
UNIX
    SELF is blocked in some UNIX system call other than read or write.
idle
    SELF has nothing to do. (Shows up only when using processes.)

The middle part of the display contains some information on VM memory usage displayed in textual form, as described below:
Table 8 VM memory status information

Name
    Description

RSRC
    Size and utilization of the resource area (an area of memory used for temporary storage by the compiler and by primitives).
C-Heap
    Number of bytes allocated on the C heap by SELF (excluding the memory and code spaces and the resource area). Bitmaps used by the pixrect primitives, for example, are allocated on the C heap.

The memory status portion of the system monitor consists of bars representing memory spaces and their utilization; all bars are drawn to scale relative to one another, their areas being proportional to the actual sizes of the memory spaces. The next table explains the details of this part of the system monitor's display. (Appendix F explains how to change the size of these memory spaces.)

Table 9 The system monitor display: memory status

Space
    Description

object memory
    The four bars represent (from top to bottom) eden, the two survivor spaces, and old space. The left and right parts of the bar represent the space used by plain objects and byte vectors, respectively. The above picture shows a situation in which about half of old space is filled with plain objects and about 15% is filled with byte vectors. A small fraction of old space's used portions is currently paged out (gray areas).
code cache
    These four bars represent the cache holding compiled methods with their associated debugging and dependency information. The cache represented by the leftmost bar contains the actual machine code for methods (including some headers and relocation information), the cache represented by the middle bar contains dependency information for the compiled methods, and the cache represented by the rightmost bar contains the debugging information. The three-way split reduces the working set size of the code cache. The cache represented by the small bar sitting on top of the leftmost bar contains polymorphic inline caches.

Color
    Meaning

black
    Allocated, residing in real memory.
gray
    Allocated, paged out.
white
    Unallocated memory.

The segregation of (the vector of bytes in) byte vectors from other objects is an implementation detail improving scavenging and scanning performance (see [Lee88] and [CUL89] for details).


The residency information is updated only once a second for efficiency reasons; all other information is updated continuously.


Appendix H Primitives
Primitives are SELF methods implemented by the virtual machine. The first character of a primitive's selector is an underscore (_). You cannot define primitives yourself, nor can you define slots beginning with an underscore.

H.1 Primitive failures


Every primitive call can take an optional argument defining how errors should be handled for this call. To do this, the primitive is extended with an IfFail: argument. For example, _AsObject becomes _AsObjectIfFail:, and _IntAdd: becomes _IntAdd:IfFail:.

    > 3 _IntAdd: 'a' IfFail: [ | :error. :name | (name, ' failed with ', error, '.') printLine. 0 ]
    _IntAdd: failed with badTypeError.
    0        The primitive returns the result of evaluating the failure block.
    >

If a primitive fails and the primitive call has an IfFail: part, the message value:With: is sent to the IfFail: argument, passing two strings: the name of the primitive and an error string indicating the reason for failure. If the failing primitive call does not have an IfFail: part, the message primitive:FailedWith: is sent to the receiver of the primitive call with the same two strings as arguments.

The result returned by the error handler becomes the result of the primitive operation (0 in our example); execution then continues normally. If you want the program to be aborted, you have to do this explicitly within the error handler, for example by calling the standard error: method defined in the default world.

The following table lists the error string prefixes passed by the VM to indicate the reason for the primitive failure. If the error string consists of more than the prefix, the remainder gives more details about the error.
Table 10 Primitive failures

Prefix
    Description

primitiveNotDefinedError
    Primitive not defined.
primitiveFailedError
    General primitive failure (for example, an argument has an invalid value).
badTypeError
    The receiver or an argument has the wrong type.
badTypeSealError
    Proxy's type seal did not match expected type seal.
divisionByZeroError
    Division by zero.
overflowError
    Integer overflow. This can occur in integer arithmetic primitives or in UNIX (when the result is too large to be represented as an integer).
badSignError
    Integer receiver or argument has wrong sign.
alignmentError
    Bad word alignment in memory.
badIndexError
    The vector index (e.g. in _At:) is out of bounds (too large or negative).
badSizeError
    An invalid size of a vector was specified, e.g. attempting to clone a vector with a negative size (see _Clone:Filler: and _CloneBytes:Filler: below).
reflectTypeError
    A mirror primitive was applied to the wrong kind of slot, e.g. _MirrorParentGroupAt: to a slot that isn't a parent slot.
outOfMemoryError
    The result of an enumeration primitive was too large and could not be allocated.
stackOverflowError
    The stack overflowed during execution of the primitive or program.
slotNameError
    Illegal slot name.
argumentCountError
    Wrong number of arguments.
parentPriorityError
    Illegal parent priority.
unassignableSlotError
    This slot cannot be assignable.
lonelyAssignmentSlotError
    Assignment slot must have a corresponding data slot.
illegalPrivacyError
    Illegal privacy specification.
parallelTWAINSError
    Cannot invoke TWAINS primitive (another process is already using it).
noProcessError
    This process does not exist.
noActivationError
    This method activation does not exist.
noReceiverError
    This activation has no receiver.
noParentSlot
    This activation has no lexical parent.
noSenderSlot
    This activation has no sender slot.
deadProxyError
    This proxy is dead and cannot be used.
liveProxyError
    This proxy is live and cannot be used to hold a proxy result.
wrongNoOfArgsError
    Wrong number of arguments was supplied with call of foreign function.
nullPointerError
    Foreign function returned null pointer.
nullCharError
    Cannot pass byte vector containing null char to foreign function expecting a string.
prematureEndOfInputError
    Premature end of input during parsing.
noDynamicLinkerError
    Primitive depends on dynamic linker which is not available in this system.
EPERM, ENOENT, ...
    These errors are returned by a UNIX primitive if a UNIX system call executed by the primitive fails. The UNIX error codes are defined in /usr/include/sys/errno.h; see this file for details on the roughly 90 different UNIX error codes.

The _ErrorMessage primitive, sent to an error string returned by any primitive, returns a more descriptive version of the error message; this is especially useful for UNIX errors.
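The failure protocol described in this section can be modelled in a few lines. The following Python sketch is an analogy, not VM code: call_primitive, int_add and PrimitiveError are hypothetical names, the failure block receives its arguments in the order used by the block in the example above (error, then name), and the fallback handler that raises RuntimeError merely stands in for whatever the receiver's primitive:FailedWith: method would do.

```python
class PrimitiveError(Exception):
    """Carries an error string prefix such as 'badTypeError'."""

def int_add(receiver, argument):
    """Toy version of _IntAdd: with the two failure modes named in the text."""
    if not isinstance(receiver, int) or not isinstance(argument, int):
        raise PrimitiveError("badTypeError")
    result = receiver + argument
    if not -2**29 <= result <= 2**29 - 1:
        raise PrimitiveError("overflowError")
    return result

def abort_handler(name, error):
    """Stand-in for primitive:FailedWith: on the receiver (assumed to abort)."""
    raise RuntimeError(f"{name} failed with {error}")

def call_primitive(name, primitive, receiver, argument,
                   if_fail=None, primitive_failed_with=abort_handler):
    try:
        return primitive(receiver, argument)
    except PrimitiveError as failure:
        error = str(failure)
        if if_fail is not None:
            # An IfFail: part is present: its result becomes the primitive's result.
            return if_fail(error, name)
        # No IfFail: part: the receiver's handler decides what happens.
        return primitive_failed_with(name, error)

log = []

def fail_block(error, name):
    log.append(f"{name} failed with {error}.")
    return 0

result = call_primitive("_IntAdd:", int_add, 3, "a", if_fail=fail_block)
print(log[0], result)
```

Execution continues normally after the handler returns; the sketch aborts only when no failure block is supplied, and only because the stand-in handler chooses to raise.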

H.2 Available primitives


The following tables list the primitives currently defined in the SELF system. Most primitives have wrappers written in SELF, so programs do not normally call them directly.
Table 11 Cloning primitives

Name
    Description

_Clone
    Return a clone (a shallow copy) of the receiver. Cloning is the only way to create new objects in SELF. Returns its receiver (not a copy) when sent to integers, floats, and canonical strings.
_Clone:Filler:
    Return a clone (shallow copy) of the receiver object vector, possibly resized. The receiver must be an object vector. The first argument (an integer) specifies the length of the new vector, and the second argument specifies the initial value of extra elements if the result vector is longer than the receiver vector. Fails with badSizeError if the first argument is negative. _Clone:Filler: is identical to _Clone if the first argument is the same as the length of the receiver.
_CloneBytes:Filler:
    Analogous to _Clone:Filler:, but for byte vectors. The receiver must be a byte vector, and the second argument must be an integer in the range [0..255]. The integer is used to initialize new elements. Fails with badTypeError if sent to a canonical string.

Table 12 Vector primitives

Name
    Description

_At:
    Return the element of the receiver (an object vector) indexed by the argument (an integer). Vectors are indexed beginning with 0.
_At:Put:
    Store into an object vector element. The receiver is the object vector, the first argument is the integer index, and the second argument is the object to be stored. Returns the receiver.
_Size
    Returns an integer, the number of elements in the receiver object vector.
_ByteAt:
    Analogous to _At:, but for byte vectors. Returns an integer in the range [0..255].
_ByteAt:Put:
    Store into a byte vector element; analogous to _At:Put:. The value to be stored must be an integer in the range [0..255]. Fails if the receiver is a canonical string.
_ByteSize
    Analogous to _Size, but for byte vectors.
_CFloatDouble:At:
    The receiver must be a byte vector, the first argument a boolean and the second argument an integer index. The return value is a float obtained by interpreting the bytes in the byte vector starting at the given index as either a C float (if the boolean is false) or a C double (if the boolean is true). Note that precision may be lost. The number of bytes that a C float and a C double occupy is implementation dependent. The actual sizes can be found using the _BitSize primitive.
_CFloatDouble:At:Put:
    Analogous to _CFloatDouble:At: but allows storing a floating-point value at a given index in the byte vector. The last argument is the value to be stored. Returns the receiver.
_CSignedIntSize:At:
    The receiver must be a byte vector, the first argument an integer and the second argument an integer index. The return value is an integer obtained by interpreting the bytes in the byte vector starting at the given index as a C int type. The first argument gives the size of this integer in bits. Not all bit sizes are supported, but the bit sizes corresponding to the C types char, short, int and long are guaranteed to be valid. Note: may fail with overflow error.
_CSignedIntSize:At:Put:
    Analogous to _CSignedIntSize:At: but allows storing an integer value at a given index in the byte vector. The last argument is the value to be stored. Returns the receiver.
_CUnsignedIntSize:At:
    Analogous to _CSignedIntSize:At: but interprets the bytes in the byte vector starting at the given index as a C unsigned int type.
_CUnsignedIntSize:At:Put:
    Analogous to _CUnsignedIntSize:At: but allows storing an integer value at a given index in the byte vector. The last argument is the value to be stored. Returns the receiver.

Since strings are special kinds of byte vectors, primitives taking byte vectors as arguments can usually take strings. The exception is that canonical strings cannot be passed to primitives that modify the object.
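The C-typed accessors in the table have a close analogue in Python's struct module, which may help in picturing what they do. The sketch below is only an analogy: the function names are hypothetical, a C float is assumed to occupy 4 bytes and a C double 8, native byte order is assumed, and the overflow check models the possibility that a value read from the byte vector does not fit into SELF's 30-bit integer range.

```python
import struct
import sys

INT_MIN, INT_MAX = -2**29, 2**29 - 1   # SELF integer range (30-bit two's complement)

def c_float_double_at(byte_vector, is_double, index):
    """Read a C float (is_double false) or C double (is_double true)."""
    fmt = "=d" if is_double else "=f"
    return struct.unpack_from(fmt, byte_vector, index)[0]

def c_float_double_at_put(byte_vector, is_double, index, value):
    """Store a floating-point value; returns the receiver."""
    struct.pack_into("=d" if is_double else "=f", byte_vector, index, value)
    return byte_vector

def c_int_size_at(byte_vector, bit_size, index, signed=True):
    """Read an integer of the given bit size, signed or unsigned."""
    raw = bytes(byte_vector[index:index + bit_size // 8])
    if len(raw) != bit_size // 8:
        raise IndexError("badIndexError")
    value = int.from_bytes(raw, sys.byteorder, signed=signed)
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError("overflowError")
    return value

buf = bytearray(16)
c_float_double_at_put(buf, True, 0, 1.5)      # C double at index 0
c_float_double_at_put(buf, False, 8, 0.1)     # C float at index 8: precision is lost
buf[12:16] = (-2).to_bytes(4, sys.byteorder, signed=True)

print(c_float_double_at(buf, True, 0))
print(c_float_double_at(buf, False, 8))
print(c_int_size_at(buf, 32, 12))
```

Reading the same four bytes at index 12 as an unsigned 32-bit integer yields a value far outside the 30-bit range, which is the kind of situation in which the overflow error mentioned in the table can arise.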

Table 13 Arithmetic primitives

Name
    Description

_IntAdd:
    Integer addition. Returns sum of receiver and argument. The range of integers is [-2^29..2^29-1], i.e. roughly ±536,000,000 (standard 30-bit two's complement). May fail because of overflow.
_IntSub:
    Integer subtraction. Returns receiver minus argument. May fail because of overflow.
_IntMul:
    Integer multiplication. Returns product of receiver and argument. May fail because of overflow.
_IntDiv:
    Integer division. Returns integer part of receiver divided by argument. May fail because of overflow or division by zero.
_IntMod:
    Integer modulus. Returns receiver modulo argument, with range 0 <= (n _IntMod: m) < abs(m). May fail because of division by zero.
_IntComplement
    Returns bitwise complement (i.e. invert all bits) of receiver.
_IntAnd:
    Returns bitwise AND of receiver and argument.
_IntXor:
    Returns bitwise exclusive OR of receiver and argument.
_IntOr:
    Returns bitwise inclusive OR of receiver and argument.
_IntArithmeticShiftLeft:
    Shift receiver left by the number of bits indicated by the argument (an integer). Will fail with overflowError if the resulting number is too large to be represented as an integer (equivalent to multiplying by a power of 2).
_IntLogicalShiftLeft:
    Bitwise shift receiver left by the number of bits indicated by the argument (an integer). No overflow will occur.
_IntArithmeticShiftRight:
    Arithmetic right shift of receiver by the number of bits indicated by the argument (an integer). The sign bit is preserved.
_IntLogicalShiftRight:
    Logical right shift of receiver by the number of bits indicated by the argument (an integer). 0 is shifted into the most significant bit.
_IntAsFloat
    Return the integer receiver converted to a float.
_FloatAdd:
    Floating-point add. Returns sum of receiver and argument. The range of floating-point numbers currently is approximately [-4.3*10^-9..4.3*10^9]; the precision is 6 decimal digits (this is IEEE standard 32-bit float, but with two fewer exponent bits). Does not overflow or underflow (but may go to 0.0 or Inf, IEEE floating-point infinity).
_FloatSub:
    Floating-point subtraction. Returns receiver minus argument. Does not overflow or underflow.
_FloatMul:
    Floating-point multiplication. Returns product of receiver and argument. Does not overflow or underflow.
_FloatDiv:
    Floating-point division. Returns receiver divided by argument. May fail because of division by zero.
_FloatMod:
    Floating-point modulus. Returns receiver modulo argument. If r is (x _FloatMod: y), then 0 <= r < abs(y), and (x-r)/y is an integral number (even though it might not be representable as a SELF integer). May fail because of division by zero.
_FloatFloor
    Return the greatest integral value less than or equal to the floating-point receiver (i.e. rounding towards negative infinity). The result is a floating-point number.
_FloatCeil
    Return the smallest integral value greater than or equal to the receiver (i.e. rounding towards positive infinity). The result is a floating-point number.
_FloatTruncate
    Return the receiver truncated towards zero. The result is a floating-point number.
_FloatRound
    Return the receiver rounded to the nearest integer. .5 is rounded to even, so 1.5 rounds to 2, and 2.5 also rounds to 2. The result is a floating-point number.
_FloatAsInt
    Return the floating-point receiver rounded as in _FloatRound. The result is an integer. May overflow.

Integer arithmetic primitives take integer receivers and arguments; floating-point arithmetic primitives take floating-point receivers and arguments.
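A few of the rules in the arithmetic table are easy to get wrong, so it may help to see them modelled. The Python sketch below is an illustration of the stated semantics, not VM code; all function names are hypothetical, and Python's unbounded integers are range-checked against the 30-bit limits to mimic the overflow failures.

```python
INT_BITS = 30
INT_MIN, INT_MAX = -2**29, 2**29 - 1
MASK = 2**INT_BITS - 1

def check_range(value):
    """Fail like a primitive would if the result needs more than 30 bits."""
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError("overflowError")
    return value

def to_signed(bits):
    """Reinterpret a 30-bit pattern as a two's complement integer."""
    return bits - 2**INT_BITS if bits > INT_MAX else bits

def int_add(a, b):
    return check_range(a + b)

def int_mod(n, m):
    """Result is always in the range 0 <= r < abs(m)."""
    if m == 0:
        raise ZeroDivisionError("divisionByZeroError")
    return n % abs(m)

def int_arithmetic_shift_left(x, k):
    return check_range(x << k)            # same as multiplying by 2**k

def int_arithmetic_shift_right(x, k):
    return x >> k                          # sign bit is preserved

def int_logical_shift_right(x, k):
    return to_signed((x & MASK) >> k)      # zeros enter at the top

def float_round(x):
    return float(round(x))                 # Python's round() also rounds .5 to even

print(int_mod(-7, 3), int_mod(7, -3))
print(int_arithmetic_shift_right(-8, 1), int_logical_shift_right(-8, 1))
print(float_round(1.5), float_round(2.5))
```

The contrast between the two right shifts on a negative receiver is the main point: the arithmetic shift keeps the result negative, while the logical shift produces a large positive number.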

Table 14 Comparison primitives

Name
    Description

_Eq:
    Identity: test if the receiver and the argument are the same object.
_IntEQ:
    Integer equality. If two integers are _IntEQ: then they are _Eq:.
_IntNE:
    Integer inequality.
_IntLT:
    Integer less than.
_IntLE:
    Integer less than or equal.
_IntGT:
    Integer greater than.
_IntGE:
    Integer greater than or equal.
_FloatEQ:
    Floating-point equality. Two floating-point numbers may be _FloatEQ: but not _Eq: (e.g. 0.0 and -0.0).
_FloatNE:
    Floating-point inequality.
_FloatLT:
    Floating-point less than.
_FloatLE:
    Floating-point less than or equal.
_FloatGT:
    Floating-point greater than.
_FloatGE:
    Floating-point greater than or equal.

All comparison primitives return either true or false. Integer comparison primitives take integer receivers and arguments; floating-point comparison primitives take floating-point receivers and arguments.

Table 15 String-related primitives

Name
    Description

_StringCanonicalize
    Return the canonical version of the receiver (a byte vector). All byte vectors containing the same sequence of bytes map to the same canonical string object, and only one canonical string object with a particular sequence of bytes ever exists in the system. Therefore, two canonical strings can be tested for equality efficiently using _Eq: rather than by comparing byte-by-byte. All string literals are canonical strings.
_StringPrint
    Print the characters of the receiver, a byte vector, on stdout. Returns the receiver.
_FloatPrintString
    Return the receiver, a floating-point number, formatted into a canonical string (similar to C's sprintf("%g") format).
_FloatPrintStringPrecision:
    Analogous to _FloatPrintString, but takes an integer argument that specifies the number of digits after the decimal point.

Remember that strings are special kinds of byte vectors.
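The idea behind canonical strings is the same as string interning in other systems. As a rough Python analogy (the table and the function name string_canonicalize are hypothetical, and this says nothing about how the VM stores its canonical strings), a single dictionary is enough to guarantee one object per byte sequence, after which an identity test replaces a byte-by-byte comparison.

```python
_canonical_table = {}

def string_canonicalize(byte_vector):
    """Return the one shared object for this sequence of bytes."""
    key = bytes(byte_vector)
    # setdefault stores the key on first sight and returns the stored object afterwards.
    return _canonical_table.setdefault(key, key)

first = bytearray(b"printString")
second = bytearray(b"print") + bytearray(b"String")

# Two distinct mutable byte vectors with equal contents...
print(first is second, first == second)

# ...map to the very same canonical object, so identity suffices.
print(string_canonicalize(first) is string_canonicalize(second))
```

The canonical objects in this sketch are immutable bytes values, which mirrors the rule that canonical strings cannot be passed to primitives that modify the object.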

Table 16 General mirror primitives

Name
    Description

_Mirror
    Returns a mirror on the receiver (any object). A mirror gives a view of an object that looks like a vector of slots. Mirrors are used to obtain information about aspects of objects that are not directly observable on the SELF level, for example, the names of an object's slots or the source code of a method. The object on which a mirror is created is called its reflectee. The mirror answers all questions by inspecting its reflectee. There are different kinds of mirrors for different kinds of objects (see section 30), but all respond to the same set of primitives described below (with a few exceptions). _Mirror operates by cloning the mirror prototype appropriate for the type of its receiver, installing the receiver as the reflectee of the cloned mirror, and returning the new cloned mirror.
_MirrorReflectee
    Return this mirror's reflectee. Fails if invoked on a mirror on a method.
_MirrorReflecteeIdentityHash
    Return the identity hash of the reflectee of the receiver. See _IdentityHash in Table 23.
_MirrorReflecteeEq:
    Test if the receiver and argument are mirrors on the same object. See _Eq: in Table 14.
_MirrorSize
    Return the number (an integer) of slots in the reflectee.
_MirrorNameAt:
    Return the name of the specified slot of the reflectee. The argument, an integer, specifies the zero-origin index of the slot.
_MirrorContentsAt:
    Return a mirror on the contents of the specified slot of the reflectee. The argument, an integer, specifies the zero-origin index of the slot.
_MirrorIsParentAt:
    Test if the specified slot is a parent slot. The argument, an integer, specifies the zero-origin index of the slot.
_MirrorParentGroupAt:
    Return the parent group (priority) of the specified parent slot. (The parent priority is a positive integer equalling the number of asterisks used when defining the slot.) The argument, an integer, specifies the zero-origin index of the slot. Fails if the slot isn't a parent slot.
_MirrorIsAssignableAt:
    Test if the specified slot is assignable. The argument, an integer, specifies the zero-origin index of the slot.
_MirrorIsArgumentAt:
    Test if the specified slot is an argument slot. The argument, an integer, specifies the zero-origin index of the slot.
_MirrorVisibilityAt:IfPrivate:IfPublic:IfUndeclared:
    Return the one of the last three arguments corresponding to the slot's visibility (privacy declaration).
_MirrorCode
    Sent to a mirror on a method, this returns a mirror on a byte code object representing the source code of the method. Fails if the reflectee isn't a method.

Unless otherwise noted, the receiver of a mirror primitive must be a mirror.
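The "vector of slots" view that mirrors provide can be pictured with a small Python analogy. This is not how SELF mirrors are implemented: the Mirror class and its method names are hypothetical, Python instance attributes stand in for slots, and parent, argument and visibility information has no counterpart here. Like a SELF mirror, the sketch answers every question by inspecting its reflectee at the time of the call.

```python
class Mirror:
    """Presents an object's attributes as an indexable vector of slots."""

    def __init__(self, reflectee):
        self._reflectee = reflectee

    def _slots(self):
        # Objects without attributes (e.g. integers) simply have no slots here.
        return list(getattr(self._reflectee, "__dict__", {}).items())

    def reflectee(self):
        return self._reflectee

    def reflectee_eq(self, other):
        """True if both mirrors reflect the very same object."""
        return self._reflectee is other._reflectee

    def size(self):
        return len(self._slots())

    def name_at(self, index):
        return self._slots()[index][0]

    def contents_at(self, index):
        """Returns a mirror on the slot contents, not the contents themselves."""
        return Mirror(self._slots()[index][1])

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(3, 4)
m = Mirror(p)
print(m.size(), m.name_at(0), m.contents_at(1).reflectee())
```

Because nothing is cached, adding an attribute to p after the mirror was created is immediately visible through m.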

Table 17 Primitives for enumeration

The following enumeration primitives will fail with primitiveFailedError if the limit is negative and with outOfMemoryError if the result vector is too big to be allocated. Enumeration primitives collect objects by scanning the object heap, and thus find dead objects that have not yet been reclaimed as well as live objects. To avoid getting such garbage in the result, perform a full garbage collection prior to the enumeration.

_EnumerateVectorReferencesLimit:
    Receiver is a vector of mirrors on the target objects for which references are sought. Returns a vector of mirrors on objects that refer to any of the target objects. The limit (either a positive integer or floating-point infinity) limits the maximum size of the result vector, to avoid running out of space inadvertently. A limit of floating-point infinity specifies an unlimited enumeration.
_EnumerateVectorImplementorsLimit:
    Receiver is a vector of selectors (strings) for the messages of interest. Returns a vector of mirrors on objects containing a slot that matches some target selector. The argument specifies the maximum size of the result vector. A limit of floating-point infinity specifies an unlimited enumeration.
_EnumerateAllLimit:
    Receiver is any object. Returns a vector containing mirrors on all objects in the system up to the limit specified by the argument. Using a limit of floating-point infinity yields a vector of mirrors on all objects in the entire SELF system.
_MethodPointer
    Receiver is an object vector. If the object vector is a literal vector, it returns a mirror on the method referring to the literal vector; otherwise the primitive fails.
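Python's gc module offers a comparable heap scan, which makes the role of the limit argument and of the preceding full collection easy to try out. The sketch below is an analogy only: enumerate_references is a hypothetical name, it returns the referring objects directly rather than mirrors on them, and a ValueError stands in for the primitiveFailedError raised on a negative limit.

```python
import gc
import math

def enumerate_references(targets, limit=math.inf):
    """Return up to `limit` objects that refer to any of the targets."""
    if limit < 0:
        raise ValueError("primitiveFailedError")
    gc.collect()                      # full collection first, to keep garbage out
    found = []
    for referrer in gc.get_referrers(*targets):
        if referrer is targets:       # skip the argument list itself
            continue
        if len(found) >= limit:       # math.inf means an unlimited enumeration
            break
        found.append(referrer)
    return found

class Target:
    pass

target = Target()
holder = {"slot": target}             # an object that refers to the target

referrers = enumerate_references([target])
print(any(r is holder for r in referrers))
print(len(enumerate_references([target], limit=1)))
```

The exact set of referrers depends on the interpreter state (module namespaces also hold a reference), so only membership of holder and the effect of the limit are meaningful here.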

Table 18 Mirror primitives for programming and debugging


Name Description

_MirrorAtName:Put:Visibility:ParentGroup: Add or change a slot in the receiver mirrors reectee. The arguments are: the name of the slot (a string), a mirror on the new contents, the slot visibility (either ^, _, or ), and the parent group (number of asterisks) of the slot (0 = normal slot, 1 = highest priority parent, etc.). If the slot already exists, its attributes and contents are modied; if it doesnt exist, a new slot is added. Returns the receiver mirror. _MirrorRemoveAt: _MirrorAddSlots: _MirrorDene: _MirrorCodes _MirrorLiterals _MirrorSource Remove the slot at the given slot index (0 = rst slot). Returns the receiver mirror. The mirror version of _AddSlots:. The receiver is a mirror on the object to be changed, the argument is an object containing the slots to be added. The mirror version of _Dene:. The receiver is a mirror on the object to be changed, the argument is an object dening the new contents. Receiver is a mirror on a method. Returns the byte code vector. Receiver is a mirror on a method. Returns the literal vector. Receiver is a mirror on a method. Returns the source code of the method as a string.


_MirrorFile
Receiver is a mirror on a method. Returns the name of the file from which the method was parsed. Methods parsed at the prompt yield <prompt>.

_MirrorLine
Receiver is a mirror on a method. Returns the line on which the method was parsed.

_MirrorEvalute:
Receiver is a mirror. Takes a mirror on a method as argument. Evaluates the method in the context of the reflectee of the receiver.

The following primitives are special to activation mirrors; the receiver must be a mirror on a live activation, i.e. an activation which is currently active in some process.

_MirrorReceiver
Return the receiver of this activation.

_MirrorByteCodePosition
Return the current position within the method. (Future releases will include a way to find the current source position.)

_MirrorExpressionStack
Return a vector containing the values of all expressions of a statement which have been evaluated but not yet consumed by any message send. For example, if an activation were suspended just before sending the + message in the statement i: i + 1, the expression stack would contain i and 1.

_MirrorMethodHolder
Return the activation's method holder, i.e. the object containing the slot whose evaluation created this activation.

_MirrorSelector
Return the name of the activation, i.e. the selector of the slot in which the activation's method is stored.

_MirrorSender
Return a mirror on the sender activation, i.e. the activation which created the receiver activation. This primitive will fail if the activation was created by the VM (e.g. for a doIt method).

_MirrorLexicalParent
If the reflectee of the receiver is a block activation, return the activation corresponding to the lexically enclosing scope. This primitive will fail for method activations, since method activations have no lexically enclosing scope.
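The expression-stack example given for _MirrorExpressionStack (the statement i: i + 1 suspended just before the + send) can be illustrated with a toy stack machine. This is a Python sketch only: the byte code names push_slot, push_literal and send are hypothetical and do not correspond to the real SELF byte code set.

```python
def run_until_send(codes, slots):
    """Execute `codes` until the first send; return the expression stack there."""
    stack = []
    for op, arg in codes:
        if op == "push_slot":
            stack.append(slots[arg])      # value of a slot such as i
        elif op == "push_literal":
            stack.append(arg)             # a literal such as 1
        elif op == "send":
            return stack                  # suspended before the send consumes its operands
    return stack

# Hypothetical byte codes for the right-hand side of  i: i + 1
codes = [("push_slot", "i"), ("push_literal", 1), ("send", "+")]
print(run_until_send(codes, {"i": 41}))   # [41, 1]
```

The two values on the stack are exactly the ones that have been evaluated but not yet consumed by a message send.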

Table 19 Perform primitives

Name
Description

_Perform:{With:}*
This group of perform primitives sends the message named by the first argument (a canonical string) to the receiver, using the With: arguments as the arguments of the performed message, and returns the result. The number of With: parts must correspond to the number of arguments expected by the message named by the first argument. Example: x _Perform: 'foo:Bar:' With: 3 With: 4 has exactly the same semantics as x foo: 3 Bar: 4. Variants of _Perform never evaluate their IfFail: argument upon failure (in the case where the IfFail: extension is used); instead, the error messages described in section 29 are sent if the selector isn't a canonical string or if the wrong number of arguments is passed. This may be changed in a future release.

_PerformResend:{With:}*
This group of primitives performs an undirected resend. The arguments are identical to those of _Perform:. A directed resend cannot currently be performed.

_Perform:DelegatingTo:{With:}*
This group of primitives performs a delegated send: the lookup starts at the object passed as the second argument. The first argument is the message selector, as above. The sender path tiebreaker rule is not applied. This is the only variant of _Perform that cannot be executed using normal message send syntax.

The curly braces followed by a star indicate that With: can occur any number of times or not at all: possible selectors are _Perform:, _Perform:With:, _Perform:With:With:, and so on.
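The rule that the number of With: parts must match the number of arguments expected by the performed selector can be sketched as follows. This is a Python model under stated assumptions, not the VM's implementation: the helper names are hypothetical, and mapping the selector 'foo:Bar:' to a Python method called foo_Bar_ is purely an illustration device.

```python
def selector_arity(selector):
    """Number of arguments a Self selector expects."""
    if ":" in selector:
        return selector.count(":")      # keyword message: one argument per colon
    if selector[0].isalpha() or selector[0] == "_":
        return 0                        # unary message
    return 1                            # binary operator such as '+'

def perform(receiver, selector, *args):
    """Send `selector` to `receiver`, checking the argument count first."""
    if len(args) != selector_arity(selector):
        raise TypeError("wrong number of With: arguments for " + selector)
    # Assumption of this sketch: 'foo:Bar:' maps to a Python method foo_Bar_.
    # Binary operators are counted above but not dispatched here.
    method = getattr(receiver, selector.replace(":", "_"))
    return method(*args)

class Point:
    def foo_Bar_(self, a, b):
        return a + b

print(perform(Point(), "foo:Bar:", 3, 4))   # 7
```

Checking the count before the lookup mirrors the table's statement that a wrong number of arguments is reported as an error rather than passed on to the performed message.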

Table 20 Process primitives

Name
Description

_NewProcessSize:Selector:Arguments:
Returns a new process object which is obtained by cloning the current process object. The new process is not started. The first argument is an integer giving a minimal stack size in bytes; the system may actually allocate more stack space. The right size depends on the future behavior of the process, so some experimentation may be necessary. A good first try could be 64 KB. The last two arguments are analogous to the arguments of _Perform:With:. They determine the first message that the process sends when it is started. The receiver of _NewProcessSize:Selector:Arguments: will also be the receiver of the first message send of the new process.

_ThisProcess
Return the process object of the current process. Ignores receiver.

_AbortProcess
The receiver is a process object. The associated process is aborted; if it was the initial process, control will return to the VM prompt; otherwise, if the aborted process was the current process, the TWAINS primitive will return aborted. Otherwise (if the aborted process isn't the current process), _AbortProcess returns the receiver.

_PrintProcessStack
Print the stack of the process associated with the receiver (a process object). The number of stack frames printed is determined by _StackPrintLimit.

_StackDepth
Return the number of activations on the process stack.

_ActivationAt:
Return a mirror on the activation whose number is given as an argument (0 = most recent activation).

_TWAINS:ResultVector:SingleStep:StopAt:
Transfer and wait for next signal. The first argument is a process object to transfer to; the second argument must be an object vector of size at least _TWAINSResultSize. Ignores receiver. Control is transferred to the indicated process. When control is transferred back, the return value indicates the cause of the transfer: aborted, stackOverflow, nonLifoBlock, yielded or signal. The result vector is used to provide additional information. If the cause is signal, the first element of the result vector is modified to contain the number of signals and the rest of the result vector is modified to contain a list of the signals that accumulated in between returns from the primitive. The possible signals are: sigint, sigquit, sighup, sigwinch, sigio, siguser1, siguser2, sigpipe, sigterm, sigurg, sigchild, sigrealtimer and sigcputimer. The third argument must be either true or false and specifies whether the process is to execute in single-stepping mode. If the argument is true, the process will execute at most one message send before returning with a value of singleStepped. (If a signal occurs before the send could be executed, i.e. if the return value is signal, no step was executed.) The last argument is either nil or a stop activation. In the latter case, the process will stop with a return value of finishedActivation as soon as this activation finishes (the activation must be a live activation of the process). Note that TWAINS may return before this occurs (e.g. because of a signal).


_TWAINSResultSize
Maximum size required of the second argument of _TWAINS:ResultVector:.

_Yield:
Gives up the CPU. Control is returned to the TWAINS process. Ignores receiver.

_BlockSignals
Sent to true or false. Enables/disables signals.

_SetRealTimer
Set real-time interval timer. The receiver is an integer denoting the number of milliseconds per interval.

_SetCPUTimer
Similar to _SetRealTimer, but sets the CPU timer.
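The layout of the TWAINS result vector when the cause is signal (element 0 holds the number of signals, the following elements hold the signals themselves) can be read back as shown below. This is a Python sketch with assumed representations: the function name is hypothetical, and causes and signals are modelled as plain strings.

```python
def decode_twains_result(cause, result_vector):
    """Return the signals recorded in `result_vector`, or [] for other causes."""
    if cause != "signal":
        return []
    count = result_vector[0]                  # element 0: number of signals
    return list(result_vector[1:1 + count])   # elements 1..count: the signals

# Assumed example: a vector of size 5 in which two signals accumulated.
vec = [2, "sigint", "sigio", None, None]
print(decode_twains_result("signal", vec))    # ['sigint', 'sigio']
print(decode_twains_result("yielded", vec))   # []
```

Only the first count entries after element 0 are meaningful; whatever the remaining elements hold is ignored.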

Table 21 Programming primitives

Name
Description

_Print
Print the receiver in a low-level format and return nil (see section 10.3.2 in the User Manual).

_VectorPrintLimit[:]
Controls the number of vector elements printed by _Print. Default: 20.

_RunScript
Read in the file containing a list of SELF expressions and evaluate the expressions (see section 15.1 in the User Manual). The receiver is a string naming the file to be read. _RunScript always returns to the prompt (even if it is invoked within a program). This will be changed in a later release.

_PrintPeriod[:]
Print a period when reading a script file with _RunScript. Default: false.

_PrintScriptName[:]
Print the file name when reading a script file with _RunScript. Default: true.

_SourceDir[:]
Controls the directory where _RunScript reads its files. Default: '.' (current directory) or the value of the SELFDIR environment variable (see Appendix F).

_ReadSnapshot
Read in the snapshot named by the receiver, a byte vector (see section 6 in the User Manual). A snapshot may also be started from the UNIX prompt. _ReadSnapshot evaluates certain expressions before and after it executes; see section 28. _ReadSnapshot necessarily always returns to the prompt, even if invoked within a program.

_WriteSnapshot
Write a snapshot to the file named by the receiver, a byte vector (see section 5.2 in the User Manual). Returns the receiver. _WriteSnapshot evaluates certain expressions before and after it executes; see section 28.

_SnapshotCode[:]
Save compiled code with snapshot. Default: false.

_Define:
Define slots of the receiver (see section 14.3 in the User Manual). _Define: always returns to the prompt (even if it is invoked within a program). This will be changed in a later release.

_AddSlots:
Add all slots of the argument to the receiver (see section 14.1 in the User Manual). _AddSlots: always returns to the prompt (even if it is invoked within a program). This will be changed in a later release.

_AddSlotsIfAbsent:
Same as _AddSlots:, except that existing slots are never changed (see section 14.3 in the User Manual).

Bracketed colons indicate option primitives. The argument is optional (there are two versions of the primitive, one taking an argument and one not taking an argument). The brackets are not part of the primitive name. Option primitives ignore their receiver and return their current value (for the no argument version that queries its state) or their previous value (for the one argument version that sets its state). See section 31.
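The two versions of an option primitive described in this note differ only in what they return. A small Python model may make the convention concrete; the class and method names are hypothetical, and only the default of 20 for _VectorPrintLimit comes from Table 21.

```python
class OptionPrimitive:
    """Holds one VM option, e.g. the value behind _VectorPrintLimit[:]."""

    def __init__(self, default):
        self.value = default

    def get(self):
        """No-argument version: return the current value."""
        return self.value

    def set(self, new_value):
        """One-argument version: store new_value, return the previous value."""
        previous, self.value = self.value, new_value
        return previous

vector_print_limit = OptionPrimitive(20)   # default from Table 21
print(vector_print_limit.get())      # 20
print(vector_print_limit.set(50))    # 20, the previous value
print(vector_print_limit.get())      # 50
```

Returning the previous value from the setting version makes it easy to restore an option after changing it temporarily.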


_RemoveSlot:
Remove the designated slot from the receiver (the argument is a string, the name of the slot). See section 14.2 in the User Manual. _RemoveSlot: always returns to the prompt (even if it is invoked within a program). This will be changed in a later release.

_ParseObjectFileName:Line:Column:SilentPrematureEndOfInput:
Parse the receiver, a string, as a Self object. The file name, line and column arguments are used to annotate parsed methods with source code information. The last argument is a boolean forcing the parser to be silent if the end of the string is encountered before an object is parsed. Returns the object created by the parser.

_ParseObjectIntoPositionTable:
Receiver is a string, source code for a method normally obtained using _MirrorCodes. Returns a vector containing two elements for each byte code associated with the method. For each byte code these two elements describe the corresponding text selection in the source code. The first element is the selection start in the source, the second is the length of the selection.

_NumObjectIDs[:]
Controls the number of object references remembered by the system. Default: 1000.

_ObjectID
Returns the reference number of the receiver, assigning one if necessary. _ObjectID is the reverse of _AsObject.

_AsObject
Returns the receiver (an integer) converted to an object. The integer receiver is an object reference number displayed by _Print or a stack trace (see section 10.3.1 in the User Manual).

_AddressAsObject
Returns the receiver (an integer denoting an address) converted into an object. This primitive is used for low-level debugging (addresses of objects change upon scavenges and garbage collections).

_StackPrintLimit[:]
Controls the number of stack frames printed by _PrintProcessStack. Default: 10.
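The vector returned by _ParseObjectIntoPositionTable: holds two elements per byte code, a selection start and a selection length. The Python sketch below shows how such a table could be turned back into source selections. It assumes that the start is a 0-based character offset, which the table above does not state, and the example table itself is invented for illustration.

```python
def selections(source, position_table):
    """Yield the source text selected for each byte code."""
    for k in range(0, len(position_table), 2):
        start, length = position_table[k], position_table[k + 1]
        yield source[start:start + length]

source = "i: i + 1"
table = [3, 1, 7, 1, 3, 5]   # hypothetical table: (start, length) per byte code
print(list(selections(source, table)))   # ['i', '1', 'i + 1']
```

A debugger could use the same pairs to highlight the expression that belongs to the current byte code position.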

Table 22 System primitives

Name
Description

_FirstCompiler[:]
Specifies which compiler to use to create the initial compiled version of a method. The argument is a string, currently either 'nic' (the default) or 'new'.

_Recompiler[:]
Specifies which compiler to use to recompile methods. Currently, valid arguments are 'nic', 'new' (the default), and 'none'.

_MaxCompilePause[:]
Takes an integer argument, the maximum number of milliseconds the compiler may use to compile one method. Default is 1000 (1 second).

_MaxInvocationsBeforeRecompile[:]
If _Recompiler != 'none' and _Recompiler != _FirstCompiler, the generated code will contain a check to attempt a recompilation after _MaxInvocationsBeforeRecompile invocations. Default: 10,000. The current recompilation scheme is very simple and far from perfect. As a result, programs generated using recompilation may be significantly slower than if compiled with the new compiler from the start. To measure the top speed of the current compilers, use _FirstCompiler: 'new' and _Recompiler: 'none'.

Formerly _HistoryNumber. All system primitives other than option primitives (indicated in this table by [:] after their names) return (but otherwise ignore) their receivers. Option primitives return their previous value.


_Scavenge
Force a scavenge (a fast, partial garbage collection; see [Ung84], [Ung86], [Lee88]).

_Tenure
Tenure all objects. Force the memory management system to move all objects into the old heap. Receiver is ignored.

_PrintScavenge[:]
Print a message at every scavenge. Default: false.

_GarbageCollect
Force a full garbage collection (this may take several seconds depending on your hardware and the size of old space).

_PrintGC[:]
Print a message at every full garbage collection. Default: false.

_Compact
Compact the compiled code cache.

_Flush
Flush all compiled methods from the compiled code cache.

_FlushInlineCache
Flush all inline caches. An inline cache caches the result of message lookups at the site of a message send. Inline caching speeds up subsequent executions of the particular send if the type of the receiver does not change. See [DS84] for details.

_InlineCache[:]
Disable/enable inline caching. Disabling inline caching can slow the system considerably (see _FlushInlineCache). Default: true.

_Inline[:]
Disable/enable inlining. Inlining is the process of inserting a copy of the callee in the caller's code during compilation in order to avoid message sends. See [CUL89] for details. Disabling inlining will slow the system by a few orders of magnitude. Default: true.

_Trace[:]
Turn on/off tracing of message sends. Only sends that actually result in a lookup (those sends that are not inlined or inline cached) are displayed. To see all sends, disable inlining and inline caching and flush all compiled methods and inline caches (see section 12.5 in the User Manual). Default: false.

_Spy[:]
Turn on/off the system monitor, which continuously displays information about the state of the SELF system (see Appendix G). Spying incurs little run-time overhead. Default: false.

_SpyHeight
Return the height (in pixels) of the screen area used by the system monitor.

_PrintMemoryHistogram:
Prints out two histograms. The first one is based on VM object types, the second on the word size of the objects. The argument is a number defining the upper limit size in the second histogram.

_Profile:
Receiver is a process. Activate/deactivate profiling of SELF code executed by the receiver. The argument is a boolean; true activates the profiler. Profiling incurs little run-time overhead. By default, profiling on a process is deactivated.

_ResetProfile
Reset the profile counters in order to start new measurements.

_PrintProfileCutoff:Skip:MaxDepth:
Print the profile. Subtrees of the call tree using a smaller fraction of the total time than cutoff (specified as a float, e.g. 0.02 = 2%) will be suppressed. Furthermore, a method will only be displayed if its time (including the time of its callees) differs by more than skip from its caller. MaxDepth (an integer) specifies the maximum call depth to be displayed. As in the flat profile, an inlined method is charged to its caller (i.e. inlining is not transparent).

_ResetFlatProfile
Reset the flat profile counters in order to start new measurements.

_PrintFlatProfile:
Print the flat profile; the argument is an integer specifying how many lines should be printed. The profile shows the time consumed by each compiled method (including the time consumed by any methods inlined into the compiled method, but not including any time spent in the VM).

_GenerateCountCode[:]
Generate/do not generate code to count the number of actual (real) method invocations performed. Generating count code incurs some run-time overhead when enabled. Changing this setting only affects methods compiled after the change (see _Flush, earlier in this table). Default: false.

_NumberOfMethodCalls
Returns the number of method calls that were not mere data accesses made in methods that are instrumented to measure this (see _GenerateCountCode:).

_NumberOfAccessMethodCalls
Returns the number of actual data access method calls made in methods that are instrumented to measure this (see _GenerateCountCode:).

_PrintMemory
Print low-level debugging information about the memory system.

_Verify
Verify the integrity of the SELF virtual machine.

Bracketed colons indicate option primitives; see the note following Table 21.
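The inline caching described under _FlushInlineCache and _InlineCache[:] can be illustrated with a single-entry cache attached to one send site. This Python sketch shows the general technique from [DS84] in miniature and makes no claim about how the SELF VM implements it; the class CallSite and its fields are hypothetical.

```python
class CallSite:
    """One message-send site with a single-entry (monomorphic) inline cache."""

    def __init__(self, selector):
        self.selector = selector
        self.cached_type = None
        self.cached_method = None
        self.lookups = 0            # number of full lookups performed

    def send(self, receiver, *args):
        rtype = type(receiver)
        if rtype is not self.cached_type:          # cache miss: do the full lookup
            self.lookups += 1
            self.cached_type = rtype
            self.cached_method = getattr(rtype, self.selector)
        return self.cached_method(receiver, *args)

    def flush(self):
        """Analogue of _FlushInlineCache for this one site."""
        self.cached_type = self.cached_method = None

site = CallSite("upper")
for word in ["a", "b", "c"]:
    site.send(word)
print(site.lookups)   # 1: the receiver type never changed
```

As long as the receiver type stays the same, every send after the first skips the lookup, which is also why tracing with _Trace[:] only shows sends that miss the cache.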

Table 23 Miscellaneous primitives

Name
Description

_IdentityHash
Return an integer hash value for the receiver. The hash for a particular object is constant, but it is not unique (several objects might have the same hash value).

_Restart
Restart the current method, i.e. jump to the beginning of the method. Used to implement looping in blocks.

_ErrorMessage
Return a short description (a canonical string) of the receiver, which should be an error string returned by a primitive failure. Especially useful for UNIX errors. Example: 'E2BIG' _ErrorMessage returns 'Arg list too long'.

_Manufacturer
Return the name (a canonical string) of the manufacturer of the host computer.

_Help
Print a list of useful primitives.

_PrintOptionPrimitives
Print a list of the available option primitives with a short explanation of their function and their current settings (see section 31).

_Quit
Leave the SELF system (equivalent to typing ^D (control-D) at the prompt). The state of the world is not saved.

_BitSize
The receiver is a string designating a type. The primitive returns an integer, the number of bits in the representation of that type. The following strings are valid receivers: self_int, self_float, char, short, int, long, float, double and void *. If sent to any other string the primitive fails.

_Credits
Prints out the credit message.

_ErrorMessage:
The receiver is a string, a prefix of a primitive error message as described in Table 10. Returns a string, a more descriptive version of the error message.

Unless otherwise noted, all of the miscellaneous primitives return (but otherwise ignore) their receivers.

Table 24 UNIX primitives

Name
Corresponding UNIX call

_TimeUser
Returns the user time (in milliseconds) used by the SELF system. Ignores receiver.

_TimeSystem
Returns the system time (in milliseconds) used by the SELF system. Ignores receiver.

_TimeCPU
Returns the total CPU time (in milliseconds) used by the SELF system (identical to _TimeUser + _TimeSystem). Ignores receiver.

_TimeReal
Real time (in milliseconds) since starting up SELF. Ignores receiver.

_DateTime:
Receiver is an integer, the number of days after Jan. 1, 1970, describing the date. The argument is an integer, the number of milliseconds, describing the time of the date. Converts the date and time integers into local time, returning a vector containing the date and time as 7 integers: year, month, day, weekday (0 = Sunday), hour, minute, and second.

_Syscall{With:And:}*
syscall(receiver, arg0, ...). The receiver is an integer which specifies the function that the indirect system call should perform. Each pair of arguments to the primitive specifies an argument passed to syscall. The argument conversion is described in section 32.7.1. The return value is a byte vector containing the result value.
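The conversion performed by _DateTime: (days after Jan. 1, 1970 plus milliseconds within the day, turned into seven integers) can be sketched in Python. Two assumptions are made here: the function name date_time is hypothetical, and the sketch works in UTC for reproducibility, whereas the primitive converts to local time.

```python
from datetime import datetime, timedelta

def date_time(days, millis):
    """Convert days since Jan. 1, 1970 plus milliseconds into 7 integers (UTC)."""
    t = datetime(1970, 1, 1) + timedelta(days=days, milliseconds=millis)
    weekday = (t.weekday() + 1) % 7      # Python: Monday = 0; the manual: Sunday = 0
    return [t.year, t.month, t.day, weekday, t.hour, t.minute, t.second]

print(date_time(0, 0))            # [1970, 1, 1, 4, 0, 0, 0]  (a Thursday)
print(date_time(3, 3_661_000))    # [1970, 1, 4, 0, 1, 1, 1]  (a Sunday)
```

The only non-obvious step is the weekday renumbering, since the manual counts from Sunday while Python's datetime counts from Monday.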

Table 25 Proxy and fctProxy related primitives

Name
Description

_Dlopen:ResultProxy:
This primitive corresponds to the SunOS call dlopen(). It dynamically links in a shared object file. The receiver is a byte vector, the path of the shared object; the first argument is an integer, and the last argument is a proxy object. Returns the result proxy object, representing the loaded shared object.

_Dlclose
This primitive corresponds to the SunOS call dlclose(). It unlinks a shared object. The receiver is a proxy. Always returns 0 if successful.

_Dlsym:ResultProxy:
This primitive corresponds to the SunOS call dlsym(). It looks up a symbol address in a shared object. The receiver is a proxy representing the shared object, the argument is a byte vector, the name of the symbol. Note that symbol names in object files are not always exactly the same as in source code. For example, the C convention is to prepend an _ to the name in the object file. The last argument is a proxy object. Returns the result proxy with the address of the symbol.

_FctLookup:ResultProxy:
This primitive is similar to _Dlsym:ResultProxy:. However, it looks up a function and returns a fctProxy rather than a proxy. Warning: no attempt is made to ensure that the given name really refers to a function.

_NoOfArgsFct:
The receiver is a proxy representing a shared object, the argument is the name of a function defined using glue code in the shared object. Returns the number of arguments this function takes. Note: the call will always return -1 for functions not defined using glue code.

_ForeignIsLive
The receiver is a proxy or fctProxy object. Returns true iff the receiver is live.

_ForeignKill
The receiver is a proxy or fctProxy object. Kills this object. Returns 0.

_ForeignIsNull
The receiver is a proxy or fctProxy object. Returns true iff the pointer it encapsulates is NULL.

_SamePointerAs:
The receiver and argument must both be proxy or fctProxy objects. Returns true iff they encapsulate the same pointer.

_ForeignHash
The receiver is a proxy or fctProxy object. Returns an integer hash value for the proxy. Note that this hash value depends on what the proxy refers to.

_TypeSealResultProxy:
Receiver is a proxy or fctProxy object. Returns a proxy representing the type seal of the receiver.

_NoOfArgs
Receiver is a fctProxy. Returns how many arguments should be supplied when calling this function. The value -1 designates a function that takes a variable number of arguments.

_NoOfArgs:
Receiver is a fctProxy. Sets how many arguments it should take when called. The value -1 will allow it to be called with any number of arguments. Warning: this call is potentially dangerous, since no attempt is made to check that the given value is reasonable. Returns receiver.

_Call
The receiver is a live fctProxy. Calls the foreign routine it represents and returns its return value converted to a SELF object.

_Call:{With:}*
The receiver is a live fctProxy. Passes the arguments to the foreign routine it represents, calls the routine, and returns its return value converted to a SELF object.

_CallAndConvert
Similar to _Call, but returns a byte vector that literally contains the bit pattern that the foreign routine returned.

_CallAndConvert{With:And:}*
Similar to _CallAndConvert, but passes arguments to the foreign routine. These arguments are determined by interpreting each pair of SELF level arguments using the any conversion described in section 32.7.1.

For the {...}* notation, see the note following Table 19.

111

References

[CU89] Craig Chambers and David Ungar. Customization: Optimizing Compiler Technology for SELF, a Dynamically-Typed Object-Oriented Programming Language. In Proceedings of the SIGPLAN '89 Conference on Programming Language Design and Implementation, Portland, OR, June, 1989. Published as SIGPLAN Notices 24(7), July, 1989.

[CU90] Craig Chambers and David Ungar. Iterative Type Analysis and Extended Message Splitting: Optimizing Dynamically-Typed Object-Oriented Programs. In Proceedings of the SIGPLAN '90 Conference on Programming Language Design and Implementation, White Plains, NY, June, 1990. Published as SIGPLAN Notices 25(6), June, 1990. Also published in Lisp and Symbolic Computation 4(3), June, 1991.

[CU91] Craig Chambers and David Ungar. Making Pure Object-Oriented Languages Practical. In OOPSLA '91 Conference Proceedings, Phoenix, AZ, October, 1991. Published as SIGPLAN Notices 26(11), November, 1991.

[CUC91] Craig Chambers, David Ungar, Bay-Wei Chang, and Urs Hölzle. Parents are Shared Parts of Objects: Inheritance and Encapsulation in SELF. In Lisp and Symbolic Computation 4(3), June, 1991.

[CUL89] Craig Chambers, David Ungar, and Elgin Lee. An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes. In OOPSLA '89 Conference Proceedings, New Orleans, LA, October, 1989. Published as SIGPLAN Notices 24(10), October, 1989. Also published in Lisp and Symbolic Computation 4(3), June, 1991.

[Cha92] Craig Chambers. The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages. Ph.D. dissertation, Computer Science Department, Stanford University, March 1992.

[DS84] L. Peter Deutsch and Allan M. Schiffman. Efficient Implementation of the Smalltalk-80 System. In Proceedings of the 11th Annual ACM Symposium on the Principles of Programming Languages, Salt Lake City, UT, 1984.

[GR83] Adele Goldberg and David Robson. Smalltalk-80: The Language and Its Implementation. Addison-Wesley, Reading, MA, 1983.

[HCU91] Urs Hölzle, Craig Chambers, and David Ungar. Optimizing Dynamically-Typed Object-Oriented Programming Languages with Polymorphic Inline Caches. In ECOOP '91 Conference Proceedings, Geneva, Switzerland, July, 1991. Published as Springer Verlag LNCS 512, 1991.

[HCU92] Urs Hölzle, Craig Chambers, and David Ungar. Debugging Optimized Code with Dynamic Deoptimization. In Proceedings of the ACM SIGPLAN '92 Conference on Programming Language Design and Implementation, San Francisco, June 1992. Published as SIGPLAN Notices 27(7), July, 1992.

[Lee88] Elgin Lee. Object Storage and Inheritance for SELF. Engineer's thesis, Stanford University, 1988.

[Ung84] David Ungar. Generation Scavenging: A Non-Disruptive High Performance Storage Reclamation Algorithm. In Proceedings of the ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, Pittsburgh, PA, April, 1984. Published as SIGPLAN Notices 19(5), May, 1984 and Software Engineering Notes 9(3), May, 1984.

[Ung86] David Ungar. The Design and Evaluation of a High Performance Smalltalk System. MIT Press, Cambridge, MA, 1987.

[UCC91] David Ungar, Craig Chambers, Bay-Wei Chang, and Urs Hölzle. Organizing Programs without Classes. In Lisp and Symbolic Computation 4(3), June, 1991.

[US87] David Ungar and Randall B. Smith. SELF: The Power of Simplicity. In OOPSLA '87 Conference Proceedings, Orlando, FL, 1987. Published as SIGPLAN Notices 22(12), December, 1987. Also published in Lisp and Symbolic Computation 4(3), June, 1991.

SELF Manual

Index

A
activation object 27 alignmentError 97 ambiguousSelector:Type:Delegatee: MethodHolder:Arguments: 69 ancestor 17 anonymous parent slot 7 argument conversion 83 argument conversions 84 argumentCountError 98 assignment primitive 6 assignmentMirror 72 associativity of binary messages 14 of keyword messages 15 of unary messages 14

C_get_var 79 C_set_comp 80 C_set_var 79

D
data object 5 deadProxyError 74, 98 DEBUGSIZE 93 DEPSIZE 93 directed resend 16 divisionByZeroError 97 doIt 71 dynamic inheritance 28

E
EDENSIZE 93 encrypt.c 76 errno 79 errors runtime errors 69 evaluation of arguments 6 of blocks 8 of message sends 6

B
badIndexError 97 badSignError 97 badSizeError 98 badTypeError 97 badTypeSealError 97 binary message see message block 5, 8, 27, 72 block data object 7 block method 27 non-lifo block 27 non-local return 27 blockActivationMirror 72 blockMethodMirror 72 blockMirror 72 bytecodes 2 byteVector 71, 72 byteVectorMirror 72

F
failure (glue) 79 false 71 fctProxy 48, 71, 74 floating-point numbers, parent of 72 floating-point numbers, range of 100 floatMirror 72 foreign routines 74 foreignCode 48 foreignCodeDB 49 foreignFct 48 function proxy object 74

C
C glue 77 canonicalStringMirror 72 CC_delete 82 CC_new_N 81 character escapes 25 character set 23 cloning 28 closure 27 code 5, 6 CODESIZE 93 comments 26 compilation 2 configuration CODESIZE 93 DEBUGSIZE 93 DEPSIZE 93 EDENSIZE 93 OLDSIZE 93 PICSIZE 93 SELFDIR 93 SURVSIZE 93 conversion pair 83 customization 3 C++ glue 81 C_func_N 78 C_get_comp 80

G
glue 74 glue code 75 glueDefs.c.incl 75

I
identifier 23 illegalPrivacyError 98 implicit receiver see message inheritance 28 dynamic inheritance 28 inline cache 108 inlining 108 integer 72 integers, range of 100

K
keyword see message

L
ld.so 76 Link 74 liveProxyError 98 lobby 70, 71


lonelyAssignmentSlotError 98 lookup 28 lookup algorithm 18

M
memory requirements 93 message 27 binary message 14, 27 implicit-receiver message 15, 27 keyword message 11, 14, 27 message lookup 28 semantics 17 system-triggered message 68 unary message 13, 27 method 6 block method see block outer method 27 method activation object 6 method holder 27 sending method holder 27 mirrorMirror 72 mirrors 72, 102 mismatchedArgumentCountSelector:Type:Delegatee: MethodHolder:Arguments: 69 missingDelegateeSelector:Type:Delegatee: MethodHolder:Arguments: 69

N
nil 28, 71 noActivationError 98 noDynamicLinkerError 98 non-decimal number 24 non-lifo block 8 non-local return 8, 27 non-local return operator 5 noParentSlot 98 noProcessError 98 noPublicSelector:Type:Delegatee: MethodHolder:Arguments: 69 noReceiverError 98 noSenderSlot 98 nullCharError 98 nullPointerError 98

performTypeErrorSelector:Type:Delegatee: MethodHolder:Arguments: 69 PICSIZE 93 precedence of message sends 14–15 prematureEndOfInputError 98 primitive 23 primitive failure codes 97 primitive failures 97 primitive send 17, 27 primitiveFailedError 97 primitiveNotDefinedError 97 primitives 97 arithmetic 100 cloning 98 comparison 101 mirror primitives 102 miscellaneous 109 programming primitives 106 Proxy and fctProxy related primitives 110 string-related 101 system primitives 107 Unix primitives 109 primitive:FailedWith: 97 printIt 71 privacy see slot privacy processes 105 processMirror 72 prototype 28 prototypes 2 proxy 48, 71, 74

Q
quitting SELF 109

R
read/write variable 10 reflectTypeError 98 resend 16, 23, 28 result conversion 86 root context 10, 28

S
selector 27 self 6, 11, 23 SELF world 2 SELFDIR 93 sending method holder 27 shell 71 slot 5, 27 anonymous parent 7 argument slot 6, 11, 24, 28 assignable data slot 10 assignment slot 6, 10 data slot 9, 27 initialization 9–12 parent slot 12, 28 privacy 9, 19 read-only slot 9 read/write slot 10 self slot 6, 11 slot privacy 28 slotNameError 98 slotsMirror 72

O
object 5, 27 data object 5, 27 method object 6 object literals 5 construction of 8 objVector 71 objVectorMirror 72 OLDSIZE 93 operator 24 outerActivationMirror 72 outerMethodMirror 72 outOfMemoryError 98 overflowError 97

P
parallelTWAINSError 98 parent slot 12, 28 parentPriorityError 98


smiMirror 72 snapshot 74 snapshotAction 71 stackOverflowError 98 Static linking 76 strings 72 canonical strings 72 struct 80 SURVSIZE 93 system monitor (spy) 94 systemObjects 71 system-triggered messages 68

T
traits 2 traits object 7, 28 true 71 type seal 74

U
unary message see message unassignableSlotError 98 undefinedSelector:Type:Delegatee: MethodHolder:Arguments: 69 Unix error codes 98 unix_failure (glue) 79

V
variable see slot Virtual Machine see VM VM 2

W
WHAT_GLUE 77 wrapper 75 wrongNoOfArgsError 98

Z
^ operator see non-local return operator ^ (privacy specification) 9 ^ _(privacy specification) 9 _ (privacy specification) 9 _AbortProcess 105 _ActivationAt: 105 _AddressAsObject 107 _AddSlotsIfAbsent: 106 _AddSlots: 106 _AsObject 107 _At: 99 _At:Put: 99 _BitSize 109 _BlockSignals 106 _ByteAt: 99 _ByteAt:Put: 99 _ByteSize: 99 _Call 74, 111 _CallAndConvert 74, 111 _CallAndConvertWith:And: 111 _Call:With: 111 _CBreak: 109 _CFloatDouble:At: 99 _CFloatDouble:At:Put: 99

_Clone 98 _CloneBytes:Filler: 99 _Clone:Filler: 99 _Close 109 _Compact 108 _Credits 109 _CSignedIntSize:At: 99 _CSignedIntSize:At:Put: 99 _CUnsignedIntSize:At: 100 _CUnsignedIntSize:At:Put: 100 _DateTime: 110 _Define: 106 _Dlclose 110 _Dlopen:ResultProxy: 110 _Dlsym:ResultProxy: 110 _EnumerateAllLimit: 103 _EnumerateVectorImplementorsLimit: 103 _EnumerateVectorReferencesLimit: 103 _Eq: 101 _ErrorMessage 98, 109 _ErrorMessage: 109 _FctLookup:ResultProxy: 110 _FirstCompiler 73, 107 _FloatAdd: 100 _FloatAsInt 101 _FloatCeil 101 _FloatDiv: 101 _FloatEQ: 101 _FloatFloor 101 _FloatGE: 101 _FloatGT: 101 _FloatLE: 101 _FloatLT: 101 _FloatMod: 101 _FloatMul: 100 _FloatNE: 101 _FloatPrintString 102 _FloatPrintStringPrecision: 102 _FloatRound 101 _FloatSub: 100 _FloatTruncate 101 _Flush 108 _FlushInlineCache 108 _ForeignEq: 110 _ForeignHash 110 _ForeignIsLive 110 _ForeignIsNull 110 _ForeignKill 110 _GarbageCollect 108 _GenerateCountCode 109 _glueDefs.c.incl 75 _Help 109 _HistoryIndex 73 _HistoryNumber 73, 107 _HostID 109 _IdentityHash 109 _Inline 108 _InlineCache 108 _IntAdd: 100 _IntAnd: 100 _IntArithmeticShiftLeft: 100 _IntArithmeticShiftRight: 100 _IntAsFloat 100

_IntComplement 100 _IntDiv: 100 _IntEQ: 101 _IntGE: 101 _IntGT: 101 _IntLE: 101 _IntLogicalShiftLeft: 100 _IntLogicalShiftRight: 100 _IntLT: 101 _IntMod: 100 _IntMul: 100 _IntNE: 101 _IntOr: 100 _IntSub: 100 _IntXor: 100 _Kill 74 _Manufacturer 109 _MaxCompilePause 107 _MaxInvocationsBeforeRecompile 107 _MethodPointer 103 _Mirror 102 _MirrorAddSlots: 103 _MirrorAtName:Put:Visibility:ParentGroup: 103 _MirrorByteCodePosition 104 _MirrorCode 103 _MirrorCodes 103 _MirrorContentsAt: 102 _MirrorDefine: 103 _MirrorEvaluate: 104 _MirrorExpressionStack 104 _MirrorFile 104 _MirrorImplementorsLimit: 103 _MirrorIsArgumentAt: 102 _MirrorIsAssignableAt: 102 _MirrorIsParentAt: 102 _MirrorLexicalParent 104 _MirrorLine 104 _MirrorLiterals 103 _MirrorMethodHolder 104 _MirrorNameAt: 102 _MirrorParentGroupAt: 102 _MirrorReceiver 104 _MirrorReferencesLimit: 103 _MirrorReflectee 102 _MirrorReflecteeEq: 102 _MirrorReflecteeIdentityHash 102 _MirrorRemoveAt: 103 _MirrorSelector 104 _MirrorSender 104 _MirrorSize 102 _MirrorSource 103 _MirrorVectorImplementorsLimit: 103 _MirrorVisibilityAt:IfPrivate:IfPublic:IfUndeclared: 102 _NewProcessSize:Selector:Arguments: 105 _NoOfArgs 111 _NoOfArgsFct: 110 _NoOfArgs: 111 _NumberCharsInFile 109 _NumberOfAccessMethodCalls 109 _NumberOfMethodCalls 109 _NumObjectIDs 73, 107 _ObjectID 107 _OpenFileFlags:Mode: 109 _ParseObjectFileName:Line:Column:SilentPrematureEndOfInput: 107 _ParseObjectIntoPositionTable: 107 _Perform 69 _PerformResend:With: 104 _Perform:DelegatingTo:With: 104 _Perform:With: 104 _Print 106 _PrintFlatProfile: 108 _PrintGC 73, 108 _PrintMemory 109 _PrintMemoryHistogram: 108 _PrintOptionPrimitives 73, 109 _PrintPeriod 73, 106 _PrintProcessStack 105 _PrintProfileCutoff:Skip:MaxDepth: 108 _PrintScavenge 73, 108 _PrintScriptName 73, 106 _Profile: 108 _Quit 109 _ReadSnapshot 68, 106 _Recompiler 73, 107 _RemoveSlot: 107 _ResetFlatProfile 108 _ResetProfile 108 _ResetSelfProfile 108 _Restart 109 _RunScript 106 _SamePointerAs: 110 _Scavenge 108 _SelectInto:Size: 109 _SetCPUTimer 106 _SetRealTimer 106 _Size 99 _SnapshotCode 73, 106 _SourceDir 73 _SourceDir: 106 _Spy 73, 108 _SpyHeight 108 _Spy: 73 _StackDepth 105 _StackPrintLimit 73, 107 _StringCanonicalize 101 _StringPrint 102 _SyscallWith:And: 110 _System 109 _Tenure 108 _ThisProcess 105 _TimeCPU 110 _TimeReal 110 _TimeSystem 110 _TimeUser 109 _Trace 108 _TWAINSResultSize 106 _TWAINS:ResultVector:SingleStep:StopAt: 105 _TypeSealResultProxy: 111 _VectorPrintLimit 73, 106 _Verify 109 _WriteSnapshot 68, 106 _Yield: 106 _^ (privacy specification) 9
