We've had so many suggestions for Arc features that I thought
I'd better collect them all. So far I have barely skimmed most
of these, but there are clearly some interesting ideas here,
so I thought I should put them online so that everyone can see
them. --pg
*** Dorai Sitaram: Recursion
fn-named could be renamed rfn or fnr, r for
recursive. This gets rid of your one hyphenated name.
You can then have, a la Scheme's named let, the
form rlet. In the following, == notates "is syntax
for".
(rlet NAME VAR INIT BODY ...)
== (let NAME (rfn NAME (VAR) BODY ...)
     (NAME INIT))
Of course you need rwith too.
(rwith NAME (VAR0 INIT0 ...) BODY ...)
== (let NAME (rfn NAME (VAR0 ...) BODY ...)
(NAME INIT0 ...))
Of course you want tail-call elimination, and since you
can use tail recursion to iterate, you can remove your
four looping operators. I think this mayhem is OK,
because I have never thought of tail-recursive
loops as a poor human's substitute for "real" loops.
If you really must have the latter, you have macros
anyway.
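A rough sketch of these forms as Scheme syntax-rules macros, to make the
equivalences above concrete. The names come from the proposal; note that
this rwith pairs each variable with its init, Scheme-style, rather than
interleaving them in one flat list, and that rlet/rwith call the rfn
directly rather than going through let.

  (define-syntax rfn
    (syntax-rules ()
      ((_ name (var ...) body ...)
       (letrec ((name (lambda (var ...) body ...))) name))))

  (define-syntax rlet
    (syntax-rules ()
      ((_ name var init body ...)
       ((rfn name (var) body ...) init))))

  (define-syntax rwith
    (syntax-rules ()
      ((_ name ((var init) ...) body ...)
       ((rfn name (var ...) body ...) init ...))))

  ;; (rwith loop ((i 0) (acc '()))
  ;;   (if (= i 5) (reverse acc) (loop (+ i 1) (cons i acc))))
  ;; => (0 1 2 3 4)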
*** Greg Sullivan: Keep, Sum, Syntaxes
Re the "keep" and "sum" operations, it seems like you could have
those take an optional function argument that would apply arbitrary
computation to the arguments, with the defaults being
(fn x . (add! x them))
and
(fn x (set! total (+ x total)))
or something. Seems like you're aiming for more usable versions of
map and fold.
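A rough Scheme sketch of the idea: sum takes an optional function that
decides what each element contributes, defaulting to the element itself
(keep could take a predicate the same way). The names are illustrative,
not Arc's.

  (define (sum xs . maybe-f)
    (let ((f (if (null? maybe-f) (lambda (x) x) (car maybe-f))))
      (let loop ((xs xs) (total 0))
        (if (null? xs)
            total
            (loop (cdr xs) (+ total (f (car xs))))))))

  ;; (sum '(1 2 3))                      => 6
  ;; (sum '(1 2 3) (lambda (x) (* x x))) => 14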
More off the deep end is: what about optional front-end syntaxes?
Much as we all know that syntax really shouldn't matter, it seems to.
We all know that S-expressions are just a syntactic and direct
representation of the syntax tree -- could we have parsers from infix
syntax to S-expressions? I suppose the hard issue -- maybe the only
hard issue -- is macros. But I think we could solve it.
*** Alan Bawden: First-class Macros
During your presentation at LL1 last Saturday you mentioned that you were
considering macros that were first-class values. Many people in the
audience seemed to think that this was clearly an absurd idea -- I think
you even wavered about this yourself at one point. Well, don't give up on
this idea until you have read my paper about how first-class macros can be
a sensible notion. See .
The executive summary is that you can have first-class macros and still
do all macro-expansion at compile time, at the cost of some (quite minimal)
"type" declarations. Unfortunately for you and your current project, I
wrote that paper with the statically typed programming language audience in
mind (ML, Java or even C). I do mention in passing that the idea would
also work for a dynamic programming language (Scheme or Dylan), but I don't
explain how in detail -- so you'll have to extrapolate a little.
I think the idea is quite cool, and, as I show in the paper, it has some
very interesting applications. (For example, it's almost all you need to
construct a complete module system.)
*** Ken Anderson: (get)?
Some questions:
Q1: How do i write a function that takes two arguments that i want to get
the x and y fields of?
(def area ((get x y) (get x y)) ???)
They could be labeled by argument position so i could say:
(def area ((get x y) (get x y))
(abs (* (- x2 x1) (- y2 y1))))
Q2: If there is a get argument, (get x y), how can i access the underlying
object itself?
I think it is interesting to try to provide destructuring information in the
function definition. However, Haskell does this, for example, by allowing
multiple definitions of the function.
While i like some kind of pattern matching in function definition, i'm not
sure it should be part of the core language.
For the core i'd like to just see what Scheme provides, ie (x y . rest).
I'd even live with a fixed number of arguments if there is an easy way to define
macros.
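A sketch (in Scheme) of getting the effect of Q1 on top of a plain
fixed-argument core, assuming points are association lists; field, area,
and the alist representation are all made up for the example. It also
sidesteps Q2, since the underlying objects are still bound to p1 and p2.

  (define (field obj name) (cdr (assq name obj)))

  (define (area p1 p2)
    (let ((x1 (field p1 'x)) (y1 (field p1 'y))
          (x2 (field p2 'x)) (y2 (field p2 'y)))
      (abs (* (- x2 x1) (- y2 y1)))))

  ;; (area '((x . 0) (y . 0)) '((x . 3) (y . 4)))  => 12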
*** Ken Anderson: Common Lisp arglist data
I put together these statistics from a common lisp application
27,935 Functions and macros
6,328 Functions or macros with keyword arguments
Distribution of number of arguments (no keyword case)
0=1254
1=12961
2=4063
3=1885
4=628
5=302
6=173
7=95
8=69
9=69
10=34
11=18
12=27
13>= 29
Distribution of keyword usage:
(&KEY)=2048
(&OPTIONAL)=1663
(&REST)=1176
(&BODY)=988
(&KEY &REST)=217
(&ALLOW-OTHER-KEYS &KEY)=60
(&REST &OPTIONAL)=51
(&KEY &OPTIONAL)=48
(&ALLOW-OTHER-KEYS &KEY &REST)=43
(&KEY &REST &OPTIONAL)=9
(&BODY &OPTIONAL)=6
(&REST &WHOLE)=6
(&BODY &WHOLE)=4
(&AUX)=4
(&ALLOW-OTHER-KEYS &REST &OPTIONAL)=2
(&OPTIONAL &WHOLE)=1
(&AUX &KEY)=1
(&WHOLE)=1
From this i'd conclude that most functions have 3 or fewer arguments and
that keywords are used less than about 1/3 of the time.
&key seems to be used as much as &optional and &rest combined.
*** David Morse: strings implemented as lists
In one of your arc articles you said you thought it "might be useful"
to have strings be implemented as lists.
The benefits I can think of would be:
* (subseq /str/ :start /n/) doesn't cons
* can share substructure among strings
* easy insert onto front of strings
The second and third don't seem like they'd be all that handy.
The first is attractive. Because its "arrays" are untagged, C strings
already have this property. It's often nice there. On the other hand,
using CL's :start and :end parameter passing pattern doesn't fill me
with dread either.
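In Scheme terms, the first benefit looks like this: with strings as lists,
taking a "substring" starting at index n is just list-tail, which allocates
nothing and shares structure with the original.

  (define s (string->list "hello world"))
  (define rest (list-tail s 6))     ; the "world" part; no copying
  (list->string rest)               ; => "world"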
It seems like a tradeoff between tripling the number of parameters to
string functions versus approximately octupling the heap usage of
strings.
*** Todd Proebsting: Pronouns
Your comments on pronouns and iteration captures remind me of some
ideas that Ben Zorn and I threw around a few years back; you may find two TRs
interesting. One describes "shorthands" (pronoun-like things) and the other
describes "histories" (capture-like things):
ftp://ftp.research.microsoft.com/pub/tr/tr-2000-03.ps
ftp://ftp.research.microsoft.com/pub/tr/tr-2000-54.ps
*** Stephen Ma: Strings as Lists
Strings as lists: seems you just want to scan the string from left
to right, which is what the car and cdr functions let you do.
Instead of that, how about storing strings as vectors of bytes,
and using "scanners":
(a) A scanner is a mutable object that contains a string and a
current position within that string. String-matching functions
update the current position. Make a new scanner with
"(sc <string> <position>)".
(b) A scanner is completely distinct from its string:
a string can have any number of scanners attached to it, and
a scanner can be reattached at any time to another string.
(c) Scanners will be especially useful with the "match" macro,
which works a lot like Unix's Lex utility:
(match <scanner>
  <regexp> <consequent>
  ...
  else <alternative>)
This attempts to match one or more regular expressions against
<scanner>'s current position. If one of the <regexp>s matches,
the corresponding <consequent> gets executed. A successful match
advances the scanner's current position past the matched
string.
In each <consequent>, the pronouns $1, $2, ..., et cetera refer to
matched substrings, just like Perl. $& refers to the entire
matched string.
In keeping with Arc's "make it easy" philosophy, if <scanner>
is not a scanner but a string S, then S is implicitly replaced
by (sc S 0). So it's easy to do a quick match against a
string.
As I mentioned, a single invocation of "match" works much like
an instance of Lex, and like Lex can be precompiled to run
blazingly fast. Using more than one "match" lets you scan
strings in very powerful, context-dependent ways.
(d) Contrived example. Here's a function on a string s that
prints the initials of all the words inside s,
and returns the sum of all the integers inside s.
(Assume Perl-style regexps.)
(fn scan (s)
(= scanner (sc s 0))
(while (match scanner
"\s+" t ;Ignore whitespace
"(\w)\w*" (do (pr $1) t) ;Print the word's initial.
"\d+" (do (sum $&) t) ;Accumulate numeric sum.
"$" nil))) ;End of string terminates loop.
Thus (scan "alpha 2 7 beta 8") prints "ab" and returns 17.
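A bare-bones sketch of the scanner part in Scheme, with literal prefixes
standing in for regular expressions (a real match would compile regexps,
as described above). sc makes a scanner over a string and a position;
try-match advances the scanner past the prefix on success. All the names
are invented.

  (define (sc str pos) (vector str pos))

  (define (try-match s prefix)
    (let* ((str (vector-ref s 0))
           (pos (vector-ref s 1))
           (end (+ pos (string-length prefix))))
      (and (<= end (string-length str))
           (string=? (substring str pos end) prefix)
           (begin (vector-set! s 1 end) #t))))

  ;; (define s (sc "alpha 2 7" 0))
  ;; (try-match s "alpha")   => #t, and s now points just past "alpha"
  ;; (try-match s "beta")    => #f, position unchanged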
*** Ray Blaak: Implicit Local Vars
I believe automatic variable introduction will hinder the programmer
more than help. "Expert" programmers type fast, and slightly mistyped
variable names would be hard to find visually, and there would be no
language support to detect them. This will make debugging much more difficult.
Instead, make the syntax for binding variables be succinct and clear (e.g.
as you have done with let). Perhaps allow a (def var value) form. The point is
that declaring bindings should be inexpensive to the programmer while still
allowing undeclared name detection.
I want my languages to be powerful and flexible, and easy to type, but I
also want decent error checking from my language environment. If there
is no other error detection in a language environment, *the* thing that I have
found important in almost 20 years of programming, more than type checking,
parameter matching, etc., is the reporting of unknown names.
My advice and plea: don't create local variables implicitly.
*** Ray Blaak: Use of *fail*
A cool hacker's language always needs to balance expressivity, ease-of-use,
and safety.
With regards to the use of *fail* for db lookups, I believe there is a
better way.
The problem with the use of a global variable to indicate the special
failure value is that it is not thread safe (Arc will be supporting threads
eventually, no?), and can lead to insidious bugs if some code fails to
restore the global properly. The code required to make it thread safe is
tedious and requires the programmer to be aware of potential problems on
every lookup.
Furthermore, if a non-default fail value is needed, the programmer must do
the *fail* override on every call, which is also tedious.
If instead one realizes that the fail value is a property of the db itself,
one can instead specify it when the db is created, perhaps with a :failure
keyword, that if omitted defaults to nil. Then different db's can have their
own notion of fail values simultaneously without interfering with each
other.
Usage is then easier (all lookups require the same, uniform work), safer,
and expressivity is actually improved since things can compose better.
Alternatively, one could also specify the fail value as an optional
parameter on a lookup call.
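A sketch in Scheme of the suggestion: the table carries its own fail value,
chosen when it is created (defaulting to #f here), so lookups never consult
a global. make-db, db-set! and db-ref are invented names, and the store is
a simple association list to keep the example portable.

  (define (make-db . maybe-fail)
    (vector '() (if (null? maybe-fail) #f (car maybe-fail))))

  (define (db-set! db key val)
    (vector-set! db 0 (cons (cons key val) (vector-ref db 0))))

  (define (db-ref db key)
    (let ((hit (assoc key (vector-ref db 0))))
      (if hit (cdr hit) (vector-ref db 1))))

  ;; (define scores (make-db 'missing))
  ;; (db-set! scores "pat" 7)
  ;; (db-ref scores "pat")   => 7
  ;; (db-ref scores "lou")   => missing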
Note also that the idea of keyword parameters is in general a better way to
"fiddle" with settings, giving the control that hackers love while avoiding
the problems that special global control variables give rise to.
*** Shiro Kawai: M17N
Is there any plan for Arc to support multilingualization
(M17N)? Traditionally, M17N is considered by language designers
as some sort of add-on, library-level stuff. Yet it is
indispensable for the real world applications that are used
worldwide, and I think it does affect the implementation
decision of the language core to some degree, if you want efficiency.
In fact, out of frustration with the existing implementations
I've even written my own Scheme that supports multibyte strings
natively to do my own job.
Why is it important? Because the choice of implementation
affects the programming style.
There may be some possibilities:
(a) A string is just a byte array. The application is responsible
for dealing with multibyte characters. (This was pretty much
the situation until recently.)
(b) A string is an array of wide characters, e.g. UCS-4.
(Note: UCS-2 is not enough).
The stream converts internal representation and external
representation. Java is going this way, and I think
Allegro CL also adopts this after version 6.0.
(c) A string may contain multibyte characters, e.g. UTF-8,
or more complex structure that uses, for example, a
struct of language tag and byte vector.
String access by index is no longer a constant operation,
and a destructive change of string may be much heavier
than just a destructive change of a vector.
I think Python is going this way, as well as some other
popular scripting languages.
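To make the indexing cost in (c) concrete, here is a small Scheme sketch
(assuming R7RS bytevectors) of finding the nth character in a UTF-8 byte
vector: you have to walk from the start, skipping continuation bytes, so
indexing is O(n) rather than O(1). The names are invented.

  (define (continuation-byte? b)
    (= (quotient b 64) 2))          ; byte looks like 10xxxxxx

  (define (utf8-offset bytes n)
    ;; byte offset of character n: we must scan past continuation bytes
    (let loop ((i 0) (chars 0))
      (if (= chars n)
          i
          (let skip ((j (+ i 1)))
            (if (and (< j (bytevector-length bytes))
                     (continuation-byte? (bytevector-u8-ref bytes j)))
                (skip (+ j 1))
                (loop j (+ chars 1)))))))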
The approach (b) is simple and just works, and may be enough
for Arc's purposes. However, in some cases it doesn't work well,
especially when (1) you have to deal with a large amount of text
data that mostly consists of single-byte characters and don't want
to consume four times as much memory as a byte string, and (2) you have
to do lots of text I/O, since most external data will likely
stay encoded in variable-length (multibyte) characters and
you have to convert them back and forth.
The approach (c) solves those problems, but it affects the
programming style. You tend to avoid random access into
strings. Searching a string performs better if it directly
returns a substring rather than indices.
Allocating a string by make-string and using string-set!
to fill it one character at a time is a popular technique in
Scheme for constructing a known-length string, but it behaves
extremely poorly in this implementation. I tend to use
string ports heavily, and then the penalty of non-constant
access diminishes. Another advantage of this approach is
that you can have a "double view" of a string, one as a
sequence of characters and the other as a sequence of bytes,
which is convenient for writing efficient code to deal with
packets that mix binary data and character data.
*** Vinay Gupta: Python, XML, SQL
I use Python quite frequently. Although I can't point to specific
features I think you should duplicate in Arc, I can think of specific
aesthetics which you should:
1> fixed code style with minimal punctuation
python's use of indentation (only) to mark blocks, and its move toward
banning tabs, really does help move code between programmers and
encourage both reuse and learning-by-inspection. All python code looks
basically the same.
2> easy language support for putting basic documentation in code files
(python does it very nicely, with tools for pulling it all together
later).
A nice extension to that would be to have support for XML in the code
comments, so you can yank the XML out of the comments and then format it at
whim without having to feed the text you pulled out of the code through
Word or some other manual step where things can get out of step.
On another note, I want to suggest that you seriously think about how
you build your interface to SQL into the system.
At this point, SQL is present on just about all servers in some form,
and Mac OS X is really beginning to push it down into the desktop.
But at present, access to SQL is rather a bolt-on in most languages, in
particular relative to file system access which is much more tightly
integrated.
I'd like to suggest that you consider integrating SQL support at much
the same level as file system support: as a conceptually integrated part
of the language design. I don't know quite where that might go, but I
suspect it to be a very interesting direction (particularly if the
abstraction which supports filesystems and SQL-based databases could be
extended to other, less common forms (XML databases being the obvious
one) by users).
*** Will Duquette: What CL Misses
What I'd like that I missed in Common Lisp.
1. Standard control structures. From what I can see, you've already
got this well under.... Well, under control, I guess.
2. Standard notation. As you comment, Unix won. It would be nice
to see standard notation for character constants (e.g., '\n' for
newline) and for format strings. "~A" isn't wrong, but C, C++,
Perl, Python, Tcl, awk, and so forth can all use C's sprintf format
strings.
These conventions might not be perfect, but they are familiar to a
vast number of programmers.
3. Better string processing. I particularly like Tcl's variable and
command interpolation. For example, here's an interactive Tcl
session that defines a variable and a function, and interpolates
both into a string.
% set a "Hello"
Hello
% proc f {} {return "World"}
% set b "$a, [f]!"
Hello, World!
You're contemplating treating strings as lists where each character
is an element. One of the real conveniences of Tcl is that almost
any string can be manipulated as a list of white-space delimited
tokens. The "split" command can split up a string into a list of
tokens on any set of delimiters. This is, practically speaking,
incredibly useful. Common Lisp falls down here in two ways:
* There's no equivalent of the "split" command. (I've seen the
discussions about "partition", currently defined on cliki;
the word "over-engineered" comes to mind.) Anyway, the point isn't
that this can't be done in CL; just that it should be provided.
(A sketch of the kind of split I mean appears at the end of this
section.)
* You can't easily use a CL list of symbols as a proxy for a
string of white-space delimited tokens; the list of symbols has
too many syntactic gotchas, and then there's the whole case thing.
Which leads me to 4:
4. Case-sensitivity. Symbol names, especially, should be case sensitive.
5. Associative arrays. Hash tables are incredibly useful, but the
syntax to use them should be terse. Some kind of array notation
works very well. For example,
(= myDB[x] y)
assigns y to key x in hash table myDB. Tcl works much like this.
You also get multi-dimensioned arrays quite naturally using your
proposed "x.y" notation: though I would suggest using "," and ";"
instead of "." and ":". Why?
x,y --> (x y) A cons
1,1 --> (1 1) A cons
1.1 --> 1.1 A floating point number
6. A more standard notation for record structures, or, more specifically,
for object members. C/C++/Java/Python syntax has become common for
this, and is more terse than either of the CL possibilities.
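Regarding the split point under item 3, here is a quick sketch in Scheme of
the kind of thing being asked for: split a string into tokens on a set of
delimiter characters. The name and exact behavior (runs of delimiters
collapse) are just one choice.

  (define (split str delims)
    (let loop ((chars (string->list str)) (tok '()) (toks '()))
      (cond ((null? chars)
             (reverse (if (null? tok)
                          toks
                          (cons (list->string (reverse tok)) toks))))
            ((memv (car chars) delims)
             (loop (cdr chars)
                   '()
                   (if (null? tok)
                       toks
                       (cons (list->string (reverse tok)) toks))))
            (else
             (loop (cdr chars) (cons (car chars) tok) toks)))))

  ;; (split "usr/local/bin" (list #\/))  => ("usr" "local" "bin")
  ;; (split "a b  c" (list #\space))     => ("a" "b" "c")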
*** Dan Milstein: Concurrency
One problem which, IMHO, no popular language has come even
close to solving well is allowing a programmer to write multithreaded
code. This is particularly important in server-side programming, one of
Arc's major targets. I've written a good deal of multithreaded Java, and
the threading model is deeply, deeply wrong. As a programmer, there's
almost no way to write the kind of abstractions which let you forget
about the details. You're always sitting there, trying to work through
complicated scenarios in your head, visualizing the run-time structure of
your program.
I didn't see another way until I read John H. Reppy's "Concurrent
Programming in ML". Instead of building his concurrency constructs around
monitored access to shared memory, he builds them around a message passing
model (both synchronous and asynchronous). What's more, he provides
powerful means of capturing a concurrent pattern in an abstraction which
hides the details.
I highly recommend giving that book a read. Here's an example of some of
what you get (not the abstraction, actually, just the basic power of
message-passing over shared memory). The abstraction facilities are
complex enough that, like Lisp macros, a small example doesn't really
capture their power. I'm in no way familiar with concurrent extensions to
Lisp, so I'm not able to provide the code for how much harder it would be
in CL or Scheme. Assuming they were augmented with a shared memory model
(as Java is), which forces the programmer to deal with synchronized access
to memory, I can only imagine it would be significantly more complex.
A producer/consumer buffer. You want a buffer with a finite number of
cells. If a producer tries to add an element to a full buffer, it should
block until a consumer removes an element. If a consumer tries to remove
an element from an empty buffer, it should block until a producer adds
something.
In Concurrent ML:
datatype 'a buffer = BUF of {
    insCh : 'a chan,
    remCh : 'a chan
  }

fun buffer () = let
      val insCh = channel() and remCh = channel()
      fun loop [] = loop [recv insCh]
        | loop buf = if (length buf > maxlen)
            then (send (remCh, hd buf); loop (tl buf))
            else (select remCh!(hd buf) => loop (tl buf)
                  or insCh?x => loop (buf @ [x]))
      in
        spawn loop;
        BUF{
          insCh = insCh,
          remCh = remCh
        }
      end

fun insert (BUF{insCh, ...}, v) = send (insCh, v)
fun remove (BUF{remCh, ...}) = recv remCh
---------------
Translated into a Lisp-ish syntax (very easy to do from ML), this would
look something like:
(defstruct buffer
ins-ch
rem-ch)
(defun create-buffer ()
(let ((ins-ch (make-channel))
(rem-ch (make-channel)))
(labels ((loop (buf)
(cond ((null buf)
(loop (list (recv ins-ch))))
((> (length buf) maxlen)
(send rem-ch (car buf))
(loop (cdr buf)))
(t
(select
(rem-ch ! (car buf) => (loop (cdr buf)))
(ins-ch ? x => (loop (append buf (list x)))))))))
(spawn #'loop)
(make-buffer :ins-ch ins-ch :rem-ch rem-ch))))
(defun insert (b v)
(send (buffer-ins-ch b) v))
(defun remove (b)
(recv (buffer-rem-ch b)))
Key things to notice:
1) Language features:
Communication between threads *only* occurs over channel objects, which can
be thought of as one-element queues. In CML, channels are typed, but in
Lisp they probably wouldn't be.
send/recv: synchronous (blocking) communication over the channel. A thread
can attempt to send an object over the channel, and will then block until
another thread does a recv on the channel.
Creating a new thread is done via 'spawn', which takes a function as its
argument. (I can't remember what the signature of that function is
supposed to be -- clearly, in this case it can't be a function of no
arguments, but imagine it to be something like that).
Selective communication: the call to 'select' is one of the very powerful
features. It is like a concurrent conditional -- it simultaneously blocks
on a list of send/recv calls, and executes the associated code with
whichever call returns first (and then drops the rest of the calls). The !
syntax means an attempt to send, the ? means an attempt to receive. In
both cases, the '=>' connects the associated code to execute. (I haven't
really come up with a Lispy translation of that syntax).
I don't think select could be efficiently implemented without language
support. It requires a sort of 'partial blocking', which is tricky to
implement on top of normal blocking.
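For what it's worth, here is roughly how the translation above could look
in Racket (a Scheme descendant), whose built-in threads, channels, and
sync/handle-evt play the roles of spawn, the channels, and select. This is
only a hedged sketch, not part of the proposal; buffer-insert and
buffer-remove are renamed to avoid clashing with Racket's own remove.

  (define (create-buffer maxlen)
    (define ins-ch (make-channel))
    (define rem-ch (make-channel))
    (define (loop buf)
      (cond ((null? buf)
             (loop (list (channel-get ins-ch))))
            ((> (length buf) maxlen)               ; same policy as the ML code
             (channel-put rem-ch (car buf))
             (loop (cdr buf)))
            (else
             (sync (handle-evt (channel-put-evt rem-ch (car buf))
                               (lambda (ignored) (loop (cdr buf))))
                   (handle-evt ins-ch
                               (lambda (x) (loop (append buf (list x)))))))))
    (thread (lambda () (loop '())))
    (cons ins-ch rem-ch))

  (define (buffer-insert b v) (channel-put (car b) v))
  (define (buffer-remove b)   (channel-get (cdr b)))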
2) The Idiom
The buffer is implemented as a separate thread which has connections to two
channels and an internal list to keep track of the elements of the buffer.
This thread runs through a loop forever, taking the current state of the
buffer as its argument, and waiting on the channels in the body of its
code. It is tail-recursive.
Note the absolute lack of any code to deal with synchronization or locking.
You might notice the inefficient list mechanism (that append is going to
get costly in terms of new cons cells), and think that this is only safe
code because of the inefficient functional programming style. In fact,
that's not true! The 'loop' function could destructively modify a list
(or array) to which only it had access. There would still be no potential
for sync'ing problems, since only that one thread has access to the
internal state of the buffer, and it automatically syncs on the sends and
receives. It's only handling one request at a time, automatically, so it
can do whatever it wants during that time. It could even be safely
rewritten as a do loop.
What I find so enormously powerful and cool about this is that the
programmer doesn't need to worry about the run-time behavior of the system
at all. At all. The lexical structure of the system captures the run-time
behavior -- if there is mutating code inside of the 'loop' function, you
don't have to look at every other function in the file to see if it is
maybe modifying that same structure. This is akin to the power of lexical
scoping over global scoping. I have never seen concurrent code which lets
me ignore so much.
This really just scratches the surface of Concurrent ML (and doesn't touch
on the higher-level means of abstraction). But I hope it gives a sense of
how worthwhile a language it is to learn from.
3) Issues
I think that channels themselves would be fairly easy to implement on top
of the usual operating system threading constructs (without needing a
thread for each one). However, the style which this message-passing model
promotes can easily lead to a *lot* of threads -- if you have a lot of
buffers, and each of them has its own thread, things can get out of hand
quickly. I believe that Mr. Reppy has explored these very issues in his
implementation of CML.
http://cm.bell-labs.com/cm/cs/who/jhr/sml/cml/
Insofar as I have time (which, realistically, I don't) I would love nothing
more than to play around with implementing CML'ish concurrency constructs
in a new version of Lisp like Arc.
*** Ben Evans: Adjustable Competence
One idea which I've been thinking about a lot lately is "use strict;"
in Perl.
It's an example of how to write a general purpose language which can
still speak to the top of the intelligence curve of users.
What I was thinking of is a language with a directive to change the
competency level of a piece of code.
For example, in an imaginary Perlish language called Gimbal:
#!/usr/bin/gimbal
use beginner;
{
variables = "by default have lexical scope in beginner mode";
global VAR = "I must be prefigured by keyword global, and my name must
all be in upper case in this mode, otherwise, compile-time errors occur";
print variables + "\n";
}
print VAR + "\n";
use competent;
global dog = "I will generate a compile-time warning because my name
is not all in capitals in competent mode";
use expert;
scope global;
cat = "I generate no warnings at all, and the keyword global is not
required because I changed the default scope with the scope keyword";
....
Having a competency level pragma means that real hackers can get on
with their real work, while still having something which students etc
can use.
If this was combined with a code signing system (e.g., in 'use beginner'
mode, you can't use a class which isn't signed by someone with whom you can
establish a trust relationship, according to the keyring for your
local language install) then it might look a lot more appealing to the
large, mediocre hordes of programmers out there.
*** James Bullard: sequences, and range parameters.
One thing I especially like about Python is the
primitives used to specify ranges. For instance,
>>> seq = [1,2,3,4,5]
>>> seq[1:2]
[2]
>>> seq[1:3]
[2,3]
I could see this being easily incorporated into your
'sequences are an implicit method on indices' by also
adding 'and ranges'. Syntax such as:
>> ("something" (1 . 3))
(o m)
is a possibility. Mostly, I just wanted to suggest a
feature which I use all the time in Python.
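A sketch in Scheme of indexing a sequence with a range in this spirit;
sublist is an invented name, and the range is half-open like Python's.

  (define (sublist xs start end)
    (let loop ((xs xs) (i 0) (acc '()))
      (cond ((or (null? xs) (>= i end)) (reverse acc))
            ((>= i start) (loop (cdr xs) (+ i 1) (cons (car xs) acc)))
            (else (loop (cdr xs) (+ i 1) acc)))))

  ;; (sublist (string->list "something") 1 3)  => (#\o #\m)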
*** Trevor Blackwell: Databases
Databases are an easy, low-risk way to store and index lots of data
without having to design a custom system for each kind of data.
When you store data in files, it's hard to get good
performance. Usually you either end up reading many files at each
click, or reading them all in at startup time and taking a lot of time
and memory.
Databases also ensure correct concurrent access, so you can have
multiple users working simultaneously on the same data without
conflicts.
*** Eric Tiedemann: Python Syntax
Python is one of my languages of choice at the moment because of its
terse, simple, and regular syntax. Those attributes could come in
handy for a macro-friendly syntax-for-Lisp. esr has a continuing
interest in this subject.
As far as I can figure, *all* statements in Python (aside from import
and print) are either expressions or assignments (which we could
`promote' to expressions) or have one of the two forms:
<keyword> ... : <suite>
where <suite> -> simple_stmt | NEWLINE INDENT stmt+ DEDENT
This even includes class definitions!
The major complexity I can see is that some macro expressions (e.g.,
if, try) would want to incorporate expressions *after* their containing
expression (e.g., else, except). This could be handled by having macros
(optionally) take and return both an expression and an `expression
continuation'.
*** Guyren Howe: OOP
In response to your request for new ideas for Arc, I'm going to mention an
OOP feature from the language REALbasic ().
It's a new type of class extension mechanism: as well as allowing for
overloading/method overriding, a class can define a New Event that allows it
to call its subclasses. An event declaration is just a function header.
The most immediate subclass to implement a handler for the event traps the
event and makes it disappear for further subclasses, unless the class that
defined the handler re-defines and calls the event with the same name.
The result is that rather than extension by uncontrolled, ad-hoc overriding,
you get extension through a well-defined protocol.
It's a nice mechanism.
Example: you have a NiceButton object. A TextNiceButton is defined that
writes text onto the face of the button.
Now you want to draw the NiceButton in a different way (with a gradient,
say).
If TextNiceButton was written with traditional overloading, it might be that
the drawing of the text happens at the wrong time, and if you try to draw
the gradient you might overwrite the text. But if TextNiceButton was
responding to a DrawText event issued by NiceButton, you can control when it
is drawing its text, so you can make sure you've drawn your gradient first.
When I describe this to people, they react by saying that it is strictly
weaker than overriding methods, and this is true. But for most purposes,
having a well-defined protocol is nice.
*** Sam Tregar: Perl Community
You mentioned the Perl library many times but rarely mentioned the Perl
community. It's quite obvious to me that Perl would not be where it is
today, with the library it has today, without the Perl community. Also,
from reading Larry Wall's various speeches and essays it's obvious to me
that the Perl community is no accident. Larry designed the Perl community,
and the result is the powerhouse that produced the Perl we know
today. Simply: the Perl library was not carefully designed; the Perl
community was carefully designed and the Perl community created the Perl
library. Perhaps this is a lesson Arc can follow?
*** Drew Csillag: Misc Suggestions
You should be able to call a function with keyword arguments without
having to define the function to be able to take them (Python vs. CL)
Reference counted GC is a great thing (but still needs some
supplemental GC to pick up the cyclic trash) because it's very
predictable, which is very useful if you've ever had to open up a bunch
of files in a tight loop before.
Multiple inheritance is a useful thing. Not often, but there are a
number of times it can save you a bunch of headaches. Especially for
mixin classes.
Also, builtin types should be largely indistinguishable from user
classes. You should be able to write a class in such a way that you
could do (+ userob1 userob2) and there would be a method in either one
or the other (maybe both) object's class to handle this.
Introspection into the system is a wonderful thing. Being able to
write a profiler and debugger for your language in your language is a
great thing to be able to do. It allows all sorts of cool tracing and
debugging (coverage analysis) things you'd ever want to do. Being
able to get a hold of the namespaces you are executing in is also a
huge boon I learned from Python. Being able to introspect into the
compiled forms of functions (assuming that they compile to some
bytecode object) is a great thing too.
Some of the iteration stuff seems a bit ooky to me though. The while
with the it binding is cool, but part of it doesn't sit well with me.
Perhaps because you wouldn't be able to get at it if you are in a
nested while loop or something. Also each with keeps is more of a
filtering function than straight iteration (like Python's filter()).
As for things like scheme-like continuations, for me it's hard to say
yes, but hard to say no, because they are useful for building exception
handling systems and microthreading systems. (If you don't include some
kind of exception handling system -- built with or without explicit
continuations -- which you should, runtime errors should raise
exceptions, not just halt the system.)
The interpreter should also be able to be multithreaded, but unlike
Java, it shouldn't force it on you. I'm not big on multithreading in
general, but sometimes it actually is the best way to do things.
Sather-like (I think it's Sather) generators are also very nice things
to have.
One last suggestion: the interpreter should be written in C. Language
systems (with the exception of C) that self host are a pain in the
neck.
*** Thomas Baruchel: Call/cc vs. Exceptions
Though call/cc is better than exceptions, I think it's much easier
to use exceptions than call/cc. Why? Because you can more easily give a name
to an exception (after having declared it) and catch it. Do you see what
I mean? What is very clean in OCaml is the structure:
try
...
...
with
Division_By_Zero -> ... ;
IncoherentParameter -> ...;
FunnyResult -> ...;
WhyDidYouTryThat -> ...
(i.e. you can put in a separate place everything that can be raised as an
exception and handle it separately).
I think that something as powerful as call/cc but as convenient to use
as try/with would be a great idea ;-)
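One rough sketch in Scheme of how a named-exception layer can sit on top of
call/cc: catching installs an escape point under a name, and raising looks
the name up and jumps to it. All the names here are invented, and a serious
version would use dynamic-wind.

  (define handlers '())                 ; alist of (name . escape-procedure)

  (define (raise-exn name)
    (let ((hit (assq name handlers)))
      (if hit
          ((cdr hit))
          (error "unhandled exception:" name))))

  (define (call-with-catch name handler thunk)
    (let ((saved handlers))
      (call-with-current-continuation
       (lambda (k)
         (set! handlers
               (cons (cons name
                           (lambda ()
                             (set! handlers saved)
                             (k (handler))))
                     handlers))
         (let ((result (thunk)))
           (set! handlers saved)
           result)))))

  ;; (define (safe-div n d)
  ;;   (call-with-catch 'division-by-zero
  ;;                    (lambda () 'infinity)
  ;;                    (lambda ()
  ;;                      (if (zero? d) (raise-exn 'division-by-zero) (/ n d)))))
  ;; (safe-div 6 3)  => 2
  ;; (safe-div 6 0)  => infinity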
Another limitation of Scheme is that it isn't intended to implement new
types (or at least, it isn't very easy to do it). I think that many
languages do it better (OCaml, or even C). That is a big lack:
I am now playing with quantum computation, and found it is very easy
to implement a 'qubit' type in C or in OCaml but not in Scheme (what I do
is use a pair of two complex numbers, and it isn't bad, but...)
On the same idea, some languages allow definition of functions with
several return values (not necessarily in a structure), by
declaring input and output values of the function. This is quite
useful for some purposes.
Prolog: why not implement some declarative aspects?
(at least the lisp 'amb' function, but implemented at a low level, which
would make it easier to use. amb-fail isn't such a good idea: Prolog
doesn't use anything like 'amb-fail'; whenever a condition isn't satisfied
it does the same thing by itself and the programmer doesn't have to care about it).
Another thing you can't do when using 'amb' after having implemented it by
yourself in Scheme is to nest several 'amb' declarations...
Think that some tools are more than tools and are almost languages.
A CLP(FD) solver has been implemented in GNU/Prolog and it isn't a
bad idea... But this has to be done by considering the general concepts
of the language. clp(fd) suits the declarative aspects of
Prolog well.
Forth-like languages: the concept of a stack is really impressive,
because you can play with a lot of data without using any variables.
MY CREDO: a good language doesn't have to use variables. I like
functional languages for that; I also like Forth (and PostScript)
for the same reasons.
*** David Sloo: Strings
The fact that strings behave as lists is convenient. It means there is
one less category to memorize.
However, one of the problems I keep running into in language interop is
string processing, even in Perl. Every time I move a string from a module
in language X into an application, I have to write a UTF8 processor.
Worse, half the time I need to use someone else's code within an
application in the same language, I have to check for Shift-JIS, or do
UTF8 processing, or some similar dopiness. A string in the language
should be treated as an array (in Arc, a list) of Unicode characters no
matter what. There are a few good things about C#, and this is one of
them. There are a few bad things about Perl, and this is also one of
them.
Also, I think you made a logical decision in scoping the iterators
locally in for loops, but it does mean you need extra stuff for this
common idiom, where you preserve the exit-state of the loop:
for (int i = 0; i < 10; ++i)
if (array[i] == UNSNOOGLE)
break;
printf("The array of snoogles had a un-snoogle at spot %d", i);
I suppose it's worth changing, given the downside.
*** Dave Moon: S-expressions are a bad idea
I want to comment on your use of S-expressions, based on what I learned
in my couple of years researching Lisp hygienic macro ideas and working
on Dylan macros. Summary: I learned that S-expressions are a bad idea.
There are three different things wrong with them, none of which have
anything to do with surface syntax. Having as a convenient abbreviation
a surface syntax different from the program representation that macros
work on is a fine thing to do.
[1] The (function . arguments) notation is not extensible. In other
words, there isn't any place to attach additional annotations if some
should be required. 20 years ago it was noticed that there is no place
to attach a pointer back to the original source code, for use by a
debugger. That's just one example. This is easily patched by changing
the notation to (annotation function . arguments), where annotation is a
property list, but this is a bit awkward for macro writers. The next
point suggests a better fix.
[2] Representing code as linked lists of conses and symbols does not lead
to the fastest compilation speed. More generally, why should the
language specification dictate the internal representation to be used by
the compiler? That's just crazy! When S-expressions were invented in
the 1950s the idea of separating interface from implementation was not
yet understood. The representation used by macros (and by anyone else
who wants to bypass the surface syntax) should be defined as just an
interface, and the implementation underlying it should be up to the
compiler. The interface includes constructing expressions, extracting
parts of expressions, and testing expressions against patterns. The
challenge is to keep the interface as simple as the interface of
S-expressions; I think that is doable, for example you could have
backquote that looks exactly as in Common Lisp, but returns an
<expression> rather than a <list>. Once the interface is separated from
the implementation, the interface and implementation both become
extensible, which solves the problem of adding annotations.
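A toy sketch of the point in [2], in Scheme: macro-level code sees
expressions only through an interface (make-call, call-head, call-args,
expr-note here), so the representation underneath can carry source
locations or change entirely without breaking that code. Every name in it
is invented, not part of Moon's proposal.

  (define (make-call head args . notes)
    (vector 'call head args (if (null? notes) '() (car notes))))
  (define (call? e)     (and (vector? e) (eq? (vector-ref e 0) 'call)))
  (define (call-head e) (vector-ref e 1))
  (define (call-args e) (vector-ref e 2))
  (define (expr-note e key)
    (let ((hit (assq key (vector-ref e 3))))
      (and hit (cdr hit))))

  ;; (define e (make-call '+ '(1 2) '((source . "foo.arc:12"))))
  ;; (call-head e)          => +
  ;; (expr-note e 'source)  => "foo.arc:12"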
So you don't get confused, I am not saying that Dylan macros provide a
good model of such an interface (in fact that part of Dylan macros was
never done), nor that Dylan macros necessarily have any ideas you should
copy. I'm just talking about what I learned from the Dylan macros
experience. Actually if you succeed in your goals it should be
straightforward for a user to add Dylan-like macros to Arc without
changing anything in Arc.
Maybe someday I can explain to you the representation of programs I used
in the Dylan compiler I wrote (never finished) after I left Apple. It
has some clever ideas so that it is efficient in terms of consing and the
same representation can be used in all levels of the compiler from the
surface syntax parser down to the machine code generator. I'd have to
retrieve the code from backup first.
[3] Symbols are not the right representation of identifiers. This
becomes obvious as soon as you try to design a good hygienic macro
system. This is really another case of failing to separate interface
from implementation. A symbol can be a convenient abbreviation, but in
general an identifier needs to be a combination of a name and a lexical
context. Because of macros, the structure of lexical contexts is not
simple block-structured nesting: an identifier can be referenced from a
place in the code where it is not visible by name. Also an identifier
can be created that doesn't have a name, or equivalently isn't visible
anywhere just by name, only by insertion into code by a macro.
So I think you should have an internal Arc-expression representation for
programs, and a surface syntax which is more compact and legible, but
Arc-expressions should not be specified to be made out of conses,
symbols, and literals. Arc-expressions should be defined only by an
interface, which should be open (i.e. new interfaces can be added that
apply to the same objects). The implementation should be up to the
compiler writer and there should be room for experimentation and
evolution. Interesting question: is there a surface syntax for
Arc-expressions different from the abbreviated surface syntax, and if so,
what is it?
Other unrelated comments:
"Here are a couple ideas:
x.y and x:y for (x y) and (x 'y) respectively."
Do you mean x.y stands for (y x)? Oh, I see, you have field names as
indices rather than accessors, so you actually meant x.y stands for (x
'y).
As for x:y, I think what Dylan borrowed from Smalltalk is the right
answer, thus x: stands for (quote x). Colon as a postfix lexical
"operator" works surprisingly well.
"local variables can be created implicitly by assigning them a value. If
you do an assignment to a variable that doesn't already exist, you
thereby create a lexical variable that lasts for the rest of the block."
This is a really bad idea, because inserting a block of code that
includes a local variable declaration into a context where a variable
with that name already is visible changes the declaration into an
assignment! Or if you fix that by making an assignment at the front of a
block always do a local declaration instead of an assignment, then when
you put code at the front of a block you have to remember to stick an
"onion" in front of that code if there is any chance that the code being
inserted could be an assignment.
Weird context-dependent interpretation of expressions like that makes a
language harder to use, both for programmers and for macros. It's one of
the problems with C, don't put it into Arc too.
I'm not saying that you need to use Common Lisp's let for local
declarations. One possibility would be to use = for declaration, := for
assignment, and == for equality. Another would be to use = for all
three, but with a prefix set for assignment and let for declaration.
Other possibilities abound.
But then later you introduce a let macro, so I don't know what you really
think about local declaration.
"Making macros first-class objects may wreak havoc with compilation."
That depends entirely on whether it is permissible to write functions
that return macros as values and permissible to pass macros as arguments
to functions that are not locally defined (or directly visible fn
expressions), or whether all macros are directly visible at compile-time
as initial values of variables. I think you should limit the language to
what you know how to implement, so the macro special-form should only be
allowed in places where its complete flow is visible at compile time and
should never be allowed to materialize as a run-time object. You need a
different way to create a macro object that exists in the run-time in
which macros run, the compile-time run-time. You need to define
precisely "flow visible at compile time," possibly being very restrictive
and only allowing what we really know is needed, namely a macro special
form can be used in function position, as an argument in a function call
where the function is a fn special form, in the initial value of a global
constant (not variable!), and nowhere else. This is tricky but tractable
I think.
"strings work like lists"
does rplaca [probably (= (car x) y)] work on strings? What about rplacd?
"Overload by using a fn as a name."
I think you need to rethink this way of defining methods, it has a lot of
problems.
*** Dave Moon: Object-Oriented
Re http://www.paulgraham.com/reesoo.html
When I say object-oriented, I am often referring to yet a different
property, which I think is a good property for most languages to have.
It's not the same as your "5.Everything is an object" although it might
sometimes be called "Everything is an object":
Things are independent of their names. To put it another way, as much as
possible of the machinery is exposed as an object that gets named by
binding an identifier to it, and you operate on the object, not on the
name. Classes work this way in CLOS, they don't work this way in C++.
Modules work this way in Dylan, modules (packages) don't work this way in
Common Lisp, modules (packages) almost but don't quite work this way in
Java. Method invocation works this way in CLOS (generic functions are
objects), it doesn't work this way in Smalltalk and Objective C (method
selectors are like Lisp symbols). I think most things done in Scheme
work this way.
Arc probably follows this principle even though you claim it's not
object-oriented.
One advantage is if you don't like the name of something you can rename
it by manipulating identifier bindings; there are no or few reserved
words. (I decided that "reserved words" are words that we don't want to
hear our children say.)
Another advantage is you can make things anonymous. It also might become
easier to expose the guts of the language for extension by users ("meta
objects").
Another advantage is the language gets the conceptual simplicity that
comes from making things orthogonal. I think that is the real advantage.
*** Dave Moon: Arc Syntax
Here is how I would do syntax for Arc or a language like it. I hope you
find these ideas useful.
The lexical grammar is hard-wired, and is the same for the surface syntax
and for the S-expression syntax. I have my ideas about what the lexical
syntax should be, but I'll omit them here.
The lexical grammar produces only two types of tokens: literals and
names. A name can look like an identifier in other languages, but also
can look like an operator, a dot, a semicolon, a comma, or even a
parenthesis.
The meaning of a literal is built into the language. The meaning of a
name is defined by a binding. Bindings can be top-level or local. The
language comes with a bunch of top-level bindings that define the
standard language, but users can add more and can replace the standard
ones to customize the language. Top-level bindings should be organized
into modules, but that's a separate topic.
A top-level binding consists of one or more of the following: a value,
which is either an ordinary runtime value or a macro; a type declaration
if you allow those; a constant declaration if you allow those; a dynamic
binding declaration (special in Common Lisp) if you allow those; a syntax
declaration.
The phrase grammar for the surface syntax is built up by syntax
declarations from the following hard-wired elements:
- token - a literal or a name
- literal - a literal
- string - a character string literal
- keyword - a Dylan-like symbol literal, identifier colon
- name - a token that is not a literal
- word - a name that does not have a syntax declaration
- expression - one of:
+ a literal
+ a word
+ a syntax form consisting of a name with a syntax
declaration followed by the expected syntax
+ a compound expression constructed from subexpressions,
prefix operators, and infix operators, with ambiguities
resolved by operator precedence
I think the grammar as declared by syntax declarations has to be LR(1)
for this surface syntax project not to be sunk by grammatical
ambiguities. That remains to be considered in detail.
Note that there is no built-in distinction between statements and
expressions. Of course a user could add a syntax form whose body is
composed of statements. I wouldn't do that myself.
The phrase grammar for S-expression syntax is hard-wired. I won't
discuss it in this email except to use square brackets to indicate
S-expressions. Having a syntax for S-expressions is unnecessary except
for bootstrapping and debugging, unless some programmers prefer to bypass
the surface syntax for unfathomable reasons.
There are two forms for syntax declaration: defsyntax and defoperator.
These specify how particular constructs in the surface syntax are parsed
and converted into S-expressions. They control the parser by
establishing bindings of names to syntax declarations. In most cases the
name will appear in the S-expression, so the name must also have a value
binding to a function or a macro to give the S-expression a meaning.
defsyntax declares a syntax form. It is followed by the name that
introduces the syntax form and a grammar description for the syntax form.
The grammar description implies how to translate the parsed surface
syntax into an S-expression.
defoperator declares an operator that can be used to make compound
expressions, specifies whether it is prefix or infix or both, and for
each case (prefix and infix) specifies its operator precedence and a
grammar description for what appears to its right, defaulting to
expression. defoperator is general enough to define things like C's
semicolon and parentheses, see below.
defsyntax and defoperator are themselves syntax forms as well as macros,
and the grammar description language is general enough to define them.
See below.
A grammar description is a sequence of items selected from the seven
hard-wired elements of the phrase grammar mentioned above, plus literal
tokens represented as character strings, plus the following six grammar
"constructors":
- repeat(grammar) - zero or more repetitions of the subgrammar
- optional(grammar) - zero or one repetition of the subgrammar
- or(grammar,grammar, ...) - one of the subgrammars
- noise(grammar) - the subgrammar is omitted from the S-expression
As a special case, noise by itself at the beginning of a grammar
description means that the name of the syntax declaration is
omitted from the S-expression
- recursive(name,grammar) - the subgrammar, but inside the subgrammar
the specified name means the same subgrammar
- error(string) - report a syntax error
Note that the meaning of names in grammar descriptions is not defined by
bindings, these are just treated as literal symbols. This is mostly to
avoid name conflicts but also might be necessary for bootstrapping
reasons. I could have used BNF-like notation with brackets, bars, and
stars instead, but I personally find it more readable to use spelled out
names and I don't think conciseness is a virtue here since relatively few
grammar descriptions will be written.
Note that when defining a macro with surface syntax different from the
syntax of a function call, you use both defsyntax and defmacro. One
defines the translation from surface syntax to S-expressions, the other
defines how to compile (or interpret) the S-expression. I suppose there
could be a macro that combines the two.
Now for some examples to try to make this comprehensible.
One way to define a cond-like if "statement" would be:
defsyntax if (expression noise("then") expression
repeat(noise("elseif") expression noise("then") expression)
optional(noise("else") expression))
if a then b elseif c then d else e
parses into
[if a b c d e]
If you prefer a more C-style if "statement", with parentheses instead of
then as the noise necessary to avoid the syntactic ambiguity that occurs
when two expressions are adjacent:
defsyntax if (noise("(") expression noise(")") expression
optional(noise("else") expression))
if (a) b else c
parses into
[if a b c]
You could even tack noise("end") optional(noise) on the end of the first
defsyntax if, if you like. Remember noise with no "argument" means the
name of the syntax form; noise("if") is not the same if you copy the
binding of if to another name, perhaps using modules.
The usual multiplication and subtraction operators could be defined this
way:
defoperator * (infix: 10)
defoperator - (prefix: 100, infix: 10)
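For intuition only, here is a toy Scheme sketch of how per-name infix
precedence declarations like these can drive the parse into S-expressions.
The precedence table, names, and algorithm (a simple precedence-climbing
loop) are all mine, not part of the proposal, and prefix operators and
syntax forms are omitted.

  (define infix-precedence '((+ . 10) (- . 10) (* . 20) (/ . 20)))

  (define (parse tokens min-prec)
    ;; returns (parsed-expression . remaining-tokens)
    (let loop ((left (car tokens)) (rest (cdr tokens)))
      (if (null? rest)
          (cons left rest)
          (let ((prec (assq (car rest) infix-precedence)))
            (if (and prec (>= (cdr prec) min-prec))
                (let ((right (parse (cdr rest) (+ (cdr prec) 1))))
                  (loop (list (car rest) left (car right)) (cdr right)))
                (cons left rest))))))

  ;; (car (parse '(1 + 2 * 3 - 4) 0))  => (- (+ 1 (* 2 3)) 4)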
Semicolon could be defined this way, allowing "blocks" to be written as
in C except using () instead of {}:
defoperator ; (infix: 0, repeat(expression noise(";"))
optional(expression))
Parentheses could be defined this way (!):
defoperator ( (prefix: 1000, noise expression noise(")"),
infix: 900, noise
repeat(or(keyword expression, expression) ",")
optional(or(keyword expression, expression))
noise(")"))
defoperator ) (error("Unbalanced parentheses"))
This says that left parenthesis as a prefix operator is followed by an
expression and a right parenthesis, and turns into just the expression,
i.e. the usual use of parentheses for grouping. Left parenthesis as an
infix operator is followed by a comma-separated argument list, with the
option of Dylan-style keyword arguments, and turns into the left operand
followed by the arguments, i.e. a function call S-expression. Its
precedence of 900 should be higher than anything but dot.
Of course the following really has to be written in S-expression syntax
for bootstrapping reasons but it makes a good example of complex syntax:
defsyntax defsyntax
(name
noise("(")
recursive(grammar, repeat(
or("token", "literal", "string", "keyword",
"name", "word", "expression",
string,
"repeat" noise("(") grammar noise(")"),
"optional" noise("(") grammar noise(")"),
"or" noise("(") grammar repeat(noise(",") grammar) noise(")"),
"noise" noise("(") grammar noise(")"),
"recursive" noise("(") name noise(",") grammar noise(")"),
"error" noise("(") string noise(")"))))
noise(")"))
I think the implementation of all this should be very straightforward
although I haven't tried it myself. It should give users of the language
great flexibility to define their own sublanguages and also has the nice
property of allowing almost all of the language to be explained in terms
of itself. The reason expression is hard-wired is because there would be
great apparent complexity and little gain from exposing to users the
machinery needed to convert expression grammar into a nonambiguous
grammar and to deal with operator precedence.
*** Dave Moon: Arc modules
Here is what I think about modules, in somewhat disorganized form.
David A. Moon wrote (on 1/21/2002 12:00 PM):
>The language comes with a bunch of top-level bindings that define the
>standard language, but users can add more and can replace the standard
>ones to customize the language. Top-level bindings should be organized
>into modules, but that's a separate topic.
A module is a map from symbols to top-level bindings. Package and
namespace are other names that have been used for the same concept.
All processing of code is done with respect to a current module.
Bindings in that module control how code is parsed, macro-expanded,
interpreted, compiled, and code-walked. Printing of code is also done
with respect to a current module, if S-expressions are to be converted
back into surface syntax and variable references inserted into macros are
to be mapped into expressions that would access the same variable from
the current module.
Common Lisp's approach of modularizing the map from names to symbol
objects instead of the map from symbol objects to bindings is wrong. It
comes from a 1974 historic context before lexical scoping. C++'s and
Java's conflation of modules with classes is wrong. So is C's conflation
of modules with files. I don't think Arc needs any
private/protected/public/static/extern type of concept.
Does Arc need modules? I think so. Modules aren't just for armies of
programming morons. Even solo programmers need a way to avoid name
conflicts, and if Arc were to become an open-source type of success it
would certainly need the ability to load libraries written by many
different authors and thus would need a way to avoid name conflicts. A
C-like convention with prefixes on symbols, all my symbols are named
moon-xxx and so forth, would almost be workable, but modules are more
flexible and produce much more readable code.
Besides its set of bindings, each module has an export list, which is a
list of symbols, and an import list, which is a list of other modules.
The visible bindings in a module include not only the ones in the
module's own set of bindings, but also the bindings exported by each
module from which this module imports. This feature is just an
abbreviation, but a very, very convenient one. I don't care whether
Dylan's features for selective import and for renaming during import are
adopted, but something has to be done to address name conflicts when
importing. It could be as simple as taking the one that appears earliest
in the import list.
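A toy sketch in Scheme of this picture of modules: a module is a table from
symbols to bindings plus an export list and an import list, and lookup
falls back to imported modules' exports in order (so the earliest import
wins on a conflict, as suggested above). All names are invented, and
bindings here are just values, not the richer binding objects described
earlier.

  (define (make-module) (vector '() '() '()))       ; bindings exports imports
  (define (module-define! m name val)
    (vector-set! m 0 (cons (cons name val) (vector-ref m 0))))
  (define (module-export! m name)
    (vector-set! m 1 (cons name (vector-ref m 1))))
  (define (module-import! m other)
    (vector-set! m 2 (append (vector-ref m 2) (list other))))

  (define (module-ref m name)
    (define (exported? mod n) (memq n (vector-ref mod 1)))
    (let ((own (assq name (vector-ref m 0))))
      (if own
          (cdr own)
          (let loop ((imports (vector-ref m 2)))
            (cond ((null? imports) #f)
                  ((and (exported? (car imports) name)
                        (assq name (vector-ref (car imports) 0)))
                   => cdr)
                  (else (loop (cdr imports))))))))

  ;; (define core (make-module))
  ;; (module-define! core 'cons cons)
  ;; (module-export! core 'cons)
  ;; (define user (make-module))
  ;; (module-import! user core)
  ;; (module-ref user 'cons)   => the cons procedure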
When a new module is created, it initially has no bindings of its own,
but imports from a standard set of modules including the module that
exports the features of the Arc language. The module can then be further
modified by executing expressions that call module-manipulation functions.
The value of a binding can be a module. As with macros, this is only
useful if the value is known at compile time. This allows one to form a
heterarchy of modules and follow a path of bindings whose values are
modules to traverse that heterarchy. It's probably useful to include a
module named com in the default set of bindings visible to a new module;
this would allow copying the Java hierarchical naming convention for
packages as Arc's naming convention for modules.
When a binding is not exported from its module, the binding is still
accessible from other modules, you just have to specify the containing
module explicitly, perhaps starting from com. If your surface syntax has
x.y mean (x 'y), and calling a module with a symbol as argument returns
the value of that symbol's binding in that module, then you can use
Java's notation and say that com.paulgraham.arc.core.cons(1,2) returns (1
. 2) even if cons is not accessible in the current module. This assumes
the infix dot operator has higher precedence than the infix parentheses
operator and is left-associative so the above expression means (((((com
'paulgraham) 'arc) 'core) 'cons) 1 2).
>A top-level binding consists of one or more of the following: a value,
>which is either an ordinary runtime value or a macro; a type declaration
>if you allow those; a constant declaration if you allow those; a dynamic
>binding declaration (special in Common Lisp) if you allow those; a syntax
>declaration.
I suspect that a top-level binding should also include a setter value,
which like the regular value can be either an ordinary runtime value or a
macro. Then if the assignment operator's left-hand side refers to a
binding, it sets the value of that binding, but if the left-hand side is
a call expression whose callee refers to a binding, it calls the setter
value of the callee with the arguments from the left-hand side and the
value of the right-hand side. There is a special form named setter which
accesses the setter value of a binding, and is itself settable. This
seems better than doing string-concatenation on identifiers.
Of course a binding whose value is a module will have a setter value so a
path like com.paulgraham.arc.core.cons can be used on the left-hand side,
thus
com.paulgraham.arc.core.cons = fn(x,y) x + y
means
((setter (((com 'paulgraham) 'arc) 'core)) 'cons (fn (x y) (+ x y)))
Of course actually executing that expression would really mess things up!
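For concreteness, here is a small Scheme sketch of the setter-value idea: when
the left-hand side of an assignment is a call, dispatch to the callee's setter.
The setter registry and the =: operator below are made-up names standing in for
the real binding structure.
;; Association list mapping a procedure to its setter procedure.
(define setters '())
(define (setter-of f) (cdr (assq f setters)))
(define (set-setter! f s) (set! setters (cons (cons f s) setters)))
;; (=: (f arg ...) val) calls f's setter with the args and the new value.
(define-syntax =:
  (syntax-rules ()
    ((_ (f arg ...) val) ((setter-of f) arg ... val))))
;; Example: give vector-ref a setter, then assign through it.
(set-setter! vector-ref vector-set!)
(define v (vector 1 2 3))
(=: (vector-ref v 0) 99)
(vector-ref v 0)   ; => 99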
*** Todd Gillespie: Multi-User
I was reading 'Being Popular', and the following line arrested my
attention:
"It could be an even bigger win to have core language support for
server-based applications. For example, explicit support for programs
with multiple users, or data ownership at the level of type tags."
The more I consider the implications of such an approach, the more I love
this idea. Multi-user support as a basic part of a language, carried
around in the type system. Over the weekend I worked over some rough
designs, and I envision a simple version of such like this:
1. All variables have a user id.
2. Most functions operate in a context that has been silently filtered
of any values they cannot access. Ideally, in this context programs
are written that cannot make incorrect permissions choices. In LISP
style, these restrictions are written in LISP and are redefinable.
3. Access to shared variables is handled by the runtime via a MVCC
locking mechanism.
4. Macros & fxns operating in non-user or root context can access all the
data, to handle the inter-user cases of ownership transitions,
modifications, application specific locking, complex sharing patterns,
and so forth. This strongly enforces a layering in programs, where any
dangerous inter-user code is all found in this one small place.
5. The above is made more complex in that some users administrate groups
of other users, so their runtime should scope data to their group, and
tag their personal ownership vs. group. Possibly this is simplified by
unifying users & groups (wherein any user may be a tree of permissions)
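To make items 1 and 2 concrete, here is a toy Scheme sketch of owner-tagged
values and a filtered view; the record and parameter names are made up, and it
ignores MVCC and groups entirely.
;; A value carries an owner tag; code running "as" a user sees only what it may access.
(define-record-type owned
  (own user value)
  owned?
  (user owned-user)
  (value owned-value))
(define current-user (make-parameter 'root))
(define (visible vals)
  (cond ((null? vals) '())
        ((or (eq? (current-user) 'root)
             (eq? (owned-user (car vals)) (current-user)))
         (cons (car vals) (visible (cdr vals))))
        (else (visible (cdr vals)))))
(define data (list (own 'alice 1) (own 'bob 2) (own 'alice 3)))
(parameterize ((current-user 'alice))
  (map owned-value (visible data)))   ; => (1 3)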
I did some research, looking for any languages that have similar features
already. I have found none so far; any attention paid to the basic idea
is devoted to instantiation ("isolate users with threads") or connection
("use an object broker"). So I don't have a code comparison; in my mind
code looks similar, except the new language would lack mutexes and
permissions checking in 99% of the code.
As an aside, having MVCC (wherein writers never block readers and
vice-versa) in the language runtime could be a major boon to both speed
and simplicity: it is perhaps a primary reason that so many DB-backed
multi-user web applications have been successfully built by people who
have never heard the term 'canonical resource ordering'. There are times
when you need a more complex or ruthless lock, but 99% of the time MVCC is
what you want. Using MVCC in the background could be very resource
intensive, but as it assures that there is no blocking on reads, the total
use on the wall clock is much lower, possibly allowing more users per
processor (a key metric for a language focused on server apps). Also,
since lock contention on multiple intersecting users performing a variety
of actions is still one of the most difficult debugging problems around,
this could be an extremely powerful feature for simplicity.
I discussed this idea with friends, most of whom didn't see the gain in
making a language over making a library in Java or its ilk, except for my
claims of terseness and the need to alter the runtime. I *was* made to
see that there is nothing in my design that could not be generalized to
other uses, but I think having an explicit pronoun for the user construct
makes the whole design much more immediate and valuable.
*** Will Duquette: Tcl
I was browsing your site, and happened to read the article on the
history of T. In that article, the author mentions an early
Scheme program called "markup". markup took as input files containing
plain text with Scheme code embedded in curly braces, executed the
Scheme code, and (presumably) wrote the result to the output along
with the plain text.
I was intrigued by this, because I'm the author of a similar program
based on Tcl rather than Scheme (see http://www.wjduquette.com/expand)
if you're at all interested. And when I was experimenting with Common
Lisp recently I considered writing such a thing in Common Lisp.
Eventually I rejected the idea, and the reason was a nifty if
obscure feature of Tcl that I didn't find in CL (or perhaps I just
missed it). You might want to provide it in Arc.
Tcl has an interactive shell. If you type something that's obviously
a partial command, it displays a secondary prompt and allows you to
keep typing for as many lines as you need. When it detects that you've
entered a full command, it executes it. In Lisp terms, this means that
Tcl can take an arbitrary string and determine whether it is a complete
S-expression or not, taking into account all the nasty cases, such
as mismatched parentheses in embedded text strings.
Now obviously a Lisp shell can do the same thing. Where Tcl possibly
differs is that there is a standard Tcl command, "info complete". If
variable "mightBeComplete" contains some text, I can write code like this:
if {[info complete $mightBeComplete]} {
    # Execute the code in mightBeComplete
}
Now, consider the markup program again. It's going to read text from the
file, looking for the first (non-escaped?) left curly bracket. Then it's
going to look for the matching right curly bracket, and execute the code
in between. But unless it forbids right curly brackets to appear in
the embedded Scheme code, it can't just scan for the next right curly
bracket; it has to find the first right curly bracket that follows
a complete S-expression (or perhaps a sequence of them). And that
essentially implies parsing the embedded Scheme code.
In Tcl, I don't need to do that. I search for the next close brace,
and ask Tcl whether the intervening text is a complete command. If it
is, I execute it; otherwise, I step onward to the next close brace
and so forth.
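A rough Scheme sketch of what an "info complete"-style test for S-expressions
might look like; it only tracks parentheses and string literals (with \"
escapes), and ignores comments and character syntax, so it is an illustration
rather than a full reader.
(define (complete-sexp? s)
  (let loop ((i 0) (depth 0) (in-str #f) (escaped #f))
    (if (= i (string-length s))
        (and (zero? depth) (not in-str))
        (let ((c (string-ref s i)))
          (cond
            (escaped                      (loop (+ i 1) depth in-str #f))
            ((and in-str (char=? c #\\))  (loop (+ i 1) depth in-str #t))
            (in-str                       (loop (+ i 1) depth (not (char=? c #\")) #f))
            ((char=? c #\")               (loop (+ i 1) depth #t #f))
            ((char=? c #\()               (loop (+ i 1) (+ depth 1) #f #f))
            ((char=? c #\))               (and (> depth 0)
                                               (loop (+ i 1) (- depth 1) #f #f)))
            (else                         (loop (+ i 1) depth #f #f)))))))
(complete-sexp? "(a (b \"c)\"))")   ; => #t  -- the ) inside the string doesn't count
(complete-sexp? "(a (b")            ; => #f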
You're targeting Arc at the server side of things, and this kind of
easy code embedding might be very useful.
I also note that Tcl has a "subst" command, which lets you explicitly
ask for variable and command interpolation into a string. My "expand"
tool could have been implemented so that it simply read in the entire
file of text and let "subst" do its thing. In this case, I wouldn't
scan the file at all. But "subst" does its job all at once; it develops
that at times it's useful to process the input sequentially. For
example, consider this input text:
Some text [capitalize]with embedded markup[/capitalize].
Because "expand" processes the input sequentially, I can make the
"capitalize" tag begin to capture the input text. Then, the
"/capitalize" tag can retrieve the input received since
"capitalize" and convert it all to upper case. Tcl's "subst"
command doesn't allow for that kind of manipulation.
*** Rene De Visser: CL Lacks
1) Immutable vs mutable
In my lisp programs I often find myself deciding whether a particular data
structure will, for the purposes of the program, be:
Immutable:
In this case the data structure will be shared through all parts of the
program, and if any part of the program wants a changed version of the
data structure it must copy it, creating a modified new data structure (so as
not to cause unwanted side-effects in the rest of the program).
Mutable:
The data structure represents a single thing that changes. It should not be
copied, but changed directly.
In Common Lisp data structures are not tagged as mutable or immutable, so the
above cannot happen automatically. One must be extremely careful to track
which data structure falls into which category. A single error can lead to
subtle problems in another part of the program.
It also means that most operators in Common Lisp come in two versions: one
that copies and one that modifies destructively.
Mutability is also related to equality, so without data being tagged as
mutable or immutable it is not possible to have a correct equality check
(hence the multiple equality operators in CL, which still have problems) or a
correct deep copy.
2) Data structure representation.
Using the STL in C++, one can change the container representation used, for
example from an unsorted list to a sorted list to a red/black tree, without
changing any code in the rest of the program.
In Lisp, perhaps you are using lists to represent sets. Some of these sets
are small, so an unsorted list is best; others need to be used in sorted
order; and others are very large and need to be kept in a red/black tree, as
sets with 1000000 items don't behave very well otherwise. In Lisp each such
set (list) needs to be handled with different client code. Again, any mistake
(for example one resulting from a change of representation, or from mixing up
representations) results in subtle errors.
Basically lisp needs a 'SET' data type in order to separate the
representation of a SET from the use of it.
More broadly, this could be taken to mean that LISP needs more abstract
functions (as in the STL) which don't care about the underlying data
representation.
This is to a degree supported by the sequence concept in CL, but in practice
this can't even cope with the SET problem above.
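One way to get a representation-independent SET without adding anything to the
language is a closure that hides the representation behind messages; a Scheme
sketch follows. A tree-backed constructor would expose the same messages, so
client code would not change when the representation does.
(define (make-list-set)
  (let ((items '()))
    (lambda (msg . args)
      (case msg
        ((add!)    (set! items (cons (car args) items)))
        ((member?) (and (member (car args) items) #t))
        ((size)    (length items))
        (else      (error "unknown set operation" msg))))))
(define s (make-list-set))
(s 'add! 3)
(s 'member? 3)   ; => #t
(s 'member? 4)   ; => #f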
*** J Storrs Hall: Semantic Incoherency
I consider one of the worst semantic incoherencies of
Common Lisp the incompatibility between "functions" defined
as macros and the "functional programming" features. Both are
higher-order concepts but each seems to be designed without
any consideration of the other.
*** Michael Vanier: LFSPs
My Ph.D. research involved writing very complex simulations of nervous
systems (real "neural networks", not the sorta-AI kind). I used a
simulation package that was written in C and had its own (horrible,
ghastly) scripting language, all written in-house. I extended the hell out
of it, but the experience was so painful I don't think I can ever work on a
large C project again. Since I want to continue working in this field, and
since I love to hack, I want to re-write the simulator "the right way".
However, I've been dithering on the choice of language. It's pretty clear
that the core simulation objects have to be written in C++. C is too
painful, and anything else is going to give an unacceptable hit in speed
(simulation is one of those rare fields where it is impossible to have too
much speed). But this is probably less than 50% of the code, maybe much
less. The rest is infrastructure; scripting interface as well as a lot of
support code. For scripting I want to use scheme or some lisp dialect.
But the language choice for infrastructure is unclear. I could use C++,
but that's unnecessarily painful especially since the infrastructure is not
speed-critical. So I'd decided to use java; it's fast enough, there are a
lot of libraries, and a lot of people know it so I could conceivably get
others to work on it as well. After making this decision, my interest
waned and I started another (unrelated) project.
In the process of working on that other project (which involves scheme and
Objective Caml (ocaml), an ML dialect), it occurred to me that ocaml would
be a better choice than java for the intermediate layer. It's faster, has
better type-checking, is much more powerful, and can even be used as its
own scripting language because of the type inference and interactive REPL.
If necessary, I could write a simple lisp-like language on top of ocaml
with little difficulty. The C interface to ocaml is also quite mature, and
there is a good-sized standard library (though nothing like the enormous
java libraries). Also, it's much lighter weight than java. But here is
the most important reason: it's a hell of a lot more fun to program in than
java. Writing java code, though not particularly painful in the sense that
C is painful (core dumps etc.), puts me to sleep. Writing ocaml (which is
a "language designed for smart people" if there ever was one) is exciting.
My motivation to tackle the project has tripled overnight. The interesting
question is: why is ocaml so much more fun than java? Why are "languages
designed for smart people" (LFSPs) so much more fun to program in than
"languages designed for the masses" (LFMs)?
One possibility is that LFSPs tend to be more unusual, and hence are more
novel. I'll admit that this is part of the answer, but it misses the main
point. *Any* new language is going to be novel, but the novelty usually
wears off quickly. The real point is that LFSPs have a much greater
support for abstraction, and in particular for defining your *own*
abstractions, than LFMs. This is not accidental; LFMs *deliberately*
restrict the abstractive power of the language, because of the feeling that
users "can't handle" that much power. This means that there is a glass
ceiling of abstraction; your designs can only get this abstract and no
more. This is reassuring to Joe Average, because he knows that he isn't
going to see any code he can't understand. It is reassuring to Joe Boss,
because he knows that he can always fire you and hire another programmer to
maintain and extend your code. But it is incredibly frustrating to Joe
Wizard Hacker, because he knows that his design can be made N times more
general and more abstract but the language forces him to do the Same Old
Thing again and again. This grinds you down after a while; if I had a
nickel for every time I've written "for (i = 0; i < N; i++)" in C I'd be a
millionaire. I've known several programmers who after only a few years of
hardcore hacking get burned out to the point where they say they never want
to code again. This is really tragic, and I think part of it is that
they're using LFMs when they should be using LFSPs. I note with some
interest that the original Smalltalk designers are *still* writing code in
Smalltalk; the Squeak project was founded by them and is mainly maintained
by them. I'm not sure if Smalltalk was intended to be a LFSP (actually,
I'm pretty sure it wasn't), but it does have good abstractive power and
does scare off newcomers, so I guess it is one :-)
So the bottom line is: computer languages designed for smart people don't
just liberate the language designer, they liberate the programmer as well.
*** Peter Norvig: Syntax
Some comments:
* car and cdr are warts; why not hd and tl?
* If Unix won, then the symbol red should be red, not RED
* For the Scheme I wrote for Junglee's next-generation wrapper language, I
allowed three abbreviations: (1) If a non-alphanumeric symbol appeared in
the first or second element of a list to be evaled, then the list is in
infix form. So (x = 3 + 4) is (= x (+ 3 4)), and
(- x * y) is (* (- x) y). And (2), if a symbol is delimited by a "(", then
it moves inside the list. So f(a b) is (f a b), while f (a b) is two
s-exps. And (3), commas are removed from the list after infix parsing is
done, but serve as barriers to infix beforehand, so f(a, b) is (f a b),
while in f(-a, b + c), each of -a and b + c gets infix-parsed separately, and
then they get put together as (f (- a) (+ b c)). This seemed to satisfy the
infix advocates (and annoy some of the Scheme purists). You might consider
something like this.
* Non-chars in strings don't make sense to me; I always think of strings
as arrays of type char; the tricky issues are whether strings are mutable,
and if so, extensible, and if they're mutable, which operations are
efficient (i.e. should removing the first char be O(1) or O(n)). The other
issue is which library functions should only work on strings, and which work
on arrays or sequences. Finally, you need an extensible protocol for
sequences (like Dylan, and in-the-works in Python).
* If you have (do (= x val) ...) then you don't need let and with.
* My inclination would be to have ds, opt and get be non-alphabetics, e.g.
(def f (a (: (b c)) (? d 4) (& e f)) ...)
This way it's easier to see what's a param and what isn't.
*** Peter Norvig: Another Onion
I think conflating pairs and lists is an onion that accounts for several
pages worth of equivocation in ANSI CL. Both pairs (lightweight structures
that can hold two components) and lists (hold arbitrary number of components
with O(n) access time) are useful ideas, but Lisp uses a cons cell for both
for two reasons: (1) at the time, data abstraction was not in vogue, and (2)
if you've only got 2 bits of tags, you want to be parsimonious with your
basic types. But conflating the two means that every function that takes a
list must specify what it does when passed an improper list. If it were my
language, (cons x y) would raise an exception if y is not a list, and I'd
consider whether to separately have (pair x y), which would be accessed with
first and second, not first and rest, or whether to just use (array x y).
If you do have pair as a subclass of array, you might also want to consider
triplet.
As for (+ "foo" "bar"), I'm somewhat against it, because it means (a) + is
no longer commutative, and (b), no longer associative: compare (+ (+ 2 3)
"four") and (+ 2 (+ 3 "four")). I've actually seen errors of the second
kind in Java programs: System.out.println("answer = " + x + y) instead of
("answer =" + (x + y)). On the other hand, I admit, after using Python,
that "foo" + "bar" feels natural, to the extent that I wasted 20 minutes
debugging a Perl program where I used + instead of . for string
concatenation. Whatever you decide about the operator (I'm thinking I'd
rather write "foo" & "bar"), I recommend NOT having an automatic coercion
from object to string; I think Python is right in raising an error for this.
In Perl it's even worse, because my strings were coerced to numbers (0).
*** Michael Vanier: OO
I think one thing that lisp really needs that C and C++ have is the ability
to lay out data types in a very low-level fashion. This is not something
that should be done lightly and perhaps not at all unless you are a very
experienced programmer, but it's critical to getting really good
performance in many cases. I suspect the reason why lisps don't have it is
that it can interact in hairy ways with the garbage collection scheme,
which makes life much harder than it is in C/C++. There are also issues of
array bounds non-checking etc. and then before you know it all safety bets
are off.
This leads naturally to the idea of having a "low-level lisp" in which such
programs can be written and then used safely by the standard lisp. I
believe this is how prescheme/scheme48 worked (works?), although this was
done just to bootstrap the system AFAICT. There is also an analogy to C#
and Modula-3 (garbage collected languages that allow for "unsafe modules").
I would really like it if Arc offered some facility like this, but I don't
understand the issues well enough to know if it's feasible or not.
*** Matthias Felleisen: OO
Paul, you don't have to go to bunches of bytes and do things atop.
In PLT Scheme v200 everything is a struct, yet these values won't respond
with #t to struct? because we don't want anyone to think of these things as
structs.
Also, we are currently designing extensions so that programmers can define
functions and ask the environment to treat them as primitives, especially
with respect to soft typing. So you will say something like
(: f (number -> number))
(define (f x) ... (+ x ...) ...)
and f is treated as a new "primitive". If the soft typer sees that you
misapply it somewhere, it will be highlighted in red. It's up to you: you can
treat this as a type error and refrain from running the code, or you may
decide that the type system is too conservative and run the code anyway,
knowing that we enforce the types regardless.
The key is not just to write the types down. You need to analyze and
enforce them. And in "Lisp" (or Perl or Python or Curl) the question is how
to scale this to higher-order types.
Even more generally: we have known for 50 years how to build a language atop
bunches of bytes. PL design will make progress if we do the exact same
thing but without ever thinking about what is inside the machine. Computations
are about value manipulations. At the moment, we represent values in bits
and bytes, but why think about those? It just pollutes program design.
*** Mike Thomas: Squeak
- I think that an interesting way to test language ideas is the development
model used for Squeak (Smalltalk dialect) - anyone, even raw beginners, can
add anything they like and stuff which is popular survives simply because it
is used. (Survival of the fittest.)
- I like the work being done on region based memory management eg: MLKit at
the University of Copenhagen, and a C dialect from Lucent technologies (name
escapes me).
- Coherent FFI from the beginning and an IDL compiler targeted at ARC to
allow ease of interfacing to OS and third party libraries. Glasgow Haskell
compiler has successfully used this strategy.
- incorporate type inferencing and optional type hints.
I make these comments from the point of view of a person who leans to the
Haskell/SML side of programming with fond memories of Scheme (apart from the
fact that my bread and butter is obtained from the C family), with a
background in mathematical, game, GIS and large oil industry technical
software development.
*** Jecel Assumpcao: Objects
There is a lot of good advice in your website but you don't seem to be
following it regarding objects: don't put in stuff for others to use or
to be politically correct. If you don't like them yourself, just leave
them out.
My own languages have always been OO and I think objects really help in
any non trivial program, but each person has his own preferences. There
are several different styles of object systems and I think that the
kind in E or Beta might be a better fit for Arc and might be more
useful to you.
Let's do the classical cartesian point example. My first try will avoid
adding syntax at the cost of making most expressions look infix or even
postfix. A list (c x y z), where c is a closure, has the same effect as
evaluating (x y z) in c's context. And make-closure returns the value
of the current execution context.
(define point (x y)
  ((define + (p) (point (x + (p x)) (y + (p y))))
   (define pr () (("point " pr) (x pr) (" " pr) (y pr)))
   (make-closure)
   ))
(((point 3 4) + (point 5 6)) pr)
The reason why the "+" ended up as an infix operator was that it had to
be preceded by a context (object) so that the normal "+" wouldn't be
invoked. Note that strings and numbers also were treated as closures in
the code above.
This is not very Lisp-like, so I will add the following syntax element:
a ":" will indicate that the element following it is the context in
which the surrounding expression should be evaluated:
(define point (x y)
  ((define + (self p) (point (+ :x (:p x)) (+ :y (:p y))))
   (define pr (self) ((pr :"point ") (pr :x) (pr :" ") (pr :y)))
   (make-closure)
   ))
(pr :(+ :(point 3 4) (point 5 6)))
Well, it is still ugly but at least it is starting to look like Lisp.
The idea is to use a few general constructs instead of a lot of
specialized ones. For example, using functions as "classes" has the
nice side effect of us not having to do anything extra to get a single
inheritance-like functionality - lexical scoping will do just fine.
With the above notation, you don't have to dispatch on the first
argument (your cons complaint). The "+" would have to be rewritten in a
more symmetrical form before we could have
(pr :(+ (point 3 4) :(point 5 6))) however:
(define + (a b) (point (+ :(:a x) (:b x)) (+ :(:a y) (:b y))))
All those (:b x) are very ugly and should be simplified along the lines
of your ("hello" 2). This isn't really a solution, but just a general
direction that I feel could be followed.
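For comparison, the closure-as-object idea can already be written in ordinary
Scheme with an explicit message argument; this is only a sketch, not the
notation proposed above.
;; A "point" is a closure that dispatches on a message symbol; lexical
;; scoping plays the role of instance variables.
(define (point x y)
  (lambda (msg . args)
    (case msg
      ((x) x)
      ((y) y)
      ((add) (let ((p (car args)))
               (point (+ x (p 'x)) (+ y (p 'y)))))
      ((pr) (display "point ")
            (display x) (display " ") (display y) (newline)))))
(((point 3 4) 'add (point 5 6)) 'pr)   ; prints: point 8 10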
If objects are convenient enough, I don't think you will need the
database types.
While "fn" is cleaner than "lambda", I would rather use "?" since it
isn't harder to type and stands out better in the code. It is also
mnemonic for "mystery function" ;-)
*** John McClave: End Test
I am interested in the 'given'
overhead of the pervasive end loop test and its
possible elimination. First a hardware analogy.
Is it possible to use a hardware-settable interrupt
to replace the end-of-loop test and just use
unconditional jumps to keep the loop running?
In Lisp-like code, where end-of-list tests are part
of most recursive function definitions, is it possible
to avoid this overhead with some form of latent
daemon tagged onto the list structure and some form of
unconditional jump? Could a closure of some sort be
triggered to end the function's processing?
*** Avi Bryant: Continuations
So here's a cool code example for your consideration, very slightly
abbreviated:
-----------------------------------
Object subclass: #Continuation
instanceVariableNames: 'stack '
classVariableNames: ''
poolDictionaries: ''
category: 'Kernel-Methods'!
!Continuation methodsFor: 'invocation'!
value: v
thisContext sender: stack copyStack.
^v !!
!Continuation class methodsFor: 'instance creation'!
fromContext: aStack
^super new stack: aStack copyStack !!
!BlockContext methodsFor: 'continuations'!
callCC
^self value: (Continuation fromContext: thisContext sender)! !
-------------------------------------
Of course, it's easy to provide an equivalent example in Scheme - call/cc
already exists. But it's completely impossible, as far as I know, to
provide an equivalent example in CL, even a really ugly one. I bring it
up partly because I know you realize the usefulness of continuations for
web development, and yet I haven't seen any mention (did I miss it?) of
continuations in Arc. And I know from personal experience that a
closure-based continuation-passing style, as it sounds like you adopted
for ViaWeb, is useful but far more frustrating and less transparent than
true continuations. So, at the very least, let me request that full
continuations be included in Arc.
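For reference, the Scheme counterpart is a few lines; a minimal sketch of
capturing a continuation and (conceptually) re-entering it later, roughly as a
web framework might on a repeated request.
(define resume #f)
(define (ask-number)
  (call-with-current-continuation
   (lambda (k)
     (set! resume k)   ; stash the rest of the computation
     0)))              ; the "first" answer
(display (+ 1 (ask-number)))
(newline)
;; Prints 1. Evaluating (resume 41) later re-enters the suspended addition
;; and prints 42 -- roughly what replaying a request would do.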
But I also bring it up because, to my surprise, I have found myself using
Smalltalk rather than Lisp for my web development - although I deplore the
lack of macros, in cases like continuations I have found smalltalk *more*
powerful than lisp. Say what you like about "everything is an object",
the fact that stack frames (and closures) are reified makes certain games
much easier. Here's another code snippet:
snapshot := self copy.
context := thisContext sender.
[context isNil] whileFalse:
[((context isKindOf: MethodContext) and: [context receiver = self])
ifTrue: [context receiver: snapshot].
context := context sender].
Since using a functional style is all but impossible in smalltalk, this
offers an interesting alternative - switching midmethod to a copy of the
current object. Invoked right after a call/cc, it ensures that each
invocation of that continuation occurs in a separate context of sorts,
which is very useful in a browser-back-button world (yes, it's also
dangerous and limited, but it's very pragmatic if you know what you're
doing - the actual version I use also has various sanity checks that only
make sense in the context of my framework).
So will we ever see this kind of reflective power in a lisp? Or am I
going to have to write a prefix-syntax parser for smalltalk to have my
cake and eat it too?
*** Seth Gordon: Syntax
I have a theory that one reason people get tetchy about all of Lisp's
parentheses is that people's brains have trouble parsing deeply
recursive sentences, so when they see a piece of code with deeply
nested structures, it looks very intimidating.
Therefore, I suggest creating an infix operator ~, defined as:
(f ~ g x) ==> (f (g x))
(f ~ g x ~ h y) ==> (f (g x (h y)))
To pick a piece of Scheme code that comes to hand: this operator would
let one represent
(define serve
  ; use default values from configuration.ss by default
  (opt-lambda ([port port]
               [virtual-hosts virtual-hosts]
               [max-waiting max-waiting])
    (let ([custodian (make-custodian)])
      (parameterize ([current-custodian custodian])
        (let ([listener (tcp-listen port max-waiting)])
          ; If tcp-listen fails, the exception will be raised in the caller's thread.
          (thread (lambda ()
                    (server-loop custodian listener
                                 (make-config virtual-hosts (make-hash-table)
                                              (make-hash-table)
                                              (make-hash-table)))))))
      (lambda () (custodian-shutdown-all custodian)))))
as
(define serve
  ; use default values from configuration.ss by default
  ~ opt-lambda ((port port)
                (virtual-hosts virtual-hosts)
                (max-waiting max-waiting))
    ~ let ((custodian (make-custodian)))
      (parameterize ([current-custodian custodian])
        ~ let ([listener (tcp-listen port max-waiting)])
          ; If tcp-listen fails, the exception will be raised in the caller's thread.
          ~ thread ~ lambda ()
            ~ server-loop custodian listener
              ~ make-config virtual-hosts
                (make-hash-table)
                (make-hash-table)
                (make-hash-table))
      (lambda () ~ custodian-shutdown-all custodian))
*** Trevor Blackwell: Modularity
It's worth distinguishing a third category of programming
environments: one where a solitary hacker uses many open-source
modules written by various folks. I consider this a very good
environment, unlike pack programming. But it has some requirements in
common with it. Various languages succeed or fail dramatically in
making this work smoothly.
In C/C++, I find I can almost never use other people's stuff
conveniently. Most have some annoying requirements for memory
management, or use different string types (char *, STL string, GNU
String, custom string class), or array/hash types, or streams (FILE *,
iostream) or have nasty portability typedefs (int32, Float).
CL has fewer such troubles since it has standard memory management &
string/hash/array types, but there are often nasty namespace
collisions or load order dependencies, especially with macros.
Most chunks of CL code I've seen (which are mostly PG's) won't work
without special libraries, which have short names and are likely to
conflict with other macro libraries or even different versions of the
same macro libraries.
Perl packages work pretty well, because everyone agrees on how basic
types like string, stream, array and hash should work, and the normal
use of packages avoids any namespace collisions. I don't think I ever
had an open source Perl module break something. I guess Java also
prevents conflicts, but you have to give up an awful lot to get it.
The huge assortment of open source Perl packages testifies to the ease
of writing them. I've taken a few packages that I wrote for my own
purpose and found it very easy to make them self-contained and
contribute them. Usually, people only write C++ libraries for very
large projects, and it requires a different programming style from
what you'd use for your own code.
Anyway, I suggest that Arc's modularity features should be designed to
support use of open-source modules in single-hacker projects, not to
support pack programming. This suggests, in addition to what CL
already has:
- that modularity be a convention (CL, Perl), not something enforced
by the compiler (Java). Sadly, I find the CL module system way too
cumbersome to actually use. It has to be convenient enough to use in
ordinary programming, not a special thing you use when you're writing
a module for external use.
- there be a sufficient basic library that everyone won't have to
write their own basic library. I'm talking about the sort of functions
from On Lisp like last1, in, single, append1, conc1, mklist,
flatten. If everyone has their own, then you have to package it up
with any code you publish (making it awkward to publish) and it'll be
hard to read other people's code with different definitions of basic
functions.
*** Mike Vanier: MATLAB
Point 1: Matrix manipulation is a canonical example of something that
deserves its own dialect. DO NOT build this in to the core.
Point 2: Having matrix functions like eig() for eigenvalues is trivial and
can be done in any language if the library exists. So that's not
a big deal.
Point 3: Syntactically, you need this:
a) A nice literal data entry notation for arrays. Matlab's notation is
NOT nice because it doesn't scale for 3D+ arrays. What would be
better is e.g.
my_array = [[[1 2] [3 4]] [[5 6] [7 8]]]
This scales nicely. Hey, S-expressions again! :-)
b) Functions that operate by default on arrays as opposed to on scalars
(or really on both). So, sin(my_array) will compute the sin of all
the elements.
c) Various functions for manipulating and transforming array shape. In
2D arrays, you have resize() and transpose (written ' in matlab), but
these can be generalized.
d) [Hard!] Higher-order functions to specify the "rank" of a function
call. In simple terms, this refers to which axes of the array are
used in the computation. For instance, the sum() function applied to
a 2D array might mean "return the column sums" or "return the row
sums".
The only language that gets this right that I know of is J
(www.jsoftware.com). J is beautifully designed, but I don't like the
line-noise syntax. If Arc had a dialect that could handle this material as
elegantly, I would use it. Also, J is closed-source (but free), yadda
yadda (the Rebol/Curl syndrome).
Look at the J documentation; it's all available for download. Quite
mind-stretching. Aside from scalar data types (characters and several
types of numbers) the only data types are multidimensional arrays which
hold a uniform data type and "boxed arrays" (which are arrays that can hold
arbitrary data types, including other arrays). Very neat stuff.
*** Neil Conway: Ruby
1) Blocks: 90% of the benefit of functional programming, no weird
academic ML-type stuff ;-)
ary = [1,2,3]
ary.each do |element|
  puts "Item: #{element}"
end
2) Everything is an object:
5.times do
  puts "hello, world"
end
3) Dynamic class definitions and singleton method definitions:
class Foo
  def bar
    "bar"
  end
end
f = Foo.new
f.bar
=> "bar"
# add more definitions to Foo
class Foo
  def baz
    "baz"
  end
end
f.baz
=> "baz"
def f.another_method
  "Singleton"
end
f.another_method
=> "Singleton"
Foo.new.another_method
=> error, no such method
(you can also add definitions to built-in classes like String and Array
like this -- making it a convenient place to store utility functions)
4) Built-in design patterns: (or rather, a mixin implementation that is
flexible enough to support these kinds of patterns very naturally)
require 'singleton'
class FooS
  include Singleton   # mixin the Singleton module
  def initialize
    puts "FooS init"
  end
  def bar
    "bar"
  end
end
f = FooS.new
=> error, 'new' is private
f1 = FooS.instance
# "FooS init" printed to screen
f2 = FooS.instance
f1.inspect
=> #<FooS:0x...>
f2.inspect
=> #<FooS:0x...>
If you plan to draw any language features from Python for the design of
Arc, I'd also suggest looking carefully at Ruby: IMHO, it offers all of
Python's features with a more elegant design, as well as a number of
very interesting features from languages like Smalltalk.
*** Olin Shivers: Regexps
If you provide regexps, then you should feel a moral obligation
to also provide context-free grammars, i.e., a parser tool, as
well. It's a big problem that languages that provide cheap, easy
regexp matching encourage programmers to use regexps to analyse
things that are *not* regular languages. Oops. So you get heuristic
code that works... most of the time. Perl is the classic example.
If you steal Manuel Serrano's little built-in, lightweight grammar
language, then people can use regexps to recognise regular languages
and CFGs to recognise more complex things.
*** Pinku Surana: Brevity
- Check out Todd Proebsting's TR on "Programming Shorthands". It's about
"pronouns" in PL syntax.
http://www.research.microsoft.com/~toddpro/
- Syntactically, Haskell bested ML's "fn" for lambda:
Haskell: \x -> x + 1
ML: fn x => x + 1
Scheme: (lambda (x) (+ x 1))
- Haskell treats strings as lists of characters. You might find their
experience interesting.
- pattern matching syntax for lists is short & concise.
- List comprehensions (syntax within []'s below) can do powerful things with
brief syntax:
quicksort [] = []
quicksort (x:xs) = quicksort [y | y <- xs, y < x]
                   ++ [x]
                   ++ quicksort [y | y <- xs, y >= x]
An equivalent Scheme program:
(define (quicksort lst)
  (if (null? lst)
      lst
      (let ((x (car lst)) (xs (cdr lst)))
        (append (quicksort (filter (lambda (y) (< y x)) xs))
                (list x)
                (quicksort (filter (lambda (y) (>= y x)) xs))))))
*** Kragen Sitaker: Comments on Arc
fn
----
Since setf plays such a prominent role in Arc, there should be a
version of fn that allows you to define a 'setter' function as well as
a getter.
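Scheme's SRFI 17 already has this flavor: a getter can carry a setter, and the
generalized set! dispatches to it. A sketch, assuming an implementation that
provides SRFI 17; the names backing and item are made up.
(import (scheme base) (srfi 17))
(define backing (make-vector 10 0))
(define item
  (getter-with-setter
   (lambda (i)   (vector-ref backing i))        ; getter
   (lambda (i v) (vector-set! backing i v))))   ; setter
(set! (item 3) 99)   ; uses the attached setter
(item 3)             ; => 99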
Compound = Functions on Indices
-------------------------------
I wish every language used the same form for function call, hash
fetches, and array indexing, the way arc does. It's a brilliant idea.
In order to be able to write new compound data types in Arc, there
needs to be a way to create values that can be called as functions and
also implement other operations. This can be done in a
straightforward way with the overloading system described for objects
in "Arc at 3 Weeks": overload apply. (Apply should probably be called
"call", if we're rebuilding Lisp from the ground up.)
Pronouns
--------
'it' being bound by iteration and conditionals is also a brilliant
idea. Python is working toward getting something similar, but
nobody's done it yet. FORTH has sort of had it with DO and I and J
for a while, but not in conditionals. (One might argue that FORTH is
almost all pronouns; very little data is named.) And, of course, Perl
is full of pronouns: <>, $_, the current filehandle, etc.
Shouldn't there be a form of 'each' and 'to' that bind a pronoun, too,
as in Perl and FORTH?
(each '(a b c) (pr item))
(to 10 (= (ary i) (* i i)))
From the examples, it looks like 'keep' gives a way to write a list
comprehension, and 'sum' gives a way to write foldl (+) (APL's +/)
over a list comprehension; except, of course, you can write "list
comprehensions" that iterate over non-list-like things. Some other
languages (Python, for example) are handling this by making almost
anything you can iterate over look like a list; list comprehensions
are nice, compact, and readable.
DBs
----
The proposed semantics for fetching from a DB are that nonexistent
keys return the current value of the global variable *fail*, which
defaults to nil. This is wrong, for two reasons:
- If mydb is a DB that doesn't have any false values in it, and nil is
false, this code will almost always work, but it's broken:
(if (mydb foo) (x it) (y))
because it assumes that *fail* is set to something in particular. A
correct variant of this code (assuming = can set a global variable) is
(do (= *fail* nil) (if (mydb foo) (x it) (y)))
To write correct code, you'll almost always have to set *fail* like
this before a test for existence, because you must ensure it's set to
something you know can't legitimately be in the DB.
Making correct code be much more verbose than incorrect code that
almost always works, but breaks when a completely unrelated part of
the program changes, is probably a bad idea. (I know Graham is
designing a language for good programmers, who will presumably take
the extra time to write the correct form, but I think that's going a
little too far.)
- there's a much weaker reason, which is that *fail* might actually be
the value of an item in a DB --- for example, a DB that contained the
program's global variables.
I like Python's approach to this problem. Fetching a nonexistent
value in the normal way will raise an exception, which is usually the
right thing. So you could write the above code as
(try (x (mydb foo)) 'KeyError (y))
But there is also an operation 'get', which returns a specified
default value if the requested key doesn't exist, and is *much*
briefer than the corresponding operation with *fail*; compare:
(do (= *fail* "") (mydb foo))
(get mydb foo "")
or even
(mydb foo "")
and an operation 'has_key', which allows the operation to be written as:
(if (has_key mydb foo) (x (mydb foo)) (y))
although this is nearly as verbose as the correct version with *fail*.
Another approach: The two-argument variant could actually be a macro
(since macros are first-class objects, your DB can be a macro) which
evaluates and returns its second argument only if the lookup fails.
This almost works to shorten the example above to (x (mydb foo (y))),
but that calls x even if the lookup fails.
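For comparison, a lazily-defaulted two-argument lookup is a one-line macro if
the table accepts a failure thunk; a sketch using SRFI 69 hash tables to stand
in for Arc DBs (the name lookup is illustrative).
(import (scheme base) (srfi 69))
;; (lookup db key default-expr) evaluates default-expr only when key is missing.
(define-syntax lookup
  (syntax-rules ()
    ((_ db key default-expr)
     (hash-table-ref db key (lambda () default-expr)))))
(define mydb (make-hash-table))
(hash-table-set! mydb 'foo 42)
(lookup mydb 'foo "")   ; => 42
(lookup mydb 'bar "")   ; => ""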
A third approach: has_key is an instance of a wider concept of
'exists'; in Perl, 'exists' applies only to hash lookups, but it
applies to a wide variety of operations, essentially everything it
makes sense to call setf on.
(exists var)
(exists (sqrt -1))
(exists (mydb 'bob))
(exists (car list-or-nil))
Python has a fourth generic operation as well: deletion. You might
want to check to see whether (foo 1) exists, get its value, set its
value, or delete it, regardless of whether foo is a vector, an
associative array, or even possibly something else altogether.
A fourth approach: in Icon (and, sort of, in Prolog, and, in another
way, sort of, in Lisp), expressions can return any number of values,
including none; no value returned is boolean false. If a dictionary
lookup returns no values if it fails, then
(if (mydb foo) (x it) (y))
can be correct, because 'no value' is distinct from any particular
value, even nil, just as the empty string is distinct from any
particular character, even NUL.
Lisp's assoc family takes another approach: return something
containing the correct value, not the value itself. I don't like
this; although pattern-matching could make it less painful to use, I
don't think it could provide any better syntax than the 'try' approach
above.
The default of being indexed by 'eq' is not very good for string
processing, unless you use string->symbol a lot --- which might be a
good idea for string processing, but will certainly make your code
more verbose.
If you want the option of using arbitrary equality operators for DBs,
and you want some DBs to be hash tables, there needs to be a way of
associating a hash with an equality operator.
assignment
----------
The current proposal for the language has (= var form) either declare
a new lexical variable valid until the end of the current block, or
change the value of an existing variable.
I really prefer languages that have different forms for declaring a
local variable (with an initial value) and changing an existing
variable; this is one of Python's big deficiencies. I care for the
following three reasons:
- I can see immediately whether a piece of code uses assignment; I
avoid assignment in some parts of my code for testability
- the difference between modifying variables declared in a larger
scope and shadowing them with local variables with the same names is
obvious, and there's an obvious way to do both
- misspelled variable names get detected instead of silently producing
incorrect results (or occasional run-time errors)
I do like the way C++ lets you declare a variable in the scope "the
rest of the block", and I think the Scheme/Lisp/C way leads to
unreadably-indented code or variables being declared too far from
their use.
Syntax
------
There is another argument for having syntax, other than that it makes
programs shorter: it can make programs more readable (in the sense
that source code generally is more readable than bytecode, or
mathematical formulas are generally more readable than English
sentences describing the same formula). In other words, it makes the
language more accessible to anyone, good programmers included. I
don't think of decompiling bytecode as "dumbing down".
Unicode offers a plethora of punctuation; perhaps tasteful and
restrained use of some of this punctuation could make the syntax more
readable than any possible ASCII syntax. For example, I'd be inclined
to write infix bitwise Boolean operators with real Boolean operators
instead of ASCII stand-ins, and I'd be inclined to declare variables
with some conspicuous piece of punctuation. (In ASCII, I'd probably
prefer := if it didn't already have so many closely-related meanings
in other languages (Pascal, GNU make).)
Implicit progn
--------------
Graham is right on the mark here; eliminating implicit progn means we
can eliminate many of the uglier and more verbose pieces of Lisp
syntax.
(Implicit progn seems to have crept into the 'to' syntax, though; one
of Graham's examples is (to x 5 (sum x) (pr x)).)
Pattern matching
----------------
Destructuring arguments are very good. I'd like to have full ML-style
pattern matching, though; it's possible to write that as a macro, but
I'd like it to be part of the language.
Recursion on strings
--------------------
Recursing on strings can be very efficient if the strings are allowed
to share structure the way lists do and your compiler is smart enough
to reuse storage that can be statically proven garbage. (Or even if
it can allocate it on the stack.)
Canonical symbol case
---------------------
The "Arc at 3 Weeks" paper's examples show symbols being canonicalized
into upper case. I don't like this; upper-case is hard to read. I
assume it's an artifact of the first Arc implementation running in an
existing Common Lisp system.
Unicode makes canonical case impractical, anyway, because case-mapping
is very complicated.
Classes
-------
The proposed overloading semantics won't give an extensible numeric
system.
My tastes in object systems appear to be markedly different from
Graham's; mine are largely founded in experience with Python. So the
object system I want is probably not something he wants:
Classes, like compounds, should be called to instantiate objects; this
makes the calling code shorter, and also allows you to turn classes
into factory functions and vice versa without changing the calling
code.
There doesn't seem to be a provision for a constructor or methods
(other than overloads of existing functions). There probably should
be.
Class instances should not be callable unless they overload apply, for
the following reasons:
- there's no need to have one syntax for functions the objects
overload and another syntax for getting attributes of the objects.
After I say (= pt (class nil 'x 0 'y 0)) (= p1 (pt)), I should be
able to ask for (x pt). This does have the disadvantage that "class"
must somehow bind x and y in my calling environment if they aren't
already bound; I think that's easily accomplished with a macro.
(This also removes the necessity for quoting x and y.) (Hmm, what if
x is bound to nil? Should we then make nil
callable? Perhaps we should just raise an error if x is bound in the
local environment, and require that it be quoted or otherwise marked
when it's defined as a method in this manner. Quoted attributes
would define functions in your local environment which sent
doesNotUnderstand to the object passed as their argument, and then
overload those functions for this particular object type.)
- the current example (p1 'x) prevents p1 from being able to act like
some other kind of compound.
Most dynamic object systems let you get an object of a class you've
never heard of and invoke the correct methods on it. Doing this with
the object system described above requires that you have some variable
bound to the method "x" defined in "pt" above. I thought this was a
problem at first, but I don't think it is now; presumably if your code
was written without knowledge of that "x", it won't mention "x" (at
least, not meaning that method), and if it was written with knowledge
of that "x", it's presumably because there's an interface somewhere
that defines "x" to mean something. That interface can be defined in
some module somewhere that both "pt" and my calling code import, in a
form like this:
(defmethods 'x 'y)
This has the advantage, not shared by most dynamic object systems,
that we can have many methods named "x" without conflict, and we can
be sure that the "x" we're calling is intended to implement the
interface we expect it to, not some other interface that has a method
called "x".
Like others, I haven't yet found a compelling use for multiple
inheritance, and there are compelling reasons against it.
No doc strings
--------------
No elisp/Python-style documentation strings are mentioned. I want
these.
*** Miles Egan: Generators
The idea is to provide an abstract & extensible iteration interface for
collections. This makes it easy to write code that works with any collection
and to define new collection types that work with existing code. A brief
example:
class Series
  def each
    for i in 1..3
      yield i
    end
  end
end
class Words
  def each
    for i in ["one", "two", "three"]
      yield i
    end
  end
end
Series.new.each { |i| puts i * 2 }
Words.new.each { |i| puts i * 2 }
would output:
2
4
6
oneone
twotwo
threethree
I can define a new class that provides an "each" method that returns lines
from a file, rows from a database, or widgets from a gui and yield values to
the same code block. Alternatively, I can supply arbitrary code blocks to my
object's "each" function without coupling the code in the code blocks with the
collection's storage implementation. The crucial feature is the "yield"
keyword. I don't know of an equally abstract, general, and programmable
counterpart in CL.
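The closest counterpart in a Lisp is probably a higher-order "each" that takes
a closure where Ruby takes a block; a rough Scheme sketch of the same protocol
(what it lacks is exactly the yield-style inversion).
;; Each collection supplies an each procedure that applies f to every element,
;; however the elements are stored.
(define (series-each f)
  (for-each f '(1 2 3)))
(define (words-each f)
  (for-each f '("one" "two" "three")))
(series-each (lambda (i) (display (* i 2)) (newline)))
(words-each  (lambda (i) (display (string-append i i)) (newline)))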
*** Ed Cashin: Things I Want
Things that I was very pleased to hear:
* arc is UNIX aware!
-- languages that acknowledge UNIX are much easier to use for day
to day things as well as for big projects.
-- they're also easier to learn, since you know a lot of the
concepts already (e.g. a socket)
* arc knows that I don't want busy work :)
-- short but meaningful names
-- arc has fewer parentheses when there's no ambiguity
* syntax as abbreviation sounds good. I especially like the way
syntax can visually mimic data structures (see below).
* "it" pronoun is a nice feature
* recognition that "do" in common lisp is hard to read. I like the
loop constructs in arc so far, except I prefer "times" to "to". I
like ruby's syntax:
3.times { print "Ho" }
* "keep" and "sum" are very natural and handy looking.
Things that worried me:
* strings as lists may be nifty, but I want string operations to be
lightning fast at runtime.
The other could be a "string-list" or something.
* when I read that programmers might use indentation to show
structure, it worried me that arc might be whitespace dependent
like python.
I know a lot of good programmers who don't want anything to do
with python because it is whitespace dependent.
* is the backquote syntax an onion or the best compromise? It's a
little weird, but I can't think of anything better.
* the "with"/"let" thing ... more simple would be making let do what
you have with doing and then forget the one-variable form.
* will objects be able to have access controls (e.g. "private")? if
not, is that a problem?
Things that would be exceptionally nice (literally):
* forth-like assembler
* easy to use C libraries
* easy to make standalone binaries
(shared-lib dependency ok, but not external program dependency)
(go figure.)
* non-buggy, non-experimental native thread support
(def do-some-work ()
; ...
)
(push my-thread-list (thread-new do-some-work))
Maybe something like ...
(thread-lock 'balance)
(= balance (* interest balance))
(thread-unlock 'balance)
Ruby has nice threads support, but they are really juggled by a
single-threaded ruby process. Perl's thread support is still
experimental, according to the docs in the latest release.
* internationalization
-- this seems like a hairy thing, especially with LOCALE
environment variables running everywhere, but it might be
something to brag about
-- maybe unicode support or whatever is in vogue now
Things that would be nice (like ruby or perl):
* case-sensitive symbols
I'm not sure whether this is a win or not, but I was kind of
turned off when I first learned that lisp internally thinks of
symbols as uppercase ... it reminded me of crippled MS-DOS based
filesystems.
* syntax for associative data structures
makes for visibly obvious initialization. e.g. ruby:
keytab["UP"] = "^"
keytab = {
"LEFT" => "foo",
"RIGHT" => "bar",
"X" => {
"A" => "x a",
"B" => "x b",
"C" => "x c",
}
}
* excellent regular expression support
-- up-to-date fancy stuff like perl's /(\w+?)fancy!/
-- must run fast
* mixins from ruby are pretty neat
-- in C++ and Lisp, there's multiple inheritance; in Java and
Objective-C there's single inheritance plus a way to say "and I
can also do this and this"
in ruby there's a way to "mix in" code too
http://www.rubycentral.com/book/tut_modules.html
* no dependency on source file names or locations
in ruby and lisp namespace/module issues are not related to source
code files, which is very convenient, especially compared to the
way Java does public classes and package hierarchies based on
filename and file location.
* adding to the definition of a class after defining it
* a way to specify a multi-line section of text in the source
both perl and ruby have here documents, where interpolation can be
turned on and off. It's very readable and very convenient.
* a searchable, browsable web interface to a plethora of arc code
comparable to CPAN
* a way to do multiple assignment naturally and conveniently
e.g., ruby:
a, b = 14.divmod 4
Some ideas:
* Objective-C's method syntax
I don't have time to elaborate, but it strikes me as the most
readable, self-documenting syntax with the exception of keyword
parameters like in lisp.
* you could sell the way lisp (and so I'm guessing arc) encompasses
the OO model of classical OO languages while providing more.
you discuss that in _ANSI Common Lisp_, but I need to reread that
chapter, to be honest
*** Mikkel Fahnøe Jørgensen: Strings
You mention string handling will be important - I agree.
Please do not make the mistake of ignoring multiple character sets.
You mention strings may contain non-printable characters and even
character-generating objects.
I think this is a fine concept, and it may go well with multiple
character sets.
I had a discussion with Matz who designed the Ruby language. Currently Ruby
only supports 8bit characters with a certain level of UTF-8 support.
I argued that he should just make it Unicode.
But it turns out that a number of popular Asian scripts (language encodings)
do not fit well with Unicode.
Also, Unicode is not really fixed width, hence the 2-byte representation
would sooner or later prove to be incomplete.
Matz is currently working on internationalization, but I believe his solution
would basically be to have a string be a character string with additional type
info that stores the encoding. This simple idea is very powerful once you
think about it.
Conversion of data from old applications to new Unicode character sets
becomes much easier because the string carried around will tell you what it
is, allowing you to do the proper conversion.
String concatenation will require a conversion of at least one of the
strings if the types don't match. This is like ordinary type coercion.
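A sketch of the "string = bytes plus an encoding tag" idea in Scheme; the
convert step below is hypothetical (an identity here) and is the only place the
encodings would actually matter.
(define-record-type estring
  (make-estring encoding bytes)
  estring?
  (encoding estring-encoding)
  (bytes estring-bytes))
(define (convert s target-encoding)
  ;; hypothetical re-encoding step; identity here for illustration
  (make-estring target-encoding (estring-bytes s)))
(define (estring-append a b)
  (let ((enc (estring-encoding a)))
    (if (eq? enc (estring-encoding b))
        (make-estring enc (bytevector-append (estring-bytes a)
                                             (estring-bytes b)))
        (estring-append a (convert b enc)))))
(define a (make-estring 'utf-8 (string->utf8 "foo")))
(define b (make-estring 'latin-1 (bytevector 98 97 114)))
(estring-append a b)   ; b is coerced to a's encoding, then appended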
Handling strings as typed instances of an underlying binary stream is a good
solution.
Different representations can be viewed like different numerical
representations (float, double, integer).
In this sense the concept of a string as a list works well with multibyte
representations, where you cannot easily index a particular character without
traversing the entire string.
Originally I thought multibyte charactersets were outdated, but now I'm
convinced that we just have to be better at dealing with them.
*** Miles Egan: Bytecode
I agree that the performance limitations of bytecode are a big drawback,
although I don't think bytecode-compiled languages need necessarily be as slow
as Python or Java. Ocaml, for example, compiles to bytecode or native code and
Ocaml bytecode is often quite efficient. The big advantage of bytecode systems,
of course, is that you can transparently mix code written in different
languages. This could save you the trouble of writing all those boring but
essential support libraries - HTML and XML parsers, database access libraries,
image processing tools, network protocol libraries etc. I think that it's
really all the high-quality CPAN modules out there that keep Perl going at this
point.
I'm not any kind of expert on .NET. As I understand it, it consists of a
common bytecode runtime environment (the CLR), the C# programming language,
support libraries for system, network, and web services, and identification and
authentication protocols. Microsoft intends it to be the infrastructure for
their next-generation web services and the whole environment has been designed
with that in mind. Several prominent free software groups, most notably the
Ximian folks, have announced plans to build open-source implementations of .NET.
You can read about Ximian's plans here: http://www.go-mono.com/. I mention this
to you because it sounds to me like they're trying to solve some of the same
problems. They plan to implement their own runtime and C# compiler, but I think
a better language that could transparently access all the library code the
environment provides would be very appealing to a lot of open-source web hackers
and a lot easier to implement than a completely from-scratch language and web
development environment. There are drawbacks too, of course, but I think there
are some intriguing possibilities.
*** Scott Draves: Zope
zope is the closest thing to a lisp machine that i've
ever seen, but it runs as a web server, and instead of
lisp you have python, which is lisp without parens or
macros. there is now a layer on top of zope that
adds an asp toolkit oriented towards
corporate communications and publishing. the company
does consulting using the tools they support as
open source.
*** Matthew O'Connor: type-narrowing
I have a BIRD and a PLANE that each inherit from FLYING_THING.
Now some part of my code knows about FLYING_THINGs and in
there I have a list of them. I pass this list of FLYING_THINGs to
another part of my code that knows about PLANEs. In this section
of code I have to be able to take a FLYING_THING and cast
(type narrow) it to PLANE if it is indeed a PLANE.
This is something OCaml doesn't seem to let me do, while all of the
other OO languages I know allow it.
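For what it's worth, in a dynamically typed Lisp the narrowing is just a
runtime type check; a plain Common Lisp sketch of the BIRD/PLANE example:
  (defclass flying-thing () ())
  (defclass bird  (flying-thing) ())
  (defclass plane (flying-thing) ())
  (defun planes-only (flying-things)
    ;; keep only the PLANEs from a mixed list of FLYING-THINGs
    (remove-if-not (lambda (x) (typep x 'plane)) flying-things))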
*** Simon Britnell: running remote programs from command line
The idea is primarily that it would be easy to publish (small)
applications by placing a directory tree on an http server.
Presumably there will be some facility to load files, libraries, etc in
arc. I think that when a program is run from a url, the directory base
in the URL should effectively become the cwd for that session.
So that when loaded from http://foo.com/bar.arc:
load "baz.arc" ; reads and executes http://foo.com/baz.arc
and (excuse my poor lisp/scheme vocabulary here, I actually use scheme
very little):
open-input-port "blargh.txt" ; opens http://foo.com/blargh.txt for input
It would also be nice to be able to specify files to be loaded over http
in the arc source:
load "http://libvendor.com/graphics/charts.arc"
In this example, files referred to at the top level of charts.arc will
also be loaded from http://libvendor.com/graphics. Functions defined in
charts.arc should not necessarily load from
http://libvendor.com/graphics, but rather from the cwd at the point where
they are called.
In short: load should set the cwd within the scope of the file it's
executing to the directory that file was loaded from. It would be nice
to be able to get the directory the actual source was loaded from inside
a function too, I just don't think it should be the cwd at that point.
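One way that might look, sketched in plain Common Lisp rather than Arc:
*LOAD-BASE*, URL-LOAD, and FETCH-SOURCE are invented names, and FETCH-SOURCE
is a stub where a real version would do an HTTP GET or read a local file.
  (defvar *load-base* ""
    "Directory or URL prefix that relative names resolve against.")
  (defun resolve-name (name)
    ;; Absolute names (with a scheme or a leading /) pass through;
    ;; everything else is taken relative to *load-base*.
    (if (or (search "://" name)
            (and (plusp (length name)) (char= (char name 0) #\/)))
        name
        (concatenate 'string *load-base* name)))
  (defun base-of (full-name)
    ;; Everything up to and including the last slash.
    (let ((slash (position #\/ full-name :from-end t)))
      (if slash (subseq full-name 0 (1+ slash)) "")))
  (defun fetch-source (full-name)
    ;; Stub: a real version would fetch full-name over http or from disk
    ;; and return the forms it contains.
    (declare (ignore full-name))
    '())
  (defun url-load (name)
    ;; While name's forms run, *load-base* is name's own directory, so
    ;; nested loads and opens resolve relative to it.
    (let* ((full (resolve-name name))
           (*load-base* (base-of full)))
      (mapc #'eval (fetch-source full))))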
I think at this point a security modified version of arc becomes
important so that you can run untrusted code with confidence. OTOH,
perhaps such confidence is better acquired via chroot or some such. I
was also thinking about caching frequently used http fetched files
locally, but that's what a caching proxy is for.
The comment about having a low-level (i.e. framebuffer) graphics library
for easy integration into a web browser was because:
a) A low-level graphics interface is important anyway for various
applications.
b) If you can already load files over http easily and your display
interface is a framebuffer, integration into a browser is relatively
low-hanging fruit (replace the url fetching primitive and the framebuffer
primitive and you're done) and may appeal to those who (like me) disdain
java applets as too big and clunky.
> If so there would have to be something on the client too. What?
I was thinking of having some kind of library macro that would allow you
to write the code to be executed on the server in the same source as the
code to be executed on the client. The code would be initiated on the
server and a client macro would surround the bits to transmit to the
client. I need to think more about this area however as it would be
nice for the client to be able to return values. The original thought
was that the code:
(some-server-stuff foo)
(more-server-stuff bar)
(client (some-fn param (server (baz param)) blah))
(continue-with-server-stuff)
Would execute the stuff enclosed in (client ) on the client simply by
emitting it as the text of an http response, and that the (server ) stuff
would be evaluated into a value by the server before being sent to the
client to play with. I don't think this is quite adequate, however, as
(client ) doesn't return a value and http would be an obnoxious protocol
to make it work over. Actually, an http post containing a session id
for state management and some arc to execute would almost do the job for
client returns. You'd definitely want some kind of security
restrictions on what client returned code could do.
;; Server code for input from a client side form
(update-person (client (input-form "my.form")))
Likewise, it would be nice for client code to be able to send ad-hoc
requests to the server:
(server (get-tax-code (client zip-code)))
I need to think about this some more. I've almost got it figured out.
*** Peter Armstrong: J
There's a quick way for Lispers to get the feel of the J
language. John Howland has built a course around a set
of programs parallel-coded in J and Scheme! Try --
http://www.cs.trinity.edu/About/The_Courses/cs301/
and scroll to the bottom for source listings.
I should mention also Howland's language Aprol, an attempt
to integrate the functionalities of J & Scheme. While the
project itself may be inactive, you might be interested in
his perspective on such attempts (or even on the logistics
of simply calling J from Arc -- which I would find cool!).
*** Henrik Ekelund: strings with linebreaks
Here is a simple but very good idea taken from Python.
In python you can express string literals in more than one way.
Short strings can be enclosed by either single quotes OR double quotes.
Strings can span more than one line if they are enclosed by three quotes
(single or double).
It gives very readable code:
sqlstring=""" SELECT * FROM TABLE
WHERE field='VALUE' AND
otherfield='Something' """
xmlString = ""
cplusplusString = '#DEFINE SOMECONSTANT "Enclosed in double quotes" '
*** Sudhir Shenoy: ideas from Perl
1) Please don't use parentheses for s-expressions ('[' and ']' or
even '{' and '}' would be preferable). The biggest complaint I have
about parentheses is that they make infix mathematics (which you are
planning to introduce in Arc) really hard to read. e.g. (def interest
(x y) (expt (-r * (t2 - t1) / (days-in-year)))) reads really badly
when compared to [def interest (x y) [expt (-r * (t2 - t1) /
[days-in-year])]]. The overloading of meanings of parentheses
(grouping + s-expression), which isn't a problem in Lisp, may cause a
loss of readability in Arc.
2) An idea you may want to look at (from Perl) is how the AUTOLOAD
function works. When a function is not found in a module, the
AUTOLOAD function is called with the name of the function that the
caller attempted to execute. You can write code that will (depending on
the context) return the name of the correct function.
We use this routinely to provide accessors (set/get methods) on Perl
class (hash) objects. Another useful application is when you have a
container class/struct that holds specialised objects and you don't
want to duplicate the interface of the contained objects in the
container. Writing a suitable AUTOLOAD function would simply redirect
a call to the non-existent method in the container to a real function
in the appropriate child object ...
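A rough Common Lisp analogue of that container case, using the standard
NO-APPLICABLE-METHOD hook to forward an unknown operation to the contained
object; CONTAINER, CHILD-OF, and DESCRIBE-IT are invented names:
  (defclass container ()
    ((child :initarg :child :accessor child-of)))
  (defgeneric describe-it (thing)
    (:method ((s string)) (format nil "a string: ~a" s)))
  (defmethod no-applicable-method ((gf (eql #'describe-it)) &rest args)
    ;; no method for the container itself, so retry on its child
    (describe-it (child-of (first args))))
  ;; (describe-it (make-instance 'container :child "hello"))
  ;;   => "a string: hello"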
3) Again, from Perl, interpolation of objects inside strings is
incredibly useful. Perl makes a distinction between single quoted
strings (no interpolation) and double quoted strings (which have
interpolation). Although the parser and compiler have to look into
strings to see whether interpolation is to be performed, the end
result is code that is easy to read and understand.
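For comparison, Common Lisp usually answers this with FORMAT rather than
interpolation; the Perl-style version lets the parser find the variables
inside the string itself:
  ;; without interpolation the holes are written explicitly:
  (let ((name "world") (n 3))
    (format nil "hello ~a, you have ~d messages" name n))
  ;; with Perl-style double-quote interpolation the same string would
  ;; read "hello $name, you have $n messages".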
*** Ben Yogman: calling
Calling pattern:
- There should be requirable keyword arguments.
Keywords just specify roles, right? So then not having keyword arguments
implicitly assumes that roles are naturally evident by function name and
argument position. This goes awry in practice even with just two arguments,
and the more arguments you have the worse the combinatorics. Requiring
arguments in a certain order is generally an icky thing people do to make
function calls cheaper, but the info to swizzle argument lists and the lists
themselves will in practice be in place at compile time, as generally you
funcall with one or two argument functions that don't need keyword help to
disambiguate. By specifying roles, the function calls become not only
easier to write, but also better documenting once written.
- Optional boolean keyword arguments should go directly into function calls
and result in a binding of true in the function body.
Again, this makes the calls a quicker and more natural read... just say
:verbose, :pretty, :coerce, whatever, or don't say it. There are many other
examples; the only issue is what special syntax marks this in the function
definition (see the sketch after this list of points).
- The simple OO system's defined methods ought to have an (obj.method
...) calling form, rather than having the object as the first argument.
Like with the previous point, this is to enhance readability... putting the
subject first in the expression followed by the verb mirrors a large class
of human languages.
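For reference, a sketch of how far Common Lisp's existing keyword machinery
already goes (RENDER is an invented example): roles are named at the call
site, a keyword can be made effectively required by signalling an error in
its default, and a boolean option reads almost like a bare flag, though CL
still makes you write the t.
  (defun render (shape &key (width  (error "width is required"))
                            (height (error "height is required"))
                            verbose)
    ;; width and height must be supplied by name; verbose defaults to nil
    (when verbose
      (format t "rendering ~a at ~ax~a~%" shape width height))
    (list shape width height))
  ;; (render 'box :height 3 :width 4 :verbose t)   ; argument order is free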
Function meaning:
- There should be operator overloading
It's damn useful. I'm not saying string concatenation should be +
necessarily (though maybe... having a small minimal working vocabulary
really helps the ramping time), but let's say I'm doing something
commutative... adding two vectors, or matrices, or polynomials... you get
the idea.
- There should be MI (multiple inheritance), and it should look a lot like
trying to use possibly conflicting packages together.
You have the same problems and you only have to solve them once, so why not
use the solution twice? I myself have not used MI much, but I have a little,
and it really is the right tool sometimes.
Library stuff:
- There should be a reader macro predefined for converting infix, reverse
polish, whatever... math, logical, and assignment ops to prefix complete
with a transparent operator precedence table.
Why scare away people when you don't have to? Why make people retranslate
formulas? Also, maybe people are used to HP calculators... dunno.
- Supply even the really primitive primitives, like bit shifting (see the
small example after this list).
Even if you don't have an implementation now that takes direct advantage,
you could at least infer a more specific type than with /2 /4 /4*N, and you
leave the door open for blazing speeds.
- Go through CLR and design patterns and have something ready-made to
describe what people know from the canon.
Read, find the corresponding algorithm and/or design pattern... fill in gaps.
Macros can kick some serious butt here, especially with the design patterns.
I'd be very happy to contribute.
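On the bit-shifting point, these primitives are cheap to expose; Common Lisp
already has them, for example:
  (ash 1 10)              ; => 1024   shift left
  (ash 1024 -3)           ; => 128    shift right
  (logand #b1100 #b1010)  ; => 8      bitwise and
  (logior #b1100 #b1010)  ; => 14     bitwise or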
Random:
- There ought to be a reasonable attempt to deduce types, and part of the
information that comes back from the REPL should be best-guess types.
That info is directly useful, though at some point, possibly early, it
could go a little stale through layers of indirection. Even writing
algorithm-first, continuing to declare types selectively could still
help, and it certainly won't hurt speed. Also, though the profiler is the
ultimate arbiter, you often know at least some of the hot spots in your code
before coding them, so you can spot possible bottlenecks.
*** Ravi Mohan: refactoring, vm
1) Built-in refactoring, introspection, and programming IDE
support.
I don't know how refactoring would work in functional
languages, but the refactoring browser works great in
Smalltalk. I guess what I am trying to say is that
the language should be highly introspective (which
Lisp is) AND the language release should come with a
few tools (like the refactoring browser). These don't need
to be visual candy; the interface can be simple, but I
suggest that the tools would improve from version to
version and have a positive feedback effect on the
development of libraries etc. as they improve. This is
one of the great features of Smalltalk: one has a lot
of browsers AND one can define one's own. Of course the
problem with Smalltalk is that it is too tightly bound
to its IDE. With Arc this should be an optional add-on,
so that one could use Arc without the refactoring browser,
IDE, etc., and could for example use Emacs.
I would once again emphasize the need for the
refactoring browser, because this is one tool that would
"build in" the language's idioms and best practices, and
with the refactoring browser itself being extensible...
2)A "standard" VM with excellent documentation.
Ideally teh VM should be written i ARC with , perhaps
the lowest layer in C for performance. This idea comes
from Squeak as well, where the "one language from top
to bottom" approach makes it easy to improve the VM
etc without switcjing to C.also there needs to be a
way of adding new C primitives to the VM (again with
mnested name scopes - see my 1 line suggesy\tion in
my last letter so one could have a scope like
vm.primitives.posix or vm.primitives.networking
3) XML throughout for documentation, dataflows, etc. (I
guess this could be optional), but this is just a
suggestion.
4) Start a mailing group (perhaps on Yahoo) to collect
these suggestions? This would enable folks to respond
to other people's suggestions and quickly build a
community of interested folks.
*** Steven H. Rogers: Overloading +
I'm glad that you decided not to use + to catenate sequences. A better
use of overloading + would be to apply + to individual elements a la APL,
e.g. (+ (2 3) (4 5)) evaluates to (6 8). If you're implementing strings
as lists of characters, this could make for efficient string
manipulation that goes beyond regular expressions.
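For what it's worth, the element-wise version is a one-liner over lists; a
plain Common Lisp sketch (V+ is an invented name):
  (defun v+ (&rest lists)
    ;; element-wise addition a la APL; stops at the shortest list
    (apply #'mapcar #'+ lists))
  ;; (v+ '(2 3) '(4 5))  => (6 8)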