The Evolution of Emergent Computation: Crutchfield T

Download as pdf or txt
Download as pdf or txt
You are on page 1of 5

Proc. Natl. Acad. Sci.

USA
Vol. 92, pp. 10742-10746, November 1995
Computer Sciences

The evolution of emergent computation


JAMES P. CRUTCHFIELD*t AND MELANIE MITCHELLI
*Physics Department, University of California, Berkeley, CA 94720; and tSanta Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501

Communicated by Murray Gell-Mann, Santa Fe Institute, Santa Fe, NM, August 15, 1995

ABSTRACT A simple evolutionary process can discover sitates global information processing; and (iii) an idealized
sophisticated methods for emergent information processing in computational model of evolution.
decentralized spatially extended systems. The mechanisms One of the simplest systems in which emergent computation
underlying the resulting emergent computation are explicated can be studied is a one-dimensional binary-state cellular
by a technique for analyzing particle-based logic embedded in automaton (CA) (8)-a one-dimensional spatial lattice of N
pattern-forming systems. Understanding how globally coor- identical two-state machines ("cells"), each of which changes
dinated computation can emerge in evolution is relevant both its state as a function only of the current states in a local
for the scientific understanding of natural information pro- neighborhood of radius r. The lattice starts out with an initial
cessing and for engineering new forms of parallel computing configuration (IC) of N cell states (Os and ls). This configu-
systems. ration changes in discrete time steps according to the CA
"rule"-a look-up table mapping neighborhood state config-
Many systems in nature exhibit sophisticated collective infor- urations to update states. At each time step, all cells examine
mation-processing abilities that emerge from the individual their local neighborhoods (subject to specified boundary con-
actions of simple components interacting via restricted com- ditions), consult the look-up table, and update their states
munication pathways. Some often-cited examples include ef- simultaneously. The CA's radius places an upper boundary on
ficient foraging and intricate nest-building in insect societies the speed of information transmission through the lattice. It
(1), the spontaneous aggregation of a reproductive multicel- also limits the sophistication of the local dynamics: the number
lular organism from individual amoeba in the life cycle of the of look-up table entries is 22r'+1. Thus, fixing r << N constrains
Dictyostelium slime mold (2), the parallel and distributed the sophistication of a CA's explicit information processing.
processing of sensory information by assemblies of neurons in A simple-to-define computational task for CAs that requires
the brain (3), and the optimal pricing of goods in an economy global information processing is deciding whether or not the IC
arising from agents obeying local rules of commerce (4). contains more than half ls. We call this the Pc = 1/2 task, with
Allowing global coordination to emerge from a decentralized Pc denoting a threshold density of ls in the input. If po denotes
collection of simple components has important advantages the density of ls in the IC, the desired behavior is for all cells
over explicit central control in both natural and human- to quickly change to state 1 if po > Pc and to quickly change to
constructed information-processing systems. There are sub- state 0 if po < Pc. The Pc = 1/2 task requires global commu-
stantial costs incurred in having centralized coordination, not nication, since po is a global property of the entire lattice; no
the least being (i) speed (a central coordinator can be a linear combination of local computations-such as the cells
bottleneck to fast information processing), (ii) robustness (if computing the majority of ls in their neighborhood-can solve
the central coordinator is injured or lost, the entire system this problem. Designing an algorithm to perform the Pc = 1/2
collapses), and (iii) equitable resource allocation (a central task is trivial for systems with a central controller of some kind,
controller must be allocated a lion's share of system resources such as a standard computer with a counter register or a neural
that otherwise could go to other agents in the system) (e.g., see network with global connectivity. But it is difficult to design a
ref. 5). However, it is difficult to design a collection of decentralized, spatially extended system such as a CA to
individual components and their local interactions in a way perform this task, since there is no central counter or global
that will give rise to useful global information processing. It is communication built in. It can be shown that no finite-radius
not well understood how such apparent complex global coor- CA can perform this task perfectly across all lattice sizes (9,
dination emerges from simple individual actions in natural 10), but even to perform this task well for a fixed lattice size
systems or how such systems are produced by biological requires more powerful computation than can be performed
evolution. This paper reports the application of new methods by a single cell or any linear combination of cells. Since the ls
for detecting computation in nonlinear processes to a simple can be distributed throughout the CA lattice, the CA must
evolutionary model that allows us to address these questions transfer information over large space-time distances (-N),
directly. The main result is the evolutionary discovery of and information from distant parts of the lattice must interact
methods for emergent global computation in a spatially dis- so as to perform the computation. With r << N, such infor-
tributed system consisting of locally interacting processors. mation transmission and interaction can be accomplished only
We use the general term "emergent computation" to de- through the coordination of emergent high-level signals. Thus,
scribe the appearance of global information processing in such this task is well suited for investigating the ability of an
systems (see refs. 6 and 7). Our goal is to understand the evolutionary process to design CAs with sophisticated emer-
mechanisms by which evolution can discover methods of gent computational abilities.
emergent computation. We are studying this question in a One class of computational models of evolution are genetic
theoretical framework that, while simplified, still captures the algorithms (GAs) (11), which evolve a population of candidate
essence of the phenomena of interest. This framework requires solutions to an optimization problem by propagating the most
(i) an idealized class of decentralized system in which global "fit" candidates to the next generation via genetic modifica-
information processing can arise from the actions of simple, tions. We carried out a set of experiments in which a GA was
locally connected units; (ii) a computational task that neces- used to evolve one-dimensional binary-state r = 3 CAs (with
The publication costs of this article were defrayed in part by page charge Abbreviations: CA, cellular automaton; IC, initial configuration;
payment. This article must therefore be hereby marked "advertisement" in GA, genetic algorithm.
accordance with 18 U.S.C. §1734 solely to indicate this fact. tTo whom reprint requests should be addressed.

10742
Computer Sciences: Crutchfield and Mitchell Proc. Natl. Acad. Sci. USA 92 (1995) 10743

spatially periodic boundary conditions) to perform the Pc = scale well with lattice size on the pc = 1/2 task. This is shown
1/2 task. This GA, while highly idealized, contained the in Table 1 for a typical block-expanding rule discovered
4)exp

rudiments of selection and variation: crossover and mutation by the GA.


worked on the genotype (the 128-bit string encoding the CA A major impediment for the GA was an early breaking of
look-up table), whereas selection was according to the fitiwss symmetries in the Pc = 1/2 task for short-term gain in fitness
of the phenotype (the CA's spatiotemporal behavior on JiV by specialization for high or low density (9, 12). This and other
= 149 cell lattice). The GA started out with an initial popu- impediments seemed to indicate that this evolutionary system
lation of 100 strings ("rules") randomly generated with a was incapable of discovering higher performance CAs. How-
uniform distribution over the fraction of ls in the string. The ever, we subsequently discovered that in 7 of 300 runs, the GA
"fitness" of each rule was computed by iterating the corre- evolved significantly more sophisticated methods of emergent
sponding CA on 100 randomly chosen ICs uniformly distrib- computation. Again, the evolution proceeded via a series of
uted over po E [0, 1], half with po < pc (correct classification: epochs connected by distinct computational innovations. (A
all Os) and half with po > Pc (correct classification: all ls), and detailed analysis of the evolutionary history will be presented
by recording the fraction of correct classifications performed elsewhere.) 9PN,1o4(4O) values for three values of N are shown in
in a maximum of slightly more than 2N time steps. The fittest Table 1 for the best rules (011102, 417083, 4)loo) in three of these
strings in the population were selected to survive and were runs. The higher 9PN,104(o) values and the improved scaling
randomly paired to produce offspring by crossover, with each with increasing N indicates a new level of computational
bit in the offspring subject to a small probability of mutation. sophistication above that of the block-expanding rules. Also
This process was iterated for 100 "generations"-a "run"- given for comparison are two human-designed CAs: 4)maj
with fitnesses estimated from a new set of ICs at each computes the local majority of ls in the neighborhood and,
generation. Three hundred runs were performed starting with since it maps almost all configurations to small stationary
different random-number seeds. Details and justification of blocks of ls and Os, has 91N,104(0) = 0.000 for all N; GKL, one
the experimental procedure have been given in ref. 9. of the best performing rules known, has the highest perfor-
As reported previously (9, 12), the GA evolution proceeded mance listed, though it was constructed not for the Pc = 1/2
through a succession of computationally distinct epochs. On task but for a study of ergodicity and reliable computation in
most runs the end result was one of two computational CAs (13). Space-time diagrams illustrating the behavior of
strategies: settle to a fixed point of all Os (is), unless there is 4)17083 and 4iooare given in Fig. 1. The space-time behavior of
a sufficiently large block of adjacent or almost adjacent ls (Os) 41oo is remarkably similar to that of GKL (see ref. 12). Its lower
in the IC; if so, expand that block. These strategies rely on the performance arises from factors such as asymmetries in the
presence or absence of blocks as predictors of p0. They do not rule table.
count as sophisticated examples of emergent computation in How are we to understand the emergent computation these
CAs: all of the computation is done locally in identifying and more successful CAs are performing? In previous work (14-
then expanding a sufficiently large (- 2r + 1) block. After each 16), we developed automated methods for discovering com-
run we computed a measure of the quality of the best rules in putational structures embedded in space-time behavior. Like
the final generation: the "unbiased performance" 9P149,104(4), many spatially extended natural processes, cellular automata
which is the fraction of correct classifications performed by configurations often organize over time into spatial regions
rule within approximately 2N time steps with N = 149 over that are dynamically homogeneous. Typically, the discovery of
104 ICs randomly chosen from an unbiased distribution over p. the underlying regularities requires automated inference meth-
The unbiased distribution meant that most ICs had pO 1/2. ods. Sometimes, though (e.g., Fig. 1), these regions are obvious
These are the most difficult cases, and thus gPN,104(4) gives a to the eye as "domains": regions in which the same recurring
lower boundary on other measures of a rule's performance. "pattern" appears. To understand this phenomenon and to
The highest measured Q1g49,104(0) for block-expanding rules automate its discovery, the notion of "domain" was formalized
was 0.685 ± 0.004. Performance decreased dramatically for (15) by adapting computation theory to CA dynamics. There,
larger N, since the size of the block to expand and the velocity a domain's "pattern" is described by using the minimal deter-
of expansion was tuned by the GA for N = 149 (see ref. 9). In ministic finite automaton (DFA) (17) that accepts all and only
general, any rule that relies on spatially local properties will not those configurations that appear in the domain. Such domains
Table 1. Measured values of 9PN,104(o) at various N for six different r = 3 rules, the middle four discovered
during different runs of the GA
Rule table
CA (r = 3) Symbol hexadecimal code 9P149,104 9P599,104 9'999,104
Majority Omaj 00010117 01171777
01171777 177f7fff 0.000 0.000 0.000
GA-discovered
Expand 1-blocks 4)exp 05054083 05c90101
200bOefb 94c7cff7 0.652 0.515 0.503
Particle-based 4)11102 10000224 41170231
155f57dd 734bffff 0.742 0.718 0.701
4)17083 03100100 lfaOO013
331f9fff 5975ffff 0.755 0.696 0.670
41oo 05040587 05000f77
03775583 7bffb77f 0.769 0.725 0.714
GKL ()GKL 005fOO5f 005fO05f
005fff5f 005fff5f 0.816 0.766 0.757
For N = 149, the standard deviation is 0.004; it is higher for larger N. 4exp expands blocks of is; 0)maj computes
the majority of is in the neighborhood; all of the other rules implement more sophisticated strategies involving
particle interactions. To recover the 128-bit string giving the CA look-up table outputs, expand each hexadecimal
digit to binary. The neighborhood outputs then are given in lexicographic order starting from neighborhood
0000000 at the leftmost bit in the 128-bit binary string. 4GKL is a rule designed by Gacs-Kurdyumov-Levin (see
ref. 13).
10744 Computer Sciences: Crutchfield and Mitchell Proc. Natl. Acad. Sci. USA 92 (1995)

Time %fff Time

148 148
0 Sitc 148 0 Sitc 148
FIG. 1. Space-time diagrams showing the behavior of two CAs, discovered by the genetic algorithm on different runs, that employ embedded
particles for the nonlocal computation required in density classification. Each space-time diagram plots lattice configuration iterates over a range
of time steps, with is given as black cells and Os as white cells; time increases down the page. Both start with the same initial configuration (po
0.483). (Left) 417083 correctly classifies this low-p configuration by going to a fixed all-Os configuration by time 250 (not shown) after the gray
region dies out. (Right) In contrast, CA 4'oo misclassifies it by going to all is, despite its better average performance.
are called "regular," since their configurations are members of be filtered to remove the domains. The result, shown in Fig. 2,
the regular language recognized by the DFA. More precisely, reveals the particles and their interactions. Table 2 lists the six
a regular domain A is a set of configurations that on an infinite particle interactions that have been identified. The filtering
lattice is temporally invariant, A = +(A), and whose DFA has analysis reveals a particle-based logic that emerges over time
a single recurrent set of states that is strongly connected. and supports the required computational processing-
Regular domains play a key role in organizing both the information storage and propagation over large space-time
dynamical behavior and the information-processing properties of distances, logical operations, and so on-necessary for high
CAs. Once a CA's regular domains have been detected (i.e., that fitness in approximating density classification. Roughly, 017083
level of structure has been understood), nonlinear transducers successively classifies "local" densities with a locality range
(filters) can be constructed to remove them, leaving just the that increases with time. In regions where there is some
deviations from those regularities. The resulting filtered space- ambiguity, signals (in the form of particles) are propagated,
time diagram reveals the propagation of domain "walls." If these indicating that the classification is to be made at a larger scale
walls remain spatially localized over time, then they are called via particle interactions. Two examples of such interactions are
"particles" (16). (We emphasize that such embedded particles are shown in Fig. 2 Left and explained in the legend.
qualitatively different from those exhibited by CAs that have been There are a number of constraints imposed by the "cellular"
hand-designed to perform computations (e.g., see refs. 18-21). nature of CAs that the GA balances in its evolutionary search
Embedded particles are a primary mechanism for carrying in- for high fitness. First, classification of local configurations with
formation (or "signals") over long space-time distances. This ambiguous density must be deferred to later times and larger
information might indicate, for example, the result of some local spatial scales to provide a context in which information is
processing that has occurred at an early time. Logical operations available to disambiguate the local classification. Second,
on the signals are performed when the particles interact. The signals are required in the form of propagating particles, since
collection of domains, domain walls, particles, and particle inter- local operations at later times have to be spatially local:
actions for a CA represents the basic information-processing decisions are made when particles come within an interaction
elements embedded in the CA's behavior-the CA's "intrinsic" range set by the CA radius. Third, the particle interactions
computation. must be built into the look-up table, which adds constraints
CA 417083 of Fig. 1 Left has three domains {A°, Al, A2}, that are nonlocal in the genomic representation and that must
which are given in Table 2. There are five stable particles {a, be compatible with domain stability and particle propagation.
y, 8, q, and ,t} and one unstable "particle" {,B} defined in Fourth, the particles must be stable to preserve information
Table 2 as walls between two domains. Note that, given the CA over space-time. The result is a delicate balance that must be
rule code (Table 1), it can be proved that the domains are maintained by the GA in a CA look-up table that supports
time-invariant sets and that the stable particles are spatially sophisticated particle-based information processing. Given
localized time-invariant sets for the corresponding CA (16). these constraints, which are nonlocal and require specific
With this knowledge, the space-time diagram of Fig. 1 Left can output bit settings in the rule table string, it is striking that the
Table 2. The domains, particles, and particle interactions that support the emergent logic in the CA (017083) shown in Fig. 1 Left
Particle Particle interaction (by type)
Domain Symbol Velocity Graphics Annihilation Decay Reaction
AO°=O* a AAA0
- 1 vy. + - * 0+ n a +6 -*j
Al = 1* f3A0A1 0 0 ,+ -*0 Z + y a
A2 = (10001)* ly A2A0 -2 q1 + a-
8 A0A2 '/2
A2A1 4 /34\8ee
A'A2 3.
(et)* means any number of repetitions of string w. The table includes only those structures that dominate the CA's spatiotemporal behavior. Very
infrequently occurring structures, such as the checkerboard domain A3 = (10)* and the four dislocations within A2 are not listed because they do
not contribute measurably to the CA's classification performance. Under "Particles," the graphic associated with each particle provides a key to
Fig. 2. Note that the structure of each particle's graphic is determined by the nonlinear transducer. 0 denotes spatial configurations without particles.
Computer Sciences: Crutchfield and Mitchell Proc. Natl. Acad. Sci. USA 92 (1995) 10745

°0
Time

148
0 site 148

FIG. 2. Analysis of the emergent logic for density classification in CA 4)17083 (Fig. 1 Left). This CA has three domains, six particles, and six particle
interactions, as noted in Table 2. (Left) This figure gives the same space-time diagram as in Fig. 1 Left except that the domains have been filtered
out by using an 18-state nonlinear transducer constructed as described in ref. 16. The resulting diagram reveals the particle interactions that support
the long-range spatiotemporal correlation for density classification at the associated performance level (Table 1). (Right Upper) The particle
interaction a + 8 --> IL implements the "logic" of mapping a spatial configuration representing high, low, and then ambiguous densities to a
high-density signal A. (Right Lower) Similar detail is shown for the particle interaction ,u + -y -- a that maps a configuration representing high,
ambiguous, and then low density to an ambiguous-density signal a.
GA evolved particle-based computation that performed nearly indicate nature has been confronted by analogous design tasks
as well as the best-performing human-designed CA. and solved them, to date only human-designed CAs have been
The particle-based computation analysis also indicates why used for performing such computations (see refs. 19-21). In
the CAs discovered by the GA, as well as the human-designed contrast to the engineering approach of building particles and
CA, fail to achieve higher 9TN,1o4(4). One reason, of course, is their interactions into CAs, a key tool in our analysis was the
that the emergent logic can be incorrect. Even small errors in ability to detect structures embedded in CA spatiotemporal
the particle velocities or interactions, for example, are com- behavior that support emergent computation.
pounded over time and lead to misclassifications. More im- A simple, but general lesson was learned: when confronted
portantly, at the very earliest iterations, before the CA behav- with constraints, evolutionary processes need to innovate
ior has condensed into configurations consisting only of do- qualitatively new mechanisms that transcend those constraints.
mains and particles, local configurations larger than the The locality of communication in CAs imposes a constraint on
neighborhood size can lead to incorrect positioning and se- communication speed. The GA's innovation was to discover
lection of domains. The ensuing emergent logic operates on CAs that performed information processing over large space-
these errors and, even if it is correct, produces a misclassifi- time distances using particles and their interactions-a wholly
cation. In this way, the analysis methods of refs. 15 and 16 allow new level of behavior that is distinct from the lower level of
us to explain how particular mechanisms in CAs lead to spatial configurations. In this way, our analysis of particle-
increased fitness and so survivability under the GA. based computation demonstrated how complex global coor-
From the perspective of engineering applications, the par- dination can emerge within a collection of simple individual
ticular GA used here was not an efficient automated designer actions. In a complementary fashion, our GA simulations
of particle-based computation, since the rate of production of demonstrated how an evolutionary process, by taking advan-
these CAs is low, though reliable. A primary impediment is the tage of certain nonlinear pattern-forming propensities of CAs,
GA's breaking of symmetries in early generations for short- can produce this new level of behavior through a succession of
term fitness gain. This resulted in the populations' move to innovations that build up the delicate balance necessary for
asymmetric, low-performance block-expanding CAs. Repair- effective emergent computation.
ing the broken symmetries required an unlikely coordinated
change in a large number of look-up table bits. We have We thank Rajarshi Das and Peter Hraber for many contributions to
proposed (9) a number of improvements to the GA, including this project. This research was supported in part at the University of
the design of GA fitness functions and genomic representa- California at Berkeley by Air Force Office of Scientific Research
tions that respect known task symmetries, but we also noted Grant 91-0293 and Office of Naval Research Contract N00014-95-1-
that symmetry-breaking may be a necessary part of some 0524 and at the Santa Fe Institute by National Science Foundation
Grant IRI-9320200, Department of Energy Grant DE-FGO03-
evolutionary pathways. On the subset of runs on which parti- 94ER25231, and the Adaptive Computation and External Faculty
cle-based CAs were evolved, the GA was able to respect the programs.
symmetries necessary for higher performance and better scal-
ing; this result and the success of our analysis of embedded 1. Pasteels,-J. M. & Deneubourg, J. L., eds. (1987) Experientia 54,
computation are encouraging for the prospect of evolving Suppl.
more powerful particle-based computational systems for real- 2. Devreotes, P. (1989) Science 245, 1054-1058.
world tasks. Moreover, in work that will be reported elsewhere, 3. Rumelhart, D. E., Hinton, G. E. & McClelland, J. L. (1986) in
the GA discovered perfectly performing CAs (on a high Parallel Distributed Processing, eds. Rumelhart, D. E., McClel-
fraction of runs) that used particle-based computation on a land, J. L. & the PDP Research Group (MIT Press, Cambridge),
different task: to rapidly achieve stable global synchronization Vol. 1, pp. 45-76.
4. Fama, E. F. (1991) J. Finance 46, 1575-1617.
between local processors. 5. Milgrom, P. & Roberts, J. (1992) Economics, Organization, and
The main result reported here is a simplified evolutionary Management. (Prentice-Hall, Englewood Cliffs, NJ).
process's discovery of methods for emergent global computa- 6. Forrest, S. (1990) Physica D 42, 1-11.
tion in a spatially distributed system consisting of locally 7. Ctutchfield, J. P. (1994) in Complexity: Metaphors, Models, and
interacting processors. Despite numerous phenomena that Reality, Santa Fe Institute Studies in the Sciences of Complexity,
10746 Computer Sciences: Crutchfield and Mitchell Proc. Natl. Acad. Sci. USA 92 (1995)

eds. Cowan, G., Pines, D. & Melzner, D. (Addison-Wesley, 15. Hanson, J. E. & Crutchfield, J. P. (1992) J. Stat. Phys. 66,
Reading, MA), Vol. 19, pp. 479-497. 1415-1462.
8. Wolfram, S. (1984) Nature (London) 311, 419-424. 16. Crutchfield, J. P. & Hanson, J. E. (1993) Physica D 69, 279-301.
9. Mitchell, M., Crutchfield, J. P. & Hraber, P. T. (1994) Physica D 17. Hopcroft, J. E. & Ullman, J. D. (1979) Introduction to Automata
75, 361-391. Theory, Languages, and Computation (Addison-Wesley, Reading,
10. Land, M. & Belew, R. K. (1995) Phys. Rev. Lett. 74, 5148-5150. MA).
11. Holland, J. H. (1992) Adaptation in Natural and Artificial Systems 18. Lindgren, K. & Nordahl, M. G. (1990) Compl. Syst. 4, 299-318.
(MIT Press, Cambridge), 2nd Ed. 19. von Neumann, J. (1966) Theory of Self-Reproducing Automata
12. Mitchell, M., Hraber, P. T. & Crutchfield, J. P. (1993) Compl. (Univ. of Illinois Press, Urbana).
Syst. 7, 89-130. 20. Smith, A. R. (1972) J. Comput. Syst. Sci. 6, 233-253.
13. Gacs, P. (1985) Contemp. Math. 41, 125-134. 21. Steiglitz, K., Kamal, I. & Watson, A. (1988) IEEE Trans. Comput.
14. Crutchfield, J. P. & Young, K. (1989) Phys. Rev. Lett. 63, 105-108. 37 (2), 138-145.

You might also like