A century of generalization
We review data from both ethology and psychology about generalization, that is, how animals respond to sets of stimuli including familiar and novel stimuli. Our main conclusion is that patterns of generalization are largely independent of systematic group (evidence is available for insects, fish, amphibians, reptiles, birds and mammals, including humans), behavioural context (feeding, drinking, courting, etc.), sensory modality (light, sound, etc.) and of whether reaction to stimuli is learned or genetically inherited. These universalities suggest that generalization originates from general properties of nervous systems, and that evolutionary strategies to cope with novelty and variability in stimulation may be limited. Two major shapes of the generalization gradient can be identified, corresponding to two types of stimulus dimensions. When changes in stimulation involve a rearrangement of a constant amount of stimulation on the sense organs, the generalization gradient peaks close to familiar stimuli, and peak responding is not much higher than responding to familiar stimuli. Contrary to what is often claimed, such gradients are better described by Gaussian curves than by exponentials. When the stimulus dimension involves a variation in the intensity of stimulation, the gradient is often monotonic, and responding to some novel stimuli is considerably stronger than responding to familiar stimuli. Lastly, when several or many familiar stimuli are close to each other, predictable biases in responding occur along all studied dimensions. We do not find differences between biases referred to as peak shift and biases referred to as supernormal stimulation. We conclude by discussing theoretical issues.
The study of how external stimuli affect behaviour has been referred to as the theory of stimulus selection in ethology and stimulus control in experimental psychology, and has played a key role in both disciplines during the 20th century. A key finding of such research is generalization: if a behaviour has been established in response to a stimulus, novel stimuli resembling the first one will usually elicit the same response. Usually, modified stimuli are less effective than familiar ones, but sometimes they are even more potent in evoking the response. This finding has been referred to as ‘supernormal stimulation’ by ethologists, ‘peak shift’ by psychologists, and more recently ‘response bias’ (see, respectively, Tinbergen 1951; Mackintosh 1974; Enquist & Arak 1998). Interest in theories of generalization seems to have faded in recent years, although our understanding is still unsatisfactory (Mackintosh 1974; Ghirlanda & Enquist 1999). Behaviour is often ‘explained’ by merely empirical rules of generalization or theories that more or less directly summarize observations (for example by incorporating observed features of generalization into the theory, Hull 1943; Mackintosh 1974). In this review we aim to organize existing data in a way useful to develop and test theories of generalization. We also point out findings that conflict with existing theories, and we conclude with a discussion of theoretical issues. Finally, we hope that this review will be helpful as a guide of what to expect when reactions to novel stimuli are important, for example in experimental design.

APPROACHES TO THE STUDY OF GENERALIZATION

Ethology and Experimental Psychology

Data about generalization come primarily from ethological and psychological studies of behaviour. Within
(S−). The S− may be absence of S+ or a stimulus differing from S+ in characteristics such as visual size or sound frequency. More complex arrangements, with several positive and negative stimuli, have also been used. After training, the animals’ reactions to a set of test stimuli are recorded. These are usually chosen from a ‘stimulus dimension’ obtained by varying a physical variable such as wavelength of light or intensity of sound. Data can thus be represented in the form of a response gradient along the dimension; this is the chief analytical tool of the psychological tradition, from which key concepts such as ‘peak shift’ and ‘stimulus control’ are defined (e.g. Terrace 1966). The test stimuli typically include S+. If S− lies on the test dimension, too, it is customary to speak of an ‘intradimensional’ discrimination, for example when light wavelength generalization is studied after a discrimination between two wavelengths. If S− cannot be placed on the test dimension, the term ‘interdimensional’ discrimination is used, for instance if the test dimension is sound frequency and S− is silence or white noise. The same discrimination can be both intra- and interdimensional depending on the test stimuli chosen. We thus prefer to speak of inter- and intradimensional tests, rather than discriminations.

Whereas psychologists are almost exclusively based in the laboratory, ethologists have mainly studied behaviour in nature. Although animals are seldom trained by the researcher, discriminations are common in the wild as well. For example, to incubate their eggs birds must discriminate between the egg and the nest background; to pursue females male butterflies must discriminate their presence from their absence, and so on. The main research tool in ethology has been the use of dummies resembling natural stimuli but with added, removed or modified features. For instance, the egg retrieval behaviour of the herring gull, Larus argentatus, has been studied using dummy eggs of different sizes, colours and shapes (Baerends 1982).

We can thus summarize both the ethological and the psychological methods as the recording of animals’ reactions to stimulus sets including novel and familiar stimuli. In interpreting results from such experiments, it is important to remember that behaviour is influenced both by individual experiences and by the evolutionary history of the species. Even if the stimuli used in the laboratory have little significance for animals in the wild, pre-existing responses and predispositions can influence behaviour. For instance, novel stimuli can elicit fear. Thus, laboratory experiments only approximate the ideal situation of an empty memory modified by experience with only one or two stimuli, even when ‘naive’ individuals are used. The analysis of natural behaviour is even more complex, both because the animals’ evolutionary history more directly influences behaviour and because experimental control over individual experiences is at most partial.

Analysis of Stimulus Dimensions

In both ethology and comparative psychology, results of generalization tests are most often analysed in terms of objective properties of stimuli. One drawback of this approach is that it ignores how stimuli are received by the sense organs. For instance, focusing on wavelength of light does not explain why ultraviolet light cannot control behaviour in some animals. But if we consider photoreceptors, we discover that some animals have none that react to ultraviolet light. Here we focus on two aspects of sense organ activation patterns. The first is the intensity of stimulation, by which term we mean the total activation of receptors. This is related to physical intensity but is not identical with it; for instance, a sound of 100 kHz does not elicit any activation in human ears regardless of its physical intensity. The second aspect is how a given amount of stimulation is distributed among receptors. We refer to stimulus dimensions along which intensity does not change as ‘rearrangement dimensions’, since different stimuli along such dimensions correspond to a different arrangement of the same amount of stimulation on the sense organs. Table 1 shows how common stimulus dimensions can be classified according to this scheme. The rationale for such a classification is that variation in intensity has significantly different behavioural effects, compared with rearrangement of stimulation (see below).

Table 1. Contribution of rearrangement and intensity of stimulation along common stimulus dimensions

                                          Contribution of
Stimulus dimension                        Rearrangement   Intensity
Intensity of sound                                        L
Intensity of light                                        L
Chemical concentration (smell, taste)                     L
Object size                               L               L
Complex sound spectra                     L               L
Complex light spectra                     L               L
Object shape                              L               S
Tone frequency                            L               S
Monochromatic light                       L               S
Object orientation                        L
Object location                           L

L: Large contribution; S: small contribution, or varying contribution depending on the exact stimuli used; no symbol means negligible contribution.

DATA SELECTION

Although experimental paradigms in generalization research can be summarized succinctly in their main points, countless variations exist. We have tried to include as many studies as possible in our analyses, but we deemed some unsuitable. Our guidelines may be summarized as follows. First, we included no study with fewer than three subjects per treatment, except when we considered the effect of group size. Second, responding should be probed at a sufficient number of stimulus locations to make inferences about gradient shape. When fitting a curve to the data, we required that more than 10 stimulus locations be probed. One exception is the
analysis of response biases, where responding to a single test stimulus may serve to estimate a bias. Third, we have not included studies whose outcome was importantly affected by unusual features of training or testing, apart from when we discuss the specific effects of such features (see especially intensity dimensions below). Fourth, we could not include some studies in quantitative analyses because they reveal generalization indirectly, for instance by studying how reproduction or position in a social hierarchy is affected by signals used in social interactions (e.g. Burley et al. 1982; Burley 1986; Johnson et al. 1993). Nevertheless, these are powerful examples of the biological significance of generalization. Lastly, we do not include temporal generalization, owing to our lack of familiarity with the field and the additional space it would require.

We turn now to reviewing the available data on generalization, considering rearrangement dimensions first, then intensity gradients, and lastly dimensions along which both the amount and the arrangement of stimulation on the sense organs vary. In order not to burden the text with the description, results and data sources of statistical tests, we have collected this information in Appendices 1 and 2. In the following, we refer to tests by their number in Appendix 1. A short summary concludes each major section.

REARRANGEMENT DIMENSIONS

Generalization gradients peaking at or near the positive stimulus are considered the prototypical finding about generalization (Fig. 1). Such gradients have been found along diverse stimulus dimensions such as light wavelength, tone frequency, object orientation and object location (Table 2). Stimuli along these dimensions are best described as corresponding to a rearrangement of stimulation with respect to the S+, without much change in the total activation of sense organs. For instance, all positions and orientations of lines or squares in the (centre of the) visual field give rise to the same amount of stimulation in the eye. This is also true of tones of the same physical intensity and not too different frequencies (e.g. human hearing, Coren et al. 1999). Variation in light wavelength can be classified as a rearrangement dimension as well, that is, total receptor activation is approximately constant over considerable wavelength ranges in many species.

Gradient Shape

Spence (1937), in his pioneering work on stimulus control, used parabolic functions to introduce the concept of a generalization gradient. In later work, he also assumed bell-shaped gradients (Spence 1942). He acknowledged that these choices were purely illustrative, lacking at the time reliable data. Hull (1943), based on Hovland’s (1937) data, incorporated an exponential function into his theory of behaviour. Nowadays researchers mostly hold gradients to be either exponential or Gaussian (e.g. Blough 1975; Shepard 1987; Staddon & Reid 1990; Cheng et al. 1997). We analysed 223 rearrangement gradients and found that Gaussian curves account for about 3% more of the variance in observed data (estimated by r²; Appendix 2 describes the fitting procedure). This difference is small, but significant (Test 1, P<10⁻¹⁰). Both Gaussians and exponentials account for more than 90% of the variance in most cases, and about 25% of the gradients conform better to the exponential shape. Figure 2 illustrates these results.

The experiments we surveyed were designed to give an overall picture of generalization gradients rather than to decide between two specific hypotheses. The region around the peak, most important to discriminate exponentials from Gaussians, is often poorly sampled. Better sampling appears to favour Gaussian fits, as shown in Fig. 3 relative to light wavelength generalization in the pigeon, Columba livia (see for instance the data from Blough 1975 in Fig. 1). Note also that fitting attempts are evaluated only in the light of sampled values. Strictly, we cannot say anything about other values. Yet we do not expect the actual gradient to depart systematically from a good fit. Inspection of Gaussian and exponential fits (e.g. Fig. 1) shows that exponential fits often predict a considerably taller gradient than actually observed (an average of 20% higher, Fig. 4a). Predictions from Gaussian fits, on the other hand, are distributed around the observed values (Fig. 4b). This suggests that Gaussian fits estimate gradient height more accurately. There is instead no difference between predictions of Gaussian and exponential fits about the location of the peak (Test 3, NS).

Finally, we note that gradients that are clearly neither exponential nor Gaussian exist (e.g. the bottom gradients in Fig. 1; Hoffman & Fleshler 1964; Blough 1972).

Gradient Symmetry

Nearly all theories of generalization assume or predict that gradients obtained from interdimensional tests are symmetrical around the S+ (Spence 1937; Hull 1943; Blough 1975; Shepard 1987). However, reproducible asymmetries have been reported. When data from all subjects taking part in an experiment are published, we can test whether they are consistently skewed towards one or the other side of the S+ (see Appendix 2 for details). For instance, individual gradients in Hearst et al. (1964, pigeon, line orientation, S+ = vertical line, S− = no line) are skewed on the side of clockwise rotations (Test 4,
When individual data are not published, we can gather group averages from different studies conducted under similar conditions, and look for a systematic across-study asymmetry. We can thus confirm that the skewed gradient reported by Hearst et al. (1964) has been consistently observed in other studies (Test 4, P<10⁻⁴); responses to anticlockwise tilted lines are on average 92% of responses to clockwise tilted ones (range 76–104%). Similarly, an analysis of studies of light wavelength generalization in pigeons reveals that generalization around S+ = 550 nm is consistently skewed (Test 5, P<10⁻⁶). Wavelengths shorter than 550 nm total an average of 67% of the responses to longer wavelengths
Figure 1. Examples of rearrangement generalization gradients, with Gaussian and exponential fits where appropriate. Sources are given above each graph. [Horizontal axes: Light wavelength (nm); Line orientation (degrees from vertical).]
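
Comparisons of the kind shown in Figure 1 rest on how much of the variance in a gradient each curve accounts for. The following is a minimal sketch of such a comparison, not the fitting procedure of Appendix 2. It assumes a height-scaled Gaussian, height·exp(−(x − peak)²/(2·width²)), a two-sided exponential, height·exp(−|x − peak|/width), and a coarse grid search over peak and width. All function names, grid ranges and data values are hypothetical.

```python
import math

def gaussian(x, peak, width):
    """Unit-height Gaussian centred on `peak`."""
    return math.exp(-((x - peak) ** 2) / (2.0 * width ** 2))

def exponential(x, peak, width):
    """Unit-height two-sided exponential centred on `peak`."""
    return math.exp(-abs(x - peak) / width)

def r_squared(observed, predicted):
    """Proportion of variance in `observed` accounted for by `predicted`."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot

def fit(shape, xs, ys, peaks, widths):
    """Grid search over peak and width; the height is solved by least squares."""
    best = None
    for peak in peaks:
        for width in widths:
            basis = [shape(x, peak, width) for x in xs]
            height = sum(y * b for y, b in zip(ys, basis)) / sum(b * b for b in basis)
            r2 = r_squared(ys, [height * b for b in basis])
            if best is None or r2 > best["r2"]:
                best = {"r2": r2, "height": height, "peak": peak, "width": width}
    return best

# Hypothetical gradient (NOT data from any study): proportion of responses
# at wavelengths sampled every 10 nm around an assumed S+ of 550 nm.
xs = list(range(500, 601, 10))
ys = [0.02, 0.05, 0.12, 0.24, 0.36, 0.40, 0.35, 0.25, 0.11, 0.06, 0.01]

peaks = [540 + 0.5 * i for i in range(41)]   # candidate peaks, 540 to 560 nm
widths = [2 + 0.5 * i for i in range(77)]    # candidate widths, 2 to 40 nm

gauss_fit = fit(gaussian, xs, ys, peaks, widths)
expo_fit = fit(exponential, xs, ys, peaks, widths)
print("Gaussian r2:", round(gauss_fit["r2"], 3))
print("Exponential r2:", round(expo_fit["r2"], 3))
```

With these made-up, bell-shaped values the Gaussian accounts for more of the variance than the exponential. That outcome reflects only how the example was constructed and says nothing about real gradients.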
(range 48–94%). Observations of asymmetrical gradients also exist for humans. For instance, Kalish (1958), Thomas & Mitchell (1962) and Thomas & Bistey (1964) reported marked asymmetries in every one of 12 groups of 20 subjects each generalizing along the dimension of light wavelength. Some gradients even show strong response biases, despite resulting from interdimensional tests (see e.g. the data from Thomas & Bistey 1964 in Fig. 1). Furthermore, Thomas & Bistey (1964) reported increasing asymmetry as more of the test dimension was sampled (symmetrically around the S+), a challenging result for theories of generalization (analysis of variance in the original study: F(4,95) = 11.00, P<0.001).

Obviously, gradient symmetry depends on the scale chosen along a dimension. Symmetry on a linear scale will be destroyed passing to a logarithmic one, and vice versa. Considering the sense organs can help us understand why a gradient is symmetrical on a given scale. For instance, in the case of sound, gradients appear most symmetrical on a logarithmic frequency scale, in keeping with physiological evidence that sound frequency must change exponentially to yield changes in activation patterns of constant magnitude in the ear (Kandel et al. 1991; Coren et al. 1999). For many dimensions de facto standards about which scale to use have emerged that agree with these considerations based on sensory physiology. We return to this issue below.

Response Biases

After training a discrimination between two stimuli differing along the test dimension (intradimensional
Dimension                      Response bias   Species                            Source
Light spectra
  Monochromatic light          No data         Human                              Kalish 1958
                               Yes             Pigeon, Columba livia              Hanson 1959
                               Yes             Goldfish, Carassius auratus        Ames & Yarczower 1965
  Nonmonochromatic light*      Yes             Goldfish                           Ohinata 1978
  Colour of female dummy*      Yes             Glow-worm, Lampyris noctiluca      Schaller & Schwalb 1961
                               Yes             Glow-worm, Phausis splendidula     Schaller & Schwalb 1961
                               No              Butterfly, Argynnis paphia         Magnus 1958
  Colour of egg dummy*         Yes             Herring gull, Larus argentatus     Baerends 1982
Orientation of
  Line                         Yes             Pigeon                             Bloomfield 1967
  Head stripe                  Yes             Fish, Haplochromis burtoni         Heiligenberg et al. 1972
  Rocket picture               Yes             Human children                     Nicholson & Gray 1971
Sound frequency                No data         Goldfish                           Fay 1970
                               Yes             Rat, Rattus r. norvegicus          Brennan & Riccio 1972
                               Yes             Human                              Baron 1973; Galizio 1985
                               Yes             Pigeon                             Klein & Rilling 1974
Location in space              Yes             Pigeon                             Cheng et al. 1997
                               No data         Honeybee, Apis mellifera           Cheng 1999, 2000

For laboratory studies, the column ‘Response bias’ refers to intradimensional tests. See the main text for information about biases in other conditions.
*Nonmonochromatic lights cannot be meaningfully aligned along a single dimension, and intensity effects may be caused by different sensitivity of the receptors to different wavelengths of light.
Figure 2. Comparison of Gaussian and exponential fits with empirical generalization gradients. Points above (below) the diagonal represent empirical gradients better fitted by exponential (Gaussian) curves. [Axes: Variance explained by fitted Gaussian curve (%), horizontal; Variance explained by fitted exponential curve (%), vertical.]

Figure 3. Differences in percentage of variance accounted for between Gaussian and exponential fits to light wavelength generalization data in pigeons, as a function of separation between test stimuli (see Test 2, Appendix 1, for data sources and statistics). Similar results are obtained when considering average separation between test stimuli rather than the most common one. Note that most studies adopt a sampling step of about 10 nm, where a large variation is observed. [Axes: Most common separation between test stimuli (nm), horizontal; Difference in % variance accounted for, vertical. Annotation: r96 = −0.55, P < 10⁻⁹.]

tests), authors have often found response biases (Table 2). That is, there exist stimuli that elicit stronger responding than S+ (Fig. 5a). These stimuli are, almost invariably, located further away from S−. Recall that response biases can also appear in interdimensional tests (e.g. Kalish 1958; Thomas & Bistey 1964), a finding that has received little attention. Sometimes a ‘negative’ bias is also observed, that is, lower responding than to S− to stimuli that are further away from S+. This effect is apparent when S− elicits a considerable number of responses even after discrimination training (Fig. 5b; Stevenson 1966; Wills & Mackintosh 1998). Response biases along rearrangement dimensions also occur in nature, as reported in the ethological literature about ‘supernormal stimuli’ (Tinbergen 1951; Eibl-Eibesfeldt 1975; cf. Table 2 and
Generalization of inherited and learned behaviour,

In field studies it is often difficult to know what experience an animal has had. Under laboratory conditions, however, there are a few well-established facts about the effects of previous experience on biases. A general finding is that both the strength of the bias (maximum responding relative to responding to S+) and the distance of the most effective stimulus from S+ increase when the S+ and S− come closer. In Fig. 6 we show these effects in the pigeon, along the light wavelength dimension. The pattern is the same in all studies where the separation between S+ and S− has been varied. Examples are Hearst (1968, line-tilt, pigeon), Ohinata (1978, wavelength, goldfish), Baron (1973, tone frequency, humans) and Cheng et al. (1997, spatial location, pigeon).

The variation in Fig. 6 reveals that similarity between test stimuli is not the sole determinant of response biases. There are ample indications that training and testing procedures, as well as the characteristics of stimuli used, are important factors (Purtle 1973; Mackintosh 1974). For instance, the so-called ‘errorless’ discrimination training, where the intensity of the S− is increased gradually, does not seem to produce response biases (Terrace 1964, 1966). An example of how testing can affect generalization is the finding that biases tend to recede when the test phase is very long (Crawford et al. 1980; Cheng et al. 1997). This may be the effect of the subjects learning that the test stimuli are not reinforced (see also Blough 1975).

Figure 4. Predictions about gradient height by (a) exponential and (b) Gaussian fits to rearrangement gradients, compared with the observed height. Data sources in Test 1, Appendix 1. [Axes: (a) Observed height/height of fitted exponential, (b) Observed height/height of fitted Gaussian, horizontal; Number of studies, vertical. Annotation in (a): Average = 0.81.]

Figure 5. Examples of response biases along rearrangement dimensions. (a) Hanson’s (1959) classical study. Pigeons were trained to peck a key for food when it was lit with a 550-nm light, but not when the wavelength was 570 nm. In a subsequent generalization test the maximum number of responses occurred at the 540-nm light. (b) A gradient showing response biases both left of S+ and right of S−, from Guttman (1965). He first trained pigeons to peck at all the wavelengths to be tested, then introduced a discrimination training by presenting only S+ (still reinforced) and S− (unreinforced). The following generalization tests reveal that some stimuli to the right of S− are reacted to less than S−. [Axes: (a) Wavelength (nm), (b) Stimulus number, horizontal; Proportion of responses, vertical.]

Conclusions

(1) Generalization gradients along rearrangement dimensions are described better by Gaussian than by exponential functions (Test 1, Fig. 4).
(2) When the rearrangement dimension includes one S+ but no S−, the gradient typically peaks at S+ (but not invariably, Fig. 1), and the gradient is typically symmetrical around S+; however, reproducible asymmetries exist, and may be more common than usually assumed (Tests 4, 5).
(3) When the rearrangement dimension includes both one S+ and one S−, responding is biased: gradients typically peak at a stimulus that is further away from S− than S+ (Fig. 5); and closer S+ and S− produce gradients whose peak is both higher and further away from S+ (Fig. 6).

INTENSITY DIMENSIONS

It has long been noted that, in contrast to rearrangement dimensions, intensity dimensions yield strongly
Figure 6. Dependence of response biases on the difference between S+ and S−. Data come from studies in which pigeons were trained to discriminate between two monochromatic lights (sources in Test 6, Appendix 1). (a) Ratio between peak responding and responding to the positive stimulus. (b) Displacement of gradient peak from the S+. Peak responding, peak position and responding to S+ are estimated from fitted Gaussians. [Axes: Distance between S+ and S− (nm), horizontal; (a) Maximum responding, relative to S+, (b) Peak displacement (nm), vertical. Annotations: (a) r35 = −0.47, P < 0.005; (b) r35 = −0.48, P < 0.005.]

Figure 7. (a) Intensity generalization in dogs conditioned to salivate to a light, whistle or bell (position 0 on the horizontal axis). Responding is expressed in terms of proportion of responses relative to the S+. Data from Razran (1949), summarizing over 250 studies from Pavlov’s laboratory. (b) Intensity generalization in rats following training with two noise intensities, S1 = 78 dB and S2 = 87 dB. The gradient is reversed when the weaker stimulus is the positive one (data from Huff et al. 1975). [Axes: (a) Distance from S+ (arbitrary units), horizontal; Proportion of responses (S+ = 1), vertical. (b) Intensity of noise (dB), horizontal; Percentage of responses, vertical.]
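
The two quantities plotted in Figure 6 follow directly from a fitted Gaussian. The following is a minimal sketch, assuming the gradient has been fitted by height·exp(−(x − peak)²/(2·width²)). The function name and the example parameter values are hypothetical and are not taken from any of the studies cited.

```python
import math

def bias_from_gaussian(height, peak, width, s_plus):
    """Return (relative peak responding, peak displacement) for a fitted Gaussian.

    Assumes the fitted gradient is height * exp(-(x - peak)**2 / (2 * width**2)).
    """
    at_s_plus = height * math.exp(-((s_plus - peak) ** 2) / (2.0 * width ** 2))
    relative_peak = height / at_s_plus     # peak responding / responding to S+
    displacement = abs(peak - s_plus)      # distance of the fitted peak from S+
    return relative_peak, displacement

# Hypothetical fit: S+ at 550 nm, fitted peak at 540 nm, fitted width 15 nm.
relative_peak, displacement = bias_from_gaussian(
    height=0.4, peak=540.0, width=15.0, s_plus=550.0
)
print(f"relative peak responding: {100 * relative_peak:.0f}%")
print(f"peak displacement: {displacement:.0f} nm")
```

Under this assumed form the height cancels out of the ratio, so the relative peak responding depends only on how far the fitted peak lies from S+ in units of the fitted width.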
Tone                          Yes       Yes   Rat, R. norvegicus               Pierrel & Sherman 1960
                              Yes/no*   Yes   Rat                              Thomas & Setzer 1972
                              Yes/no*   Yes   Guinea pig, Cavia porcellus      Thomas & Setzer 1972
                              Yes       Yes   Rat                              Brennan & Riccio 1973
                              Yes       Yes   Rabbit, Oryctolagus cuniculus    Scavio & Gormezano 1974
Bell                          Yes       Yes   Dog, Canis familiaris            Razran 1949
Whistle                       Yes       Yes   Dog                              Razran 1949
White noise                   Yes       Yes   Rat                              Huff et al. 1975
                              Yes       Yes   Rat                              Zielinski & Jakubowska 1977
White light                   Yes       Yes   Dog                              Razran 1949
                              Yes/no*   Yes   Pigeon, C. livia                 Ernst et al. 1971
                              No        Yes   Pigeon                           Lawrence 1973
                              Yes       Yes   Rat                              Brown 1942
                              No data   Yes   Earthworm, Lumbricus terrestris  Gilpin et al. 1978
Brightness of egg dummy       Yes       Yes   Herring gull, L. argentatus      Baerends 1982
Brightness of female dummy    Yes       Yes   Butterfly, Eumenis semele        Tinbergen et al. 1942
                              Yes       Yes   Glow-worm, P. splendidula        Schaller & Schwalb 1961
                              No        No    Glow-worm, L. noctiluca          Schaller & Schwalb 1961
Chemical concentration
  Odour                       No data   Yes   Bee, A. mellifera                Bhagavan & Smith 1997
  Taste                       No data   Yes   Rat                              Tapper & Halpern 1968
Figure 8. Distribution of response biases along intensity and rearrangement dimensions. The strength of bias is measured as the ratio of observed maximum responding to responding to S+. Note that strength of bias is underestimated along intensity dimensions, since responding often does not start decreasing within the probed stimulus range. Data sources in Test 8, Appendix 1. [Axes: Maximum responding, relative to S+ (%), in bins 100–200, 200–300, 300–400, 400 or more, horizontal; Proportion of studies, vertical. Legend: Intensity (N = 39).]

Figure 9. Data on sound intensity generalization in rats (Pierrel & Sherman 1960), collected over 6 days of testing in extinction. Gradients from test days 1 and 2 are both monotonic (their average is shown in the figure); gradients from subsequent days are not. [Axes: Sound intensity (dB), horizontal; Number of responses, vertical. Legend: Sessions 1 and 2; Sessions 3 and 4.]
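
The bias measure used in Figure 8 can be computed directly from an observed gradient. The following is a minimal sketch, assuming a gradient is given as a list of response counts with a known index for S+. The bins follow the horizontal axis of Figure 8; treating each lower bound as inclusive is an assumption, and both example gradients are made up.

```python
def bias_strength_percent(responses, s_plus_index):
    """Observed maximum responding as a percentage of responding to S+."""
    return 100.0 * max(responses) / responses[s_plus_index]

def bias_bin(percent):
    """Assign a bias strength to a bin (lower bound inclusive, an assumption)."""
    if percent >= 400:
        return "400 or more"
    if percent >= 300:
        return "300-400"
    if percent >= 200:
        return "200-300"
    return "100-200"

# Hypothetical gradients (not data from any study); S+ is the third stimulus.
peaked = [20, 60, 100, 110, 70]        # responding declines within the probed range
monotonic = [50, 120, 300, 650, 900]   # responding still rising at the last stimulus

peaked_bias = bias_strength_percent(peaked, 2)
monotonic_bias = bias_strength_percent(monotonic, 2)
print(peaked_bias, bias_bin(peaked_bias))
print(monotonic_bias, bias_bin(monotonic_bias))
```

For the monotonic example the computed value is only a lower bound, because stimuli beyond the probed range might elicit still more responding. This is the underestimation noted in the caption of Figure 8.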
gradients increases with better sampling, that is, with larger experimental groups (Test 9, P<0.01).

Second, a number of studies reporting nonmonotonic intensity generalization were designed to explore the effects of very long test sessions (e.g. Newlin et al. 1979; Thomas et al. 1991, 1992, excluded from the analyses above; this research is summarized in Thomas 1993). Under these conditions, gradients can change from monotonic to peaked during the test (Fig. 9). Such a shape change appears analogous to the disappearing or waning of response biases in the course of long test sessions, along rearrangement dimensions (e.g. Crawford et al. 1980; Cheng et al. 1997).

Third, laboratory data are often analysed by taking into account only experimentally controlled stimuli. However, responding to both the low and high ends of intensity continua is likely to be influenced by factors beyond experimental control. Stimuli of high intensity, for example very loud sounds or very bright lights, are often avoided by animals. Similarly, stimuli of very low intensity (silence, a dark response key) are usually not reacted to. In addition to such generic reactions, specific responses may interfere. For instance, Baerends (1982) reported that lightly coloured egg dummies are preferentially retrieved by herring gulls, Larus argentatus, but this preference does not extend to white dummies. Studies with related species have shown that gulls usually remove
white objects from the nest (a likely antipredatory defence; Tinbergen et al. 1962; Baerends 1982).

Conclusions

(1) Gradients obtained along intensity dimensions show larger response biases than rearrangement gradients (Fig. 8).
(2) Many intensity gradients are monotonic (rather than peaked) over large ranges of intensity; more specifically: responding increases with intensity when S+ is more intense than S− (including when S− is S+ ‘turned off’, e.g. a dark versus an illuminated key); and responding decreases with intensity when S+ is less intense than S−.
(3) Observed departures from monotonicity can, at least in some cases, be ascribed to: errors in sampling the gradient (Test 9); long test sessions, leading to changes in gradient shape (Fig. 9); and pre-existing reactions (both inherited and learned) to very intense stimuli (often avoided) or very weak ones (often ignored).

VARIATION IN SIZE AND OTHER DIMENSIONS

Along size dimensions, both the amount of stimulation and its arrangement on the sense organs vary. Consider a familiar stimulus of a given size. A bigger stimulus will act on more sensory cells than the familiar one, providing more stimulation. On the other hand, it will also provide a different arrangement of stimulation. We can try to understand size gradients as a trade-off between these two components. One immediate consequence is that the gradient may be asymmetrical, higher on the side of bigger sizes. This is because the intensity and rearrangement components have contrasting effects for bigger stimuli (stimulating more receptors, but in a different pattern), but work together in reducing responding to smaller stimuli (stimulating fewer receptors and in a different pattern). This prediction is confirmed by available data (Test 10, P<0.05). The same data suggest that size gradients are described better by Gaussian than by exponential curves (Test 11, P=0). Both conclusions should be viewed as tentative in light of the small number of studies examined (N=7 and N=8, respectively). Another element in support of a rearrangement/intensity analysis of size dimensions is that size gradients show larger response biases than rearrangement gradients (Test 8, P<0.02). In the small sample collected, biases towards bigger sizes appear comparable with biases along intensity dimensions (Test 8, NS).

Table 4 gives examples of response biases along size dimensions. Regularities similar to those reported above for rearrangement dimensions seem to apply. For instance, Weinberg (1973) found a stronger bias when S+ and S− were closer in size.

If a simple consideration of the arrangement and intensity of stimulation is helpful in analysing size dimensions, it is not so for many other dimensions (Table 5). In some cases, we do not know enough about the sense organs underlying perception along some dimensions, for example floor tilt (Lyons et al. 1973) or arm movement (Hedges 1983; Dickinson & Hedges 1986). In other cases, for example complex variations in shape (Ferraro & Grisham 1972; Wasserman et al. 1996), distinguishing between intensity and nonintensity effects is simply not sufficient. When we lack information about underlying
Table 5. Examples of generalization along dimensions that cannot be classified with the rearrangement/intensity

Dimension                             Response bias   Species                         Source
Visual shape
  Female dummy                        Yes             Butterfly, A. paphia            Magnus 1958
  Polygon                             Yes             Pigeon, C. livia                Ferraro & Grisham 1972
  Egg                                 No              Herring gull, L. argentatus     Baerends 1982
Visual contrast                       Yes             Herring gull                    Baerends 1982
                                      Yes             Chicks, Gallus g. domesticus    Osorio et al. 1999
Length of movement                    Yes             Human                           Hedges 1983
                                      No data         Human                           Dickinson & Hedges 1986
Click rate                            Yes             Pigeon                          Farthing & Hearst 1972
                                      Yes             Rat, R. r. norvegicus           Weiss & Schindler 1981
Flicker rate                          Yes             Pigeon                          Sloane 1964
                                      Yes             Butterfly, A. paphia            Magnus 1958
Floor tilt                            Yes             Pigeon                          Lyons et al. 1973
                                      Yes             Pigeon                          Riccio et al. 1966
Calls/songs                           Yes             Monkey, Callimico goeldii       Masataka 1983
                                      Yes             Blackbird, Turdus merula        Wolffgramm & Todt 1982
Human faces                           Yes             Human                           Rhodes 1996; Rhodes & Zebrowitz 2002
                                      Yes             Chickens                        Ghirlanda et al. 2002
Checkerboard patterns                 Yes             Human                           McLaren et al. 1995
Icon sets                             Yes             Human                           Wills & Mackintosh 1998
Drawings of rotated objects           No data         Pigeon                          Wasserman et al. 1996
‘Aggressiveness’ of verbal stimuli    No data         Human                           Buss 1961; 1962
‘Fearfulness’ of snake pictures       No data         Human                           Buss et al. 1968
sensory processes, we can try to infer some characteristics of such processes by analysing the
experimental data in the light of what we know from other, better studied dimensions. For example,
Dickinson & Hedges (1986) let blindfolded humans move a sliding handle for a given distance, and
tested them with movements of different lengths by asking if they matched the training one. Their
results seem to suggest that length of movement is perceived as a rearrangement dimension. Monotonic
gradients have been reported along dimensions that cannot be readily identified as intensity ones
(Ghirlanda & Enquist 1999); for instance rate of stimulus presentation (Magnus 1958; Weiss &
Schindler 1981) or femininity/masculinity of human faces (Enquist et al. 2002b). The case of changes
in shape is particularly interesting. For instance, Magnus (1958), studying Argynnis paphia
butterflies, found that certain shapes attract males more than female-shaped dummies, but also that
males respond very little to other shapes (Fig. 10). Data of this kind can be used to explore how
similarity is perceived across species.

Conclusions

(1) Both peaked and monotonic gradients have been found along dimensions where both the intensity and
the arrangement of stimulation vary.
(2) Response biases have been found along all dimensions investigated so far.
(3) Generalization along size dimensions is influenced by both intensity and rearrangement effects:
size generalization gradients are typically peaked; they are better approximated by Gaussian than by
exponential functions (Test 11); they exhibit larger biases than gradients along rearrangement
dimensions, and comparable to intensity gradients (Test 8); and when S− is the absence of S+,
responding is biased towards bigger sizes (Test 10).

FACTORS AFFECTING THE AMOUNT OF GENERALIZATION

A fundamental question is what regulates the amount of generalization along a dimension. The width of
a peaked gradient provides a measure of the amount of generalization, and one factor related to
gradient width is discriminability. Guttman & Kalish (1956) noted that generalization is measured by
the change in behaviour arising from a change in stimulation, whereas discriminability is defined as
the change in stimulation necessary to yield a behavioural change (cf. Lashley & Wade 1946). They thus
suggested that generalization and discriminability should be inversely related, but because they
lacked reliable data on discriminability they failed to
[Figure 11, panels (a) to (d). Vertical axis: proportion of responses. Horizontal axes: (a) deviation from middle wavelength (nm); (b) wavelength (nm); (c) stimulus number; (d) tone intensity (dB).]
Figure 11. Four examples of generalization after experience with many stimuli. x: Training stimuli; C: test stimuli. Subjects had the same
experience with all training stimuli. (a) Two-stimulus training can yield gradients with either one or two peaks depending on the difference
between stimuli (data from Blough 1969, pigeon, wavelength). (b) Three-stimulus training can produce less responding to the intermediate
stimulus (data from Kalish & Guttman 1959, pigeon, wavelength). (c) In this case no response biases are found after extensive experience with
a set of contiguous stimuli (data from Guttman 1965, pigeon, wavelength). (d) Responding can be stronger to more intense stimuli, even
when all stimuli were followed by the same consequences during training (data from Scavio & Gormezano 1974, rabbit, tone intensity).
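The one-peak versus two-peak pattern described for panel (a) can be illustrated with a minimal sketch. It assumes, purely for illustration, that the combined gradient is the sum of two equal Gaussian gradients, one per training stimulus; the function names, the width and the stimulus values are hypothetical and are not taken from Blough (1969). Under that assumption the sum has a single peak when the two stimuli are no more than two widths apart, and two peaks when they are further apart.

```python
import math

def gaussian(x, centre, width):
    """Gaussian generalization gradient centred on one training stimulus."""
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))

def summed_gradient(x, centres, width):
    """Assumed combined gradient: the sum of one Gaussian per training stimulus."""
    return sum(gaussian(x, c, width) for c in centres)

def count_peaks(centres, width, lo, hi, steps=2000):
    """Count strict local maxima of the summed gradient on an even grid over [lo, hi]."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [summed_gradient(x, centres, width) for x in xs]
    return sum(1 for i in range(1, steps) if ys[i] > ys[i - 1] and ys[i] > ys[i + 1])

WIDTH = 10.0                    # hypothetical gradient width (nm)
close_pair = [545.0, 555.0]     # separation 10 nm, less than 2 * WIDTH
distant_pair = [530.0, 570.0]   # separation 40 nm, more than 2 * WIDTH

print("peaks, close training stimuli:  ", count_peaks(close_pair, WIDTH, 480.0, 620.0))
print("peaks, distant training stimuli:", count_peaks(distant_pair, WIDTH, 480.0, 620.0))
```

Summation is only one possible combination rule; the sketch is meant to show how the distance between training stimuli alone can change the number of peaks, not to model the cited data.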
(broader gradients): if the S+’s are close to each other, a flat or almost flat gradient develops over
the range covered by the training stimuli; departing from the S+ on both sides, the gradient falls
down (Fig. 11a, b, c); the gradient shows multiple peaks if the distance between training stimuli is
sufficiently increased (Fig. 11a); and responding to all training stimuli is the same or similar
(small biases, Fig. 11b, c).
(4) Along intensity dimensions, substantial biases in responding persist even when two or more stimuli
are equally rewarded. The more intense S+’s elicit stronger reactions (Fig. 11d; Kessen 1953; Bass
1958; Murray & Kohfeld 1965; Birkimer & James 1967; Blue et al. 1971).

GENERALIZATION OF INHERITED AND LEARNED BEHAVIOUR

Both genetically inherited and individually learned responses generalize to novel stimuli (Tables
2–5). Within ethology, however, there has been a tendency to separate the study of innate and learned
generalization. For instance, ethologists have often claimed that supernormality and peak shift are
distinct phenomena, at the same time that similarities have been acknowledged (Baerends & Krujit 1973;
Hogan et al. 1975; Staddon 1975; Dawkins & Guilford 1995). Within psychology, the question is seldom
addressed explicitly; however, definitions of generalization typically make direct reference to
individual learning (Kalish 1969). These attitudes probably stem from early ideas within the two
disciplines: classical psychologists often ignored innate determinants of behaviour (Watson 1924), and
early ethologists claimed that inherited and learned behaviour are governed by different mechanisms
(Von Uexkull 1928; Lorenz 1937).
Although such rigid ideas have been abandoned (Hogan & Bolhuis 1994; Bolhuis & Hogan 1999), the idea
that inherited and learned behaviour generalize differently seems to have survived longer (Baerends &
Krujit 1973; Dawkins & Guilford 1995). In particular, it has been claimed that innate behaviour
results in ‘open-ended’ generalization (i.e. monotonic gradients), while individual learning does not
(Baerends & Krujit 1973; Hogan et al. 1975; Lorenz 1981). The data do not support this statement. Note
first that ethological studies of supernormality are not always about behaviour that is independent of
individual learning (e.g. egg retrieval in gulls, Baerends 1982). Furthermore, some dimensions cannot
by their nature result in open-ended generalization, regardless of ontogeny of behaviour. For
instance, the response of male Haplochromis burtoni cichlids to the orientation of the head stripe
(Heiligenberg et al. 1972) can only come back to its original value after the stripe has turned a full
circle.
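The point about circular dimensions can be made concrete with a small sketch. It assumes a hypothetical Gaussian gradient over the angular distance from a preferred orientation; the preferred angle and the width are arbitrary values chosen for illustration, not estimates from Heiligenberg et al. (1972). Because angular distance is periodic, the response necessarily returns to its starting value after a full turn and cannot keep growing.

```python
import math

def angular_distance(a_deg, b_deg):
    """Smallest angle between two orientations, in degrees (between 0 and 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

def response(test_deg, preferred_deg=45.0, width_deg=40.0):
    """Hypothetical peaked gradient over orientation (illustrative parameters only)."""
    d = angular_distance(test_deg, preferred_deg)
    return math.exp(-(d ** 2) / (2.0 * width_deg ** 2))

# Rotate the stimulus through a full circle in 45-degree steps.
for angle in range(0, 361, 45):
    print(angle, round(response(angle), 3))
```

Whatever gradient shape is assumed, any function of angular distance is bounded over the circle, so an open-ended (monotonic) gradient is excluded along such a dimension.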
Pure rearrangement dimensions are rare in ethological studies, as test stimuli most often vary in
complex ways. One exception is the just cited study by Heiligenberg et al. (1972), where the
supernormal effect of a rotated head stripe decreases after only a 90° rotation. In other cases a
rearrangement dimension can be defined by varying only one characteristic of a complex stimulus. For
instance, many of the stimuli in Fig. 10 are rectangles of different length and constant area. It is
clear that the preferences of A. paphia males along this rearrangement dimension are not open ended.
The case of colour is more complex. First, different light spectra of the same physical intensity can
elicit different amounts of activation in receptors (and differences between species exist). Second,
physical intensity is seldom controlled for in ethological studies. In any case, ethologists report
that dummies of unnatural colour can be both more and less effective in eliciting an innate behaviour
(Magnus 1958; Schaller & Schwalb 1961). In some cases the most effective colour is clearly not at the
extremes of the spectrum (L. noctiluca glow-worms: Schaller & Schwalb 1961; herring gulls: Baerends
1982) or closely matches the natural colour (A. paphia butterflies: Magnus 1958).
Open-ended generalization of innate behaviour has been reported almost exclusively along intensity
dimensions, for instance by Tinbergen et al. (1942, brightness of female butterfly dummy), and
Schaller & Schwalb (1961, brightness of glow-worm female dummy), with a few exceptions along size
dimensions (where strong biases are expected, see above). For example, Baerends (1982) showed that
oversized eggs are preferred by incubating gulls up to giant sizes. Magnus (1958) provided mixed
evidence, showing that male A. paphia butterflies prefer four-fold enlarged female dummies to dummies
twice the normal size when the dummies are stationary, but not when they imitate flight. Similarly,
Schaller & Schwalb (1961) reported that in glow-worms, P. splendidula males prefer dummy female
lanterns four times bigger than normal, but L. noctiluca males do not. There are further reports that
the supernormal effect of bigger stimuli often quickly ceases. One instance is the response of male
P. splendidula and L. noctiluca to circular lights, which declines for circles several times bigger
than the female’s lantern (Schaller & Schwalb 1961). The same authors showed that L. noctiluca males
prefer a dummy lantern containing three horizontal segments to one with six (two is normal). Koehler &
Zagarus (1937) found that ringed plovers, Charadrius hiaticula, retrieve eggs weighing 17 g, but not
those above 35 g (normal eggs weigh about 11.5 g). In this last case individual learning may play a
role (cf. Baerends 1982). Finally, Ewert (1980) has shown that both naïve and experienced toads, Bufo
bufo, clearly prefer catching objects within a restricted size range.

We could find no difference between generalization of genetically inherited and individually learned
behaviour, with respect to either gradient shape or response biases: the distinction between the
effects of intensity and rearrangement of stimulation appears valid for both inherited and learned
behaviour; and the claim that biases in inherited behaviour are open ended (monotonic) whereas biases
in learned behaviour are limited is unsupported.

DISCUSSION

Empirical data gathered in about 100 years of research establish generalization as a fundamental
behavioural phenomenon, whose basic characteristics appear universal. Birds and mammals are most
studied, but fish, insects, amphibians and reptiles generalize in the same ways. It seems to matter
little, for generalization, whether a behaviour has been acquired phylogenetically or through
individual learning, and the nature of sensory continua is an important determinant of gradient shape.
Furthermore, generalization seems little dependent on the context in which a given behaviour is used.
That is, if a discrimination between green and red is established, generalization to other colours
will follow independent of whether the discrimination is about food items or potential partners (e.g.
Ghirlanda et al. 2002), or of whether behaviour is performed by children to obtain ‘points’ or by
pigeons pecking for food. The generality of the findings reviewed suggests that generalization arises
from basic and universal characteristics of behaviour mechanisms (cf. Hogan 1994).
This review has focused on empirical findings, but we end with some theoretical considerations. A
number of attempts have been made to understand the causes of generalization (reviewed in Kalish 1969;
Mackintosh 1974). Theorizing about mechanisms has considered properties of stimuli, sense organs and
neural processing, and how these factors interact. Physical similarity between stimuli is one cause of
generalization. Stimuli may be similar because they share common components, and generalization may
follow because novel stimuli include components also present in familiar stimuli (e.g. Thorndike
1911; Guthrie 1930, 1935; Blough 1975; Rescorla 1976). However, not all stimuli are made up of
‘components’ in this sense (e.g. light and sound spectra). In general, what is similar and different
to an organism depends also on properties of receptors and the organization of sense organs (including
early processing of neural signals within sense organs). These factors determine how physical
similarity translates into similarity of nervous signals to the brain, and will thus contribute to
generalization. Receptors and sense organs have often been ignored, especially in contemporary
psychological models (but not always in early ones, Hull 1943; Schlosberg & Solomon 1943; Hebb 1949).
By considering them it may be possible to account for both rearrangement generalization and intensity
generalization within the same model, by recognizing that similarity depends upon which receptors are
stimulated and to what degree (Ghirlanda 2002).
Generalization is also modulated centrally in the nervous system. Suggestions about how this occurs
vary in detail, but the core idea is that processing of stimuli that are distinctly different can
rely, at least to some extent, on
Theory | Bell shape | Biases along dimensions including S+ and S− | Monotonic increase (S+ > S−) | Monotonic decrease (S+ < S−) | Stronger biases | Source
Early ideas suggesting overlap or interactions among nerve cells* | Consequences unclear or not studied (applies to all five prediction columns) | Pavlov 1927; Hebb 1949; Horne 1965; Thompson 1965; Lorenz 1981; Baerends 1982
Gradient interaction, original formulation | Assumed | Yes† | No | No | No | Spence 1937; Hull 1943
Gradient interaction, later developments | Assumed | Yes† | Assumed | No | Assumed | Hull 1949; Perkins 1953; Logan 1954
Exponential generalization in ‘psychological spaces’ | No | No | No | No | No | Shepard 1987; Cheng et al. 1997
Gaussian generalization in ‘psychological spaces’§ | Assumed | Yes | No | No | May be assumed | Shepard 1987
Configural theory | Assumed | Yes | No | No | May be assumed | Pearce 1987
Blough’s model, original formulation | Assumed | Yes | No | No | May be assumed | Blough 1975
Blough’s model, reinterpretation* | Yes | Yes | Yes | Yes | Yes | Ghirlanda & Enquist 1999; Ghirlanda 2002
Feed forward networks, with idealized inputs‡ | No/assumed | No/yes | No | No | No | Gluck 1991; Pearce 1994; S. Ghirlanda, unpublished data
Feed forward networks, with realistic inputs* | Yes | Yes | Yes | Yes | Yes | Ghirlanda & Enquist 1998
Overlap theory, based on receptor activations* | Yes | Yes | Yes | Yes | Yes | Ghirlanda & Enquist 1999

Derivations for some of these predictions not in the original sources can be found in Ghirlanda & Enquist 1999; Ghirlanda 2002.
*Empirical knowledge about the nervous system is used.
†Responding underestimated.
‡See cited papers for details of assumed stimulus representations.
§For possible improvements, see Ennis (1988), Shepard (1988), Staddon & Reid (1990), Shepard (1990).
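The idea that similarity depends on which receptors are stimulated and to what degree, which underlies the receptor-based entries in the table, can be sketched in a few lines. The sketch below is a simplified illustration and not the published model: the receptor array, the Gaussian tuning curves, the dot-product overlap and all names and parameter values are assumptions made here. It shows that a single overlap rule gives a peaked gradient when a stimulus is moved across the receptor array (rearrangement) and a monotonic gradient when only its intensity changes.

```python
import math

# Hypothetical receptor array: 101 receptors with Gaussian tuning along one dimension.
RECEPTOR_CENTRES = [float(c) for c in range(101)]
TUNING_WIDTH = 5.0

def activation_pattern(position, intensity=1.0):
    """Receptor activations evoked by a stimulus at `position` with a given `intensity`."""
    return [
        intensity * math.exp(-((position - c) ** 2) / (2.0 * TUNING_WIDTH ** 2))
        for c in RECEPTOR_CENTRES
    ]

def overlap(pattern_a, pattern_b):
    """Overlap between two activation patterns, computed here as their dot product."""
    return sum(a * b for a, b in zip(pattern_a, pattern_b))

def relative_response(test_pattern, familiar_pattern):
    """Overlap with the familiar pattern, scaled so the familiar stimulus itself scores 1."""
    return overlap(test_pattern, familiar_pattern) / overlap(familiar_pattern, familiar_pattern)

familiar = activation_pattern(position=50.0, intensity=1.0)

# Rearrangement: same intensity, different position on the receptor array.
rearrangement_gradient = [
    relative_response(activation_pattern(p, 1.0), familiar)
    for p in (30.0, 40.0, 50.0, 60.0, 70.0)
]

# Intensity: same position, different amount of stimulation.
intensity_gradient = [
    relative_response(activation_pattern(50.0, k), familiar)
    for k in (0.5, 1.0, 1.5, 2.0)
]

print("rearrangement:", [round(r, 3) for r in rearrangement_gradient])
print("intensity:    ", [round(r, 3) for r in intensity_gradient])
```

In this toy version, more intense novel stimuli score higher than the familiar stimulus, whereas displaced stimuli never do; response biases involving S− would need an additional term and are not modelled here.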
the same nerve cells and connections (Pavlov 1927; Hebb 1949; Horne 1965; Thompson 1965; Baerends &
Krujit 1973; Blough 1975; Lorenz 1981). Distinct stimuli may thus come to elicit similar responses.
Generalization of this kind is strongly dependent on experience (including the species’ experience,
coded in the genes). Often generalization is substantial along dimensions with which the organism has
little experience (Peterson 1962; Rubel & Rosenthal 1975; Kerr et al. 1979). Along familiar
dimensions organisms generalize less: latent learning, perceptual learning and discriminations between
similar stimuli all decrease generalization (Mackintosh et al. 1991; Bennett et al. 1994; see also
above). Discrimination learning, in particular, can substantially lower generalization, presumably up
to sensory limits.
There is of course also a functional side to generalization. Evolution has favoured those behaviour
mechanisms that are ‘intelligent’ towards the real world. For instance, stimuli that are similar to
one another often share some causal relation with events in the outside

these models provide a powerful explanation for how generalization is generated, including the
consequences of learning (Blough 1975; Ghirlanda & Enquist 1998; Ghirlanda 2002).

Acknowledgments

We thank Susanne Stenius for valuable help in collecting references. Ken Cheng, Björn Forkman,
Birgitta Sillén-Tullberg and the anonymous referees provided helpful comments on the manuscript.
Laura, Giovanni and Daniele Ghirlanda made the acquisition of many data sets possible by kindly
allowing use of computer equipment. S.G. thanks the Institute for the Science and Technologies of
Cognition, National Research Council, Rome, for kind hospitality during some of the time spent working
on the manuscript. This work has been partly funded by Marianne och Marcus Wallenberg Stiftelse and by
the Jubileumsfund of the Bank of Sweden.
