cortex 45 (2009) 80–92
Special issue: Research report
Predictive coding of music – Brain responses to rhythmic
incongruity
Peter Vuust a,b,*, Leif Ostergaard a, Karen Johanne Pallesen a,c, Christopher Bailey a,c,d and Andreas Roepstorff a,e

a Centre of Functionally Integrative Neuroscience, University of Aarhus, Denmark
b Royal Academy of Music, Aarhus, Denmark
c BioMag Laboratory, Helsinki Brain Research Centre, Helsinki University Central Hospital, Finland
d PET Center, Aarhus University Hospital, Denmark
e Institute of Anthropology, Linguistics and Archaeology, University of Aarhus, Denmark
article info

Article history: Received 11 January 2007; Reviewed 29 May 2007; Revised 20 July 2007; Accepted 7 May 2008; Published online 14 November 2008

Keywords: Music; MEG; Meter; Predictive coding; MMN

abstract

During the last decades, models of music processing in the brain have mainly discussed the specificity of brain modules involved in processing different musical components. We argue that predictive coding offers an explanatory framework for functional integration in musical processing. Further, we provide empirical evidence for such a network in the analysis of event-related MEG components to rhythmic incongruence in the context of strong metric anticipation. This is seen in a mismatch negativity (MMNm) and a subsequent P3am component, which have the properties of an error term and a subsequent evaluation in a predictive coding framework. There were both quantitative and qualitative differences in the evoked responses in expert jazz musicians compared with rhythmically unskilled non-musicians. We propose that these differences trace a functional adaptation and/or a genetic pre-disposition in experts which allows for a more precise rhythmic prediction.

© 2008 Elsevier Srl. All rights reserved.
1. Introduction
Models of music processing in the brain have primarily
discussed specificity of brain modules involved in processing musical components. In contrast to language processing, primarily located in the left hemisphere, music
processing was originally suggested to be right-lateralized
(Luria et al., 1965; Signoret et al., 1987). A more detailed
modular viewpoint was recently expressed by Peretz and
Coltheart (2003) who demonstrated that anatomically
distinct sub-modules were not necessarily confined to one
hemisphere to process different aspects of music. The Peretz–Coltheart model, based mainly on lesion studies and on
studies of acquired and congenital amusia, emphasizes
modular specificity at the expense of brain integration. It
adequately accounts for certain aspects of neural musical processing, particularly processing of pitch (Liegeois-Chauvel et al., 1998; Mendez, 2001; Peretz et al., 1994). However, it fails, as Peretz et al. acknowledge, to fully account for processing of rhythm and meter. This appears problematic, as
rhythm and meter are constitutive elements of musical
structure, and influence how music is perceived and
understood (Benjamin, 1984; Dalla and Peretz, 2005; Schmuckler and Boltz, 1994).

* Corresponding author. Centre for Functionally Integrative Neuroscience, Aarhus University Hospital, Building 30, Norrebrogade, 8000 Aarhus C, Denmark. E-mail address: pv@pet.auh.dk (P. Vuust).
0010-9452/$ – see front matter © 2008 Elsevier Srl. All rights reserved. doi:10.1016/j.cortex.2008.05.014
Recently, Friston (2002) provided a promising model of brain
function, in which predictive coding, as a central principle of
brain organization, provides a link between segregation and
integration (for similar viewpoints see Shepard, 2001; Tononi and
Edelman, 1998). The model proposes that the interaction
between segregation and integration may be described by
predictive coding, interpreted in a hierarchical brain organization
whereby lower level brain areas estimate predictions of their
expected input based on contextual information through backward connections from higher level areas. A comparison
between prediction and actual input produces an error term that,
if sufficiently large, will be fed forward to call for an update of the
model. This generates a recursive process, which aims at minimizing the difference between input and prediction. As the representational capacity of any neuronal assembly in this model is
dynamic and context sensitive, it addresses the issue of top–
down processing (Frith and Dolan, 1997; Roepstorff and Frith,
2004).
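As a toy illustration of this recursive loop (our own sketch, not Friston's formal model), a single processing level can hold a scalar prediction and update it only when the error term is large enough to be "fed forward"; the threshold and learning rate below are illustrative parameters:

```python
def predictive_coding_step(prediction, actual, threshold=0.5, rate=0.5):
    """One comparison between prediction and actual input.

    Returns the (possibly updated) prediction and the error term.
    The threshold stands in for 'sufficiently large to be fed forward';
    the rate controls how strongly the internal model is updated.
    """
    error = actual - prediction                    # error term
    if abs(error) > threshold:                     # large enough: feed forward
        prediction = prediction + rate * error     # update the model
    return prediction, error

# Errors to a repeated input become small; a deviant yields a large error.
prediction = 0.0
for stimulus in [1.0, 1.0, 1.0, 3.0]:
    prediction, error = predictive_coding_step(prediction, stimulus)
```

Here the repeated standard quickly stops driving model updates, whereas the final deviant produces a large error term and a correspondingly large model update, mirroring the error/evaluation logic described above.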
The predictive coding model entails that the brain
constantly tries to extract structural regularities from the
surroundings. This concept is well-established in psychology
and neurobiology (Mehta, 2001; Schultz and Dickinson, 2000),
and has been successfully applied in several fields, e.g., motor
control and social interaction (Blakemore et al., 1998; Wolpert
et al., 2003), object perception (Kersten et al., 2004) and visual
integration (Rao and Ballard, 1999). In this study, we have
employed magnetoencephalography (MEG) to test two
hypotheses: (1) that neuronal markers of rhythmic incongruities behave in accordance with a predictive coding framework, (2) that musical competence affects the composition of
the neuronal networks involved in the processing of rhythm
by affecting the neuronal integration.
The human auditory system appears to segregate the
auditory environment into meaningful streams according to
specific rules, and this forms the basis of a prediction of the
near auditory future (Bregman, 1990). As a special case, the
rhythmic regularity in music is generated by expectations
created in different layers of the musical structure (Bharucha
and Stoeckig, 1986; Meyer, 1956; Sloboda, 1985). This depends
critically on the timing structure provided by the meter, which
is based on a fundamental opposition between strong and
weak beats (see, e.g., Cooper and Meyer, 1960; Vuust, 2000).
Meter provides the listener with a temporal, hierarchical
expectancy structure, underlying the perception of music, in
which each musical time-point encompasses a conjoint
prediction of timing and strength (Large and Kolen, 1994).
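For illustration only, the strong/weak-beat opposition of a 4/4 bar can be written down as salience weights per 16th-note position (a toy weighting of ours, in the spirit of the hierarchical expectancy structure described above, not the Large and Kolen (1994) model):

```python
def metric_weight(pos16):
    """Illustrative salience level for a 16th-note position 0..15 in a 4/4 bar."""
    if pos16 % 16 == 0:
        return 4   # downbeat (strongest expectancy)
    if pos16 % 8 == 0:
        return 3   # half-bar (beat 3)
    if pos16 % 4 == 0:
        return 2   # quarter-note beats 2 and 4
    if pos16 % 2 == 0:
        return 1   # eighth-note offbeats
    return 0       # remaining 16th-note positions (weakest)

weights = [metric_weight(p) for p in range(16)]
# [4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0]
```

A syncopation in this picture places a salient event on a low-weight position, violating the conjoint timing/strength prediction.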
When metric expectancy structure is violated, it may elicit
strong perceptual responses including sensation of tension
(Vuust et al., 2006), shift of attention (Jones and Boltz, 1989) and
laughter (Huron, 2004). Violations of meter, especially in music
favouring a regular beat, therefore appear particularly well
suited as substrate for critical examination of the predictive
coding model of brain function. If the predictive coding theory
is correct, we hypothesize that meter violation generates an
error term at the neural level, the size of which depends on
degree of violation. If the violation is sufficiently large, it may
cause a subsequent evaluation that involves higher level
neuronal structures. The first error term should occur locally,
while the putative subsequent evaluation would involve
integration across hierarchies of neuronal processing. We
therefore created rhythm sequences of increasing rhythmic
incongruence and measured brain responses with MEG to test
the hypothesis that pre-attentive neural responses to
increasing rhythmical incongruity could be identified, and
would be congruent with an error term and subsequent
evaluation.
We hypothesized that rhythmic incongruities would elicit
the magnetic counterpart of the mismatch negativity (MMNm),
an event-related field (ERF), peaking around 100–200 ms from
change onset, an index of pre-attentive detection of change in
some repetitive aspect of auditory stimulation (Naatanen,
1992), accompanied by a later component, the P3a, usually
associated with the evaluation of that change for subsequent
behavioral action and believed to indicate activity in a network which contains frontal, temporal and parietal sources
(Friedman et al., 2001).
According to Winkler et al. (1996), MMN reflects a modification of the pre-attentive model of the acoustic environment. This is caused by the incorporation of a new auditory
event that mismatches the actual inferences of the model
(the model adjustment hypothesis). This is highly compatible with the predictive coding theory which implies that the
error term to unexpected events depends on an interaction
between the objective differences in stimulus structure and
the degree of detail in the expectancy structure. Musicians
are known to have longer and more precise temporal integration windows compared to non-musicians (Russeler et al.,
2001), more fine-grained representation of temporal structure (Jongsma et al., 2004) and higher sensitivity when
detecting small time changes embedded within simple
rhythmic patterns (Jones and Yee, 1997). If the predictive
coding theory is correct, then the more detailed expectancy
structure in musicians should influence both neuronal
markers of the prediction error and the neuronal markers of
evaluation. We therefore compared rhythmically unskilled
non-musicians with expert jazz musicians. Jazz musicians
use challenging rhythmic material in their musical performance and are therefore ideal candidates for identifying
putative competence dependent differences in the processing of metric violations. We have previously described
a leftward lateralization in musicians compared to non-musicians when exposed to rhythmically challenging material (Vuust et al., 2005). We here extend the analysis to the
P3a component and demonstrate how the findings may be
explained by a predictive coding framework.
2. Materials and methods

2.1. Subjects, stimuli and task
Nine expert jazz musicians (8 men and 1 woman; mean age = 27.22, SE = 1.68; from the Sibelius Academy of Music, Helsinki, Finland), scoring more than 14 in a modified version of the rhythm imitation test employed at the entry examination for Danish music conservatories, and eight rhythmically unskilled non-musicians (6 men and 2 women; mean age = 24.5, SE = 0.87), scoring less than 3 in the rhythm test,
participated in this study, approved by The Ethical Committee
of Helsinki University Central Hospital.
The rhythm imitation test consists of 30 rhythm sequences
falling into three categories presented in a semi-randomized
manner: (A) quarter notes and eighth notes, the first note on
the down-beat of bar 1 (three sequences), (B) syncopated
versions of the above or triplets/16th notes (10 sequences), (C)
both 16th notes and triplets or metric displacement (17
sequences).
The subjects listened to drum beat sequences (Fig. 1): sI – a simple 4-beat rock rhythm; sII – an alteration of sI introducing a syncopation (metric displacement) which breaks the metric expectancy by replacing a weak beat with a strong beat, without interfering with the music pulse; sIII – an alteration of sI introducing a beat incongruent with the underlying temporal grid or meter, hence a stronger violation of musical expectancy. sII may be described as a metrical and musically acceptable syncopation, breaking the metric expectancy by replacing a weak beat with a strong beat and challenging the sense of meter without interfering with the music pulse; a well-known stylistic feature in jazz (Kernfeld, 2002). In contrast, the break in sIII from SOAs of 312.5 ms to an SOA of 105 ms did not coincide with any normal subdivision of the beat (triplets or 16th notes) and could therefore be described as a non-metrical and musically unacceptable interruption/violation, easily detectable for all subjects.
Using MEG, we recorded brain responses to 600 occurrences
of the rhythm sequences pseudo-randomized as follows
[frequency of occurrence (f.o.) in parentheses]: sI (30%), sII
(30%), sIII (30%). Subjects responded by button presses to the
occurrence of either sIu or sId: variations of sI in which the last
snare drum beat of the sequence had been tuned up (sIu, f.o.:
5%) or down (sId, f.o.: 5%). This directed their attention to the
last part of the rhythm sequences, while the rhythmic deviations occurred in the middle part. Prior to recording, subjects
practiced the task and performed a handedness test (Oldfield,
1971). Stimuli were delivered with Presentation (Neurobehavioral Systems, Inc.) through plastic tubes and earpieces.
Latency from sound delivery to earplugs (18 ± 2 ms) was subtracted from recorded ERF latency. Button press responses
during recordings were observed to ensure that subjects were
performing the discrimination task adequately. After recordings subjects filled in a written questionnaire asking them to
rate the stimuli according to how disturbing the sequences
appeared, how familiar they seemed and how likely they were
to appear in contemporary music. After this, they were verbally
debriefed (recorded on tape).
2.2. Data acquisition
Neuromagnetic signals were recorded in a magnetically
shielded room, using a 306-channel VectorView whole-head
MEG system (Elekta-Neuromag, Finland). The 700 ms epochs
were set according to stimulus type: A, 4900–5600 ms; B,
4687.5–5387.5 ms; and C, 4380–5080 ms, so that the deviant
beat occurred always 100 ms after epoch start, defining
a ‘‘time-zero’’ after which all stimuli were physically identical
for two bars. A single trial was discarded from the average if any of the following rejection criteria were met: (i) the rectified EOG signal exceeded 100 µV; (ii) the rectified MEG of any gradiometer exceeded 3000 fT/cm; or (iii) the slope of any gradiometer exceeded 6000 fT(/cm)/s. Averaged responses used in subsequent analysis were filtered with a pass band of 2–30 Hz using a standard zero-phase FIR filter.

Fig. 1 – Stimuli. Top, sequences sI, sII, sIII, prepared in Cubase SX (Steinberg), using realistic broadband sounds: bass drum (BD), snare drum (SD) and hi-hat (HH) from LM-7. Arrows indicate time of recording. The recording window was set to 100 ms before and 600 ms after the arrow. Bottom, waveforms and exact timing of the sounds for each of the sequences. sI is a simple four-beat rock rhythm (functioning as a standard in the present design). sII is an alteration of sI that introduces a metric displacement (a subtle rhythmic deviation). sIII is an alteration of sI that introduces a non-rhythmic metric violation (a strong rhythmic deviation).
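The trial-rejection criteria can be sketched as follows (a minimal illustration; the function, the array shapes and the slope computation via sample-to-sample differences are our assumptions, not the authors' pipeline):

```python
import numpy as np

def reject_trial(eog, grads, sfreq,
                 eog_max=100.0, grad_max=3000.0, slope_max=6000.0):
    """Return True if a single trial should be discarded.

    eog:   (T,) rectified EOG signal, assumed in µV.
    grads: (n_channels, T) gradiometer epoch, assumed in fT/cm.
    sfreq: sampling frequency in Hz, used to turn per-sample
           differences into a slope in fT(/cm)/s.
    """
    if np.max(np.abs(eog)) > eog_max:        # (i) EOG amplitude
        return True
    if np.max(np.abs(grads)) > grad_max:     # (ii) gradiometer amplitude
        return True
    slopes = np.abs(np.diff(grads, axis=1)) * sfreq   # fT(/cm)/s
    if slopes.size and np.max(slopes) > slope_max:    # (iii) gradient slope
        return True
    return False
```

A trial passing all three checks would then contribute to the average before the 2–30 Hz band-pass filtering.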
Two bipolar EOG channels were used to measure blinking
and saccadic eye movements. Electrodes were placed above the
left eyebrow, below the lower left eyelid (blinking), and on
either brow (saccades). Channels were referenced to the tip of
the nose and grounded on the right cheek. Subject’s head
position relative to the sensors was measured before and after
each session by four head position coils. The head coordinate
system was defined by three landmarks on the subject’s head
(left and right preauricular points and nasion) whose relative
positions were determined using an Isotrak 3D digitizer
(Polhemus, Inc., Germany). To relate the dipole fitting locations
to macroanatomical brain structures, magnetic resonance
images (1 mm3 voxels) were obtained from one subject from
each of the two groups, using a 1.5 T Siemens Sonata scanner.
2.3. Data analysis
Responses were band pass filtered (2–30 Hz), and a baseline of
50 to 0 ms before the deviation (sII and sIII)/standard (sI) was
used. ERFs were analyzed using a subset of 31 pairs of
orthogonal planar gradiometers over each hemisphere.
Magnetometer channels were discarded due to their poor
location specificity and lower signal-to-noise ratio. We
defined the mean gradient amplitude (MGA) as follows:
MGA = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left\{ (dB_z/dx)^2 + (dB_z/dy)^2 \right\}_i }
where {.}i refers to the ith gradiometer pair in a selection of N.
MGA is a measure of the instantaneous amplitude of the
tangential gradients of the magnetic field over a selection of N
gradiometer pairs. The MGA can (1) be used as a marker of
salient input in the brain regions covered by the gradiometer
subset used, and (2) reasonably be compared across subjects
over a sufficiently large channel selection. For each subject in
each hemisphere, brain responses to sI, sII and sIII were
measured by comparing the maxima of the MGA (MGAmax) in
the intervals (100–170 ms) and (170–250 ms). For both intervals,
attempts to estimate the parameters of an equivalent current
dipole (ECD) were made separately for each subject, for each
hemisphere and condition, at the latency of MGAmax using
a spherical head model (Neuromag software xfit). The software
gives measures for location, orientation and amplitude of each
dipole, as well as test values related to the confidence limits
attributable to the estimated parameters. A dipolar pattern,
goodness of fit (g.o.f.) > 60%, confidence volume < 500 mm3, and
P-value < 0.01, were used as criteria for accepting a dipole. The
large number of channels (62 in each hemisphere) was chosen to
yield a robust source estimate whilst allowing individual
differences in brain shape and head position in the scanner. Bias
was minimized by use of the same channels for all subjects.
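The dipole-acceptance criteria stated above can be expressed as a small predicate (a sketch; the function and its argument names are ours, not part of the Neuromag xfit software):

```python
def accept_dipole(gof, conf_volume_mm3, p_value):
    """Accept an equivalent current dipole (ECD) fit.

    gof:             goodness of fit in percent (must exceed 60%).
    conf_volume_mm3: confidence volume in mm^3 (must be below 500).
    p_value:         test value for the estimated parameters (< 0.01).
    """
    return gof > 60.0 and conf_volume_mm3 < 500.0 and p_value < 0.01
```

For example, a fit with g.o.f. 86%, a 46 mm³ confidence volume and P = 0.001 would be accepted, while one with g.o.f. 55% would not.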
Dipole amplitudes (A) were used to calculate an asymmetry index as (A_left − A_right)/(A_left + A_right). Furthermore, asymmetry indices for both the early (100–170 ms) and the late peak (170–250 ms) were calculated on the basis of the MGA_max values as (MGA_max(left) − MGA_max(right))/(MGA_max(left) + MGA_max(right)).
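The MGA and the asymmetry index can be sketched as follows (a minimal illustration with assumed array shapes and function names; the sign convention follows the dipole-based index in the text, so a positive value indicates left lateralization):

```python
import numpy as np

def mean_gradient_amplitude(dbz_dx, dbz_dy):
    """Mean gradient amplitude (MGA) over N planar gradiometer pairs.

    dbz_dx, dbz_dy: arrays of shape (N, T) holding the two orthogonal
    tangential gradients dBz/dx and dBz/dy for each pair (fT/cm).
    Returns an array of length T: the RMS gradient amplitude over pairs.
    """
    return np.sqrt(np.mean(dbz_dx**2 + dbz_dy**2, axis=0))

def asymmetry_index(left, right):
    """(left - right) / (left + right); positive means left-lateralized."""
    return (left - right) / (left + right)

# Hypothetical single-pair example: gradients (3, 4) give an MGA of 5.
mga = mean_gradient_amplitude(np.array([[3.0, 4.0]]),
                              np.array([[4.0, 3.0]]))
```

MGA_max would then be the maximum of such a trace within each analysis interval, taken separately per hemisphere.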
Locations of MMNms and P3ams were compared (threeway ANOVA) to determine whether different or identical
neuronal populations were likely to generate the two
responses.
Statistical tests were applied to examine group differences
with respect to lateralization, latency and amplitude of the
MMNm and the P3am.
3. Results

3.1. Behavioral data
Expert musicians scored higher in the rhythm imitation test than rhythmically unskilled non-musicians (P < 0.001, Z = 3.54; Table 1). The subjects rated sI–sIII as increasingly disturbing [sI < sII: P = 0.08 (non-significant), Z = 1.74; sII < sIII: P = 0.001, Z = 3.3], decreasingly familiar (sI > sII: P = 0.03, Z = 2.18; sII > sIII: P = 0.001, Z = 3.4) and decreasingly likely to appear in contemporary music (sI > sII: P = 0.004, Z = 2.18; sII > sIII: P = 0.001, Z = 3.4), indicating that the subjects experienced the increasing incongruity of the sequences.

Experts and rhythmically unskilled non-musicians did not differ in their ratings of sI–sIII with respect to whether the sequences were disturbing or likely to appear in contemporary music. Experts, however, reported being more familiar with sI and sII, but not sIII, than the rhythmically unskilled non-musicians (sI: P = 0.02, Z = 2.27; sII: P = 0.001, Z = 3.18; sIII: P = 0.13, Z = 1.5). This probably reflects that jazz musicians are highly familiar with syncopation, and confirms that sIII is not musically acceptable.
Table 1 – Summary statistics of the data collected in the rhythm imitation test and the questionnaire.

Behavioral results                                   Scale      All         Experts     Unskilled

Rhythm imitation test                                1–30                   19.6 (1.1)  1.3 (0.3)

Questionnaire ratings
  Task difficulty                                    1–5        2.0 (0.2)   1.6 (0.2)   2.5 (0.3)
  Disturbingness of stimuli                          −2 to 2
    sI                                                          1.4 (0.2)   1.4 (0.7)   1.3 (0.4)
    sII                                                         0.8 (0.3)   1.0 (0.4)   0.5 (0.5)
    sIII                                                        0.6 (0.3)   0.7 (0.4)   0.6 (0.5)
  Familiarity with stimuli                           1–5
    sI                                                          4.5 (0.2)   4.9 (0.1)   4.0 (0.3)
    sII                                                         3.9 (0.3)   4.6 (0.2)   3.0 (0.4)
    sIII                                                        2.6 (0.3)   3.1 (0.4)   2.1 (0.5)
  Likelihood of appearance in contemporary music     1–5
    sI                                                          4.8 (0.1)   5.0 (0.0)   4.6 (0.5)
    sII                                                         4.0 (0.2)   4.0 (0.3)   4.0 (0.3)
    sIII                                                        2.3 (0.3)   2.1 (0.4)   2.5 (1.3)
  Guess of total number of stimuli
    (correct answer = 5)                                        5.0 (0.3)   5.3 (0.5)   4.6 (0.4)

Numbers in parentheses denote standard errors of mean.
Interestingly, when asked to guess the total number of
stimuli, experts did not perform significantly better than
rhythmically unskilled non-musicians. This may indicate that
the experts focused just as much on the task (discriminating
sIu or sId from sI) as did rhythmically unskilled non-musicians.
The results in the following section are presented in the following order: waveform analyses, MGA analyses, source localization, and analyses of the latencies of the ERF components.
3.2. MEG data

ERFs to sII and sIII (Figs. 2, 3) showed peaks bilaterally in the intervals 100–170 and 170–250 ms. In keeping with earlier studies, we interpret these components as an MMNm (Ford and Hillyard, 1981; Naatanen, 1992) and a subsequent P3am (Escera et al., 1998, 2000; Jongsma et al., 2004).

Increased rhythmic incongruence caused greater neuronal responses (Figs. 3 and 4, Table 2), as confirmed by a four-way ANOVA on the MGAs [factors: group of subjects (experts/rhythmically unskilled), sequence (sI/sII/sIII), hemisphere (left/right) and component (MMNm/P3am); F = 165, P < 0.000001], regardless of hemisphere (interaction between sequence and hemisphere: F = 0.67, P = 0.51).

The magnitude of the ERF responses to rhythmic incongruity was larger for experts than for rhythmically unskilled non-musicians (F = 54.88, P < 0.000001), and there was a significant interaction between group and sequence (F = 7.11, P = 0.001), mainly driven by a larger neuronal response in experts to sII (Figs. 2, 3 and 4).

Experts showed predominantly left hemispheric responses to sIII and sII, as opposed to the rhythmically unskilled non-musicians' more right-lateralized response (Fig. 3), as indicated by a significant interaction between group and hemisphere (F = 5.84, P < 0.05) and by the asymmetry index calculated for the MMNms (Fig. 5, Table 2) to the rhythmic incongruence in sII and sIII [F = 18.75, P = 0.001, two-way ANOVA; factors: group of subjects and type of incongruence (sII/sIII)]. This difference in lateralization was more pronounced for the subtle incongruence in sII than for sIII (Fig. 5, lower right panel). The response to the congruent beat in sI appeared to be slightly right-lateralized across subjects (P = 0.14) and did not differ between groups (P = 0.73).

No difference between groups in lateralization of the P3am was found. In fact, there was a significant right lateralization of the P3am across all subjects and sequences (P = 0.01, Z = 2.6).

3.3. Dipoles

3.3.1. MMNm
For sIII, a single dipole source was estimated from the data
using a restricted inverse solution in all subjects. P3am dipoles
were found in all experts in both hemispheres but only for
three rhythmically unskilled non-musicians in each hemisphere (Table 3). In the interval 100–150 ms, dipolar sources (g.o.f. 86%, SE = 2; volume = 46 mm³, SE = 8) resided in the temporal cortex, specifically in the transverse temporal gyrus, near the primary auditory cortex in the two subjects that underwent anatomical imaging (Fig. 6).
Dipole amplitudes of the MMNm to sIII confirmed a stronger neural response in experts than in rhythmically unskilled subjects (F = 8.17, P < 0.01, two-way ANOVA on the MGAs; factors: group of subjects and hemisphere), no overall effect of hemisphere (F = 1.06, P = 0.31), but an interaction between group and hemisphere (F = 8.46, P < 0.01).

Dipole amplitudes in the left hemisphere were significantly larger in experts than in rhythmically unskilled non-musicians (Table 3, P < 0.01), whereas no significant dipole amplitude difference between groups was observed in the right hemisphere. The asymmetry index confirmed that the neural response to sIII was predominantly left-lateralized in experts, but right-lateralized in rhythmically unskilled non-musicians (Fig. 5) (experts: P < 0.05; rhythmically unskilled non-musicians: P < 0.05; difference between groups: P < 0.002). Left hemisphere dipole amplitudes in experts significantly differed from the right hemisphere amplitudes in rhythmically unskilled non-musicians (P < 0.05), suggesting increased sensitivity to incongruities as distinguished from merely a change in hemispheric dominance.
Fig. 2 – Evoked responses. Typical time courses of averaged magnetic evoked responses from one rhythmically unskilled non-musician and one expert musician. Signals are recorded at the auditory cortex and plotted separately for sI, sII and sIII. Inspecting the ERFs to sI, we found no response (N1), probably due to refractoriness of the neurons. As a consequence of this, and also because of the strong and early ERFs produced by sII and sIII, difference waves between these and sI were in most cases visually similar to the curves produced by sII and sIII.
Fig. 3 – Grand means. Grand mean of the MGAs for experts and rhythmically unskilled non-musicians plotted for (a) sIII, (b)
sII, and (c) sI. (d) Control study for physical summation in sIII (six experts, five rhythmically unskilled non-musicians).
For sII, no dipolar sources could be obtained for rhythmically unskilled non-musicians. Left hemispheric MMNm dipole sources could be obtained in six out of nine expert subjects, but in only three out of nine experts in the right hemisphere. For experts there was no significant difference in the left hemisphere between localized sources of MMNms to sIII and sII in paired t-tests on each coordinate (<1 mm for each coordinate on average, SE < 4, P > 0.8), and the average distance between MMNms to sIII and sII was less than 8 mm, suggesting similar neuronal sources for MMNms to sII and sIII.

Fig. 4 – Mean MGAs of experts and unskilled subjects. Mean of the MGAs across hemispheres and peaks, for the two groups of subjects, for each sequence. Error bars denote standard error of mean.

3.3.2. P3am
P3am dipolar sources were localized in all experts but only
half of the rhythmically unskilled non-musicians. In experts
the difference between the localization of MMNm and P3am
sources was not significant in paired t-tests on each coordinate in each hemisphere. However the average distance was
2.0 cm (SE = 0.4) in the left hemisphere and 2.6 cm (SE = 0.6) in
the right. This reflects a large inter-subject variability in the
estimate of source localization of P3a. In most subjects it was
located near primary auditory cortex, but in some subjects it
appeared to have frontal localization while in others the
localization estimate was in temporo-parietal cortex. Due to
signal strength, it was not possible to make a meaningful
Table 2 – Mean gradient amplitudes (MGAs) for experts and rhythmically unskilled non-musicians.

                     Latency [ms]                  MGA [fT/cm]               tMGA         Asymmetry
                     Left           Right          Left         Right        [fT/cm]      index

sI, 100–170 ms
  Experts            148.8 (6.7)    131.0 (7.8)    10.1 (1.0)   11.6 (1.0)   21.8 (1.8)   0.068 (0.041)
  Unskilled          140.4 (7.2)    136.4 (7.7)    7.5 (0.6)    8.5 (1.1)    15.9 (1.4)   0.042 (0.066)
  All                144.6 (4.8)    133.6 (5.4)    8.9 (0.7)    10.1 (0.8)   19.0 (1.3)   0.056 (0.036)

sI, 170–250 ms
  Experts            206.8 (5.9)    201.0 (6.9)    8.6 (0.9)    12.2 (1.6)   20.8 (2.2)   0.152 (0.058)
  Unskilled          200.1 (9.2)    210.5 (8.3)    7.2 (0.5)    7.5 (0.5)    14.7 (0.7)   0.021 (0.044)
  All                203.6 (5.3)    205.5 (5.3)    7.9 (0.5)    10.0 (1.1)   17.9 (1.4)   0.090 (0.040)

sII, 100–170 ms
  Experts            130.4 (7.0)    139.8 (6.1)    22.0 (4.5)   14.8 (1.6)   36.7 (5.7)   0.139 (0.086)
  Unskilled          140.8 (6.1)    140.8 (9.1)    6.9 (0.6)    11.6 (1.4)   18.5 (1.4)   0.235 (0.071)
  All                135.3 (4.7)    140.3 (5.2)    14.9 (3.0)   13.3 (1.1)   28.2 (3.8)   0.037 (0.07)

sII, 170–250 ms
  Experts            204.2 (6.0)    198.4 (4.8)    14.7 (2.3)   14.7 (1.5)   29.4 (3.7)   0.017 (0.046)
  Unskilled          202.8 (11.0)   202.6 (10.9)   8.4 (0.7)    10.1 (0.6)   18.5 (0.9)   0.097 (0.050)
  All                203.5 (5.9)    199.8 (5.6)    11.8 (1.5)   12.5 (1.0)   24.3 (2.4)   0.055 (0.035)

sIII, 100–170 ms
  Experts            113.2 (4.6)    116.9 (3.5)    46.2 (5.1)   39.0 (4.5)   85.2 (8.0)   0.083 (0.065)
  Unskilled          141.4 (5.2)    115.4 (3.8)    25.4 (2.9)   36.4 (2.9)   61.8 (4.9)   0.193 (0.073)
  All                126.5 (4.9)    116.2 (2.5)    36.4 (3.9)   37.7 (2.7)   74.2 (5.5)   0.047 (0.058)

sIII, 170–250 ms
  Experts            182.7 (3.0)    192.2 (5.7)    29.2 (2.3)   31.8 (3.3)   61.0 (4.8)   0.036 (0.042)
  Unskilled          204.7 (14.8)   208.0 (7.7)    15.4 (3.2)   19.5 (3.1)   34.9 (5.5)   0.128 (0.106)
  All                193.1 (7.4)    199.6 (5.0)    22.7 (2.5)   26.0 (2.7)   48.7 (4.8)   0.079 (0.054)

Standard error of mean (SE) is denoted in parentheses. tMGA = MGA_left + MGA_right. The asymmetry index is calculated as (MGA_right − MGA_left)/(MGA_right + MGA_left).
Fig. 5 – Top, MGAs at the time of MMNm and the P3am, in the left and right hemisphere for experts and rhythmically
unskilled non-musicians. Bottom (left to right), asymmetry indices calculated on the basis of dipole amplitudes; MMNm
latency in experts and rhythmically unskilled non-musicians; asymmetry indices calculated on the basis of the MGAs.
Table 3 – Dipoles. Summary of localization, dipole moment (amplitude), goodness of fit and confidence volume of the neuronal sources for the MMNm and the P3am for experts and rhythmically unskilled non-musicians.

                      Left hemisphere                                           Right hemisphere
                      No. of    Dipole moment  Goodness    Confidence           No. of    Dipole moment  Goodness    Confidence
                      subjects  [nAm]          of fit [%]  volume [mm3]         subjects  [nAm]          of fit [%]  volume [mm3]

sIII
  MMN   Experts       9         62.7           90.9        42.4                 9         37.4           78.5        37.1
        Unskilled     8         25.7           89.1        62.9                 8         37.7           87.8        43.1
  P3a   Experts       9         36.3           77.5        101.7                9         40.1           81.9        94.5
        Unskilled     3         22.2           80.0        340.2                3         21.4           78.4        164.6

sII
  MMN   Experts       6         41.3           87.6        178.1                1         11.4           66.6        231.1
        Unskilled     –         –              –           –                    –         –              –           –
  P3a   Experts       5         24.3           79.6        242.1                3         18.6           86.9        188.1
        Unskilled     –         –              –           –                    –         –              –           –
triple dipole fit to the data. However, the observed pattern is in
accordance with P3am reflecting activity in a network
centered in the auditory cortex and encompassing also frontal
and parietal areas.
3.4.
Latencies
MMNm latency to rhythmic incongruity (sII and sIII) was shorter in experts than in rhythmically unskilled subjects (F = 5.20, P < 0.05, three-way ANOVA; factors: group of subjects, type of incongruence and hemisphere). There was a significant effect of type of incongruence (F = 14.99, P < 0.001), indicating earlier responses to stronger rhythmic incongruence and supporting our findings of larger MGAs to stronger rhythmic incongruence (see also Tiitinen et al., 1994). Furthermore, there was a significant interaction between group and hemisphere (F = 5.47, P < 0.05) (Table 2). No significant effects were found for the P3am.

MMNm latency to sIII showed a significant effect of group (F = 9.54, P < 0.01, two-way ANOVA; factors: group of subjects and hemisphere), a significant effect of hemisphere (F = 6.60, P < 0.05) and an interaction between group and hemisphere (F = 11.72, P < 0.01). For experts, MMNm latency in the left hemisphere was comparable (P > 0.25) to that of the right hemisphere. However, for rhythmically unskilled non-musicians, left hemisphere latency exceeded that of the right hemisphere (P < 0.05), and also that of the left hemisphere of experts (P < 0.001) (Fig. 6). Right hemisphere MMNm latency to sIII did not differ between groups (P > 0.5, Table 2).
Fig. 6 – Localization. Dipole localization of MMNms to sIII in one expert and one rhythmically unskilled non-musician (the same subjects as depicted in Fig. 2). Dipole directions are projected onto the individual coronal, axial and sagittal MR slices. The relative amplitude (left expert: 60 nAm) is represented by the size of the arrows.
3.5. Control experiment
One could argue that the peak in the ERF at 110–150 ms measured after the bass drum beat in sIII resembles an N1 (an obligatory negative deflection of the ERP/ERF, with a latency of around 100 ms, in response to the onset of a sound), due to physical summation of the bass drum beat and the immediately preceding snare drum in our experimental design. To address this, we ran a control condition in which 13 of the participants listened to the bass drum/hi-hat alternating with a preceding snare drum/hi-hat occurring 105 ms earlier. The distance between two bass drum beats in this condition was 420 ms, such that a simple 2/4 would be perceived as the basic meter and the 105 ms distance between the SD/HH and the BD/HH would be perceived as a 16th note. We observed only small or no responses to the bass drum beat preceded by the snare drum beat, and no robust dipolar field patterns in any of the participants, indicating that the contribution of physical summation was small compared to the effect of rhythmic incongruity (Fig. 3).
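The timing relations of this control condition can be checked with simple arithmetic (our sketch using the values quoted above; the variable names are ours, for illustration):

```python
# Timing of the control condition, using the values quoted in the text
# (variable names are ours, for illustration only).
bass_drum_interval_ms = 420  # distance between two bass drum beats
snare_lead_ms = 105          # SD/HH precedes the BD/HH by 105 ms

# In a simple 2/4 at this tempo the beat (quarter note) lasts 420 ms,
# so a 16th note lasts 420 / 4 = 105 ms: the snare falls exactly one
# 16th note before the bass drum, as described above.
sixteenth_note_ms = bass_drum_interval_ms / 4
assert sixteenth_note_ms == snare_lead_ms
print(sixteenth_note_ms)  # 105.0
```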
4. Discussion
Using MEG, we set out to (1) examine whether the components
of neuronal markers of perceived rhythmic incongruity would
be consistent with a ‘‘predictive coding’’ framework and (2) test
whether subjects’ competence affected the composition of the
neural response. We found event-related responses to strong
rhythmic incongruence (sIII) in all subjects, the MMNm peaking
at 110–130 ms and the P3am around 80 ms after the MMNm in
expert jazz musicians and some of the rhythmically unskilled
subjects, as well as responses to subtle rhythmic incongruence
(sII) in most of the expert musicians. The MMNms were localized to the auditory cortices, whereas the P3am showed greater
variance in localization between individual subjects. MMNms
of expert musicians were stronger in the left hemisphere than
in the right hemisphere in contrast to P3ams showing a slight
non-significant right lateralization. As we shall argue below,
we interpret MMNm and P3am as reflecting an error term
generated in the auditory cortex and its subsequent evaluation
in a broader network including generators in the auditory
cortex as well as higher level neuronal sources. This is in
keeping with expectations based on the predictive coding
theory and suggests that there is congruence between
perceptual experience of rhythmic incongruities and the way
that these are processed by the brain.
Furthermore, we found enhanced processing of rhythmic
deviants in expert musicians compared to rhythmically
unskilled non-musicians both at the level of the MMNm and the
P3am. The MMNm from the left hemisphere in experts to the highly rhythmically incongruent metrical violation had a much larger amplitude and peaked 30 ms earlier than in rhythmically unskilled non-musicians. In addition, we observed a left lateralization of the MMNm in experts compared to non-musicians for both subtle and strong metric violations, consistent with earlier
suggestions of music being left-lateralized in musicians
(Altenmuller, 2001; Bever and Chiarello, 1974; Ohnishi et al.,
2001), but no lateralization of the P3am. Expert musicians were
much more advanced rhythmically and metrically than the
rhythmically unskilled non-musicians. We therefore propose
that the difference in lateralization and strength of the MMNm between experts and rhythmically unskilled non-musicians reflects a stronger metrical predictive structure in the experts, which in turn affects the evaluation of the error, as reflected by the stronger P3am. This is in keeping with predictive coding theory, according to which the size of the error term depends on the prediction.
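In precision-weighted formulations of predictive coding, this dependence can be made explicit: the impact of a given deviation scales with the precision (inverse variance) of the prediction. The toy calculation below is our illustration, not the authors' analysis; the precision values are arbitrary.

```python
# Toy precision-weighted prediction error (our illustration only; the
# precision values are arbitrary). A sharper metric model, i.e. a
# prediction held with higher precision, yields a larger weighted error
# for the same physical deviation.
def weighted_error(observed, predicted, precision):
    """Prediction error scaled by the precision of the prediction."""
    return precision * (observed - predicted)

deviation = 1.0  # the same rhythmic incongruity for both listeners
expert = weighted_error(deviation, 0.0, precision=4.0)     # precise meter model
unskilled = weighted_error(deviation, 0.0, precision=1.0)  # vaguer meter model
print(expert, unskilled)  # 4.0 1.0
assert expert > unskilled
```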
4.1. Perception of rhythm and predictive coding
Music presents itself to the brain as an organised, extended
time series, and the significance of each event is played out
against this larger temporal structure of expectations, anticipations and tensions, provided by the meter. The interplay
between local events and global structure is of paramount
importance to music perception and it must, arguably, also be
reflected in the brain’s processing of music. The anticipatory
model, against which the brain compares incoming input,
must be generated on the spot, without predetermined prior
knowledge. One solution to this problem is a hierarchical
Bayesian framework that estimates causes through
a comparison between estimates and incoming events. In such
a ‘‘Bayesian inference machine’’ (Friston, 2002) units of the
brain are driven to minimize error by some mechanism which
implicitly renders posterior estimates of causes given the data.
The brain is therefore a system that is changed by differences between expected and actual input at all levels of its hierarchy of neuronal assemblies.
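As a minimal sketch (ours, not the authors' model), this error-driven updating can be written as a loop in which a higher-level estimate is revised by the prediction error until the mismatch is cancelled:

```python
# One-level sketch of error-driven updating (illustrative only, not the
# authors' model). mu is the higher-level prediction; x is the incoming
# input; the error term (x - mu) plays the role the text assigns to the MMN.
def predictive_coding_step(mu, x, lr=0.1):
    error = x - mu        # mismatch between input and prediction
    mu = mu + lr * error  # higher level revises its estimate
    return mu, error

mu = 0.0  # initial prediction
x = 1.0   # unexpected (incongruent) input
for _ in range(100):
    mu, error = predictive_coding_step(mu, x)
print(round(mu, 3))  # 1.0 -- the prediction has converged toward the input
```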
The MMN signal appears to have the properties of an error
signal in a predictive coding framework. MMNs have been elicited not only by deviations in physical parameters such as frequency (Sams et al., 1985), intensity (Naatanen et al., 1987), spatial localization (Paavilainen et al., 1989), and duration (Naatanen et al., 1989), but also by patterns with more abstract properties (Paavilainen et al., 2001; Van Zuijen et al., 2004). We observed MMNs when metric structure
was violated in such a way that the input could not be predicted by the meter established in the previous bars. Previously, MMNms have been elicited by disruptions of a regular sequence of identical sounds (Ford and Hillyard, 1981; Imada et al., 1993; Nordby et al., 1988). Takegata et al. (2001) have shown that the MMN increases in size when a strong regularity, created by two different sounds, is violated, compared with a situation where
the two different sounds do not establish a common temporal
framework. The present study extends these findings to
violations of metric anticipation structure created by the more
complex pattern of a drum sequence in keeping with the model
adjustment hypothesis of Winkler et al. (1996).
One of the strong claims in the predictive coding framework
is that ‘‘the specialization of any region is determined both
by bottom-up driving inputs and by top–down prediction’’
(Friston, 2002, p. 247). This suggests that an error term fed
upwards is a sign to higher areas that something in the
predictive process went wrong. We should therefore expect that
the error signal, generated locally, will produce effects integrating across brain levels. In the following, we shall argue that
the P3a may be an index of such integration across brain levels.
We observed a P3am, peaking at around 80 ms following
the MMN. The P3a is typically observed in passive listening
conditions with salient deviants, and is thought to reflect neural activity involved in involuntary attentional orienting. This should be more likely to occur in the present paradigm than in the more common MMN paradigms, in which subjects watch a video or read a book. Importantly, the task involved the last part of each stimulus and was temporally separated from the time-points at which the rhythmic deviations occurred.
The P3a has been linked to expectancy in general and
musical expectancy in particular and is sensitive to violations of metric (Jongsma et al., 2004), melodic (Trainor et al.,
2002) and harmonic (Janata, 1995) structure. The P3am
dipolar sources in the present study showed large inter-subject location variability. We could only localize neuronal
sources to the P3am evoked by sIII in expert musicians.
Under these conditions, the localizations of the MMNm and
the P3am did not differ significantly. However, when
comparing MMNm and P3am in individual subjects, some
had almost identical dipole localization estimates, while in other subjects the P3am dipole was located more frontally or parietally. This finding is compatible with the P3am
component reflecting activity in a larger network which ties
together components from auditory cortex with parietal and
frontal brain regions and where the relative strength of the
individual sources varies between individual subjects.
However, due to signal strength, we were not able to test this
hypothesis directly.
In contrast to the MMNm, the P3am in the experts was not
left-lateralized. It is therefore very unlikely that the P3am
reflects only local activity in the auditory cortex. Previously, frontal (Daffner et al., 2000; Schroger et al., 2000), auditory (Alho
et al., 1998; Opitz et al., 1999) and temporo-parietal (Downar
et al., 2000; Knight et al., 1989) sources have been suggested for
the P3a (see also Friedman et al., 2001). We therefore suggest
that, in contrast with the MMNm, localized in the auditory
cortices, the observed P3am reflects activity in a larger network
[Fig. 7 graphic: top, schematic contrasting feedforward processing with predictive coding between higher-level areas and the auditory cortex (prediction P, error/MMN, P3a, left and right hemispheres); bottom, asymmetry indices of dipole amplitude and MGA for the MMN and P3a in experts, for sII and sIII.]
Fig. 7 – Predictive coding. Top, the predictive coding model proposes a specific mode of interaction between lower level brain regions (here the auditory cortices) and higher level neocortical structures. Functional integration among brain systems that employ driving (bottom-up) and backward (top–down) connections mediates this adaptive and contextual specialization: higher level systems provide a prediction of the inputs to lower level regions, and lower regions respond to failures of prediction with an error term, which is propagated to higher areas. We suggest that the MMN in primary auditory cortex is an instance of such an error signal and that the P3a reflects a functional integration that links the auditory cortex with frontal and parietal brain regions. This allows the conflict between input and prediction to be resolved via changes in the higher level representations, until the mismatch is ‘‘cancelled’’. Bottom, the difference between the lateralized MMN and the bilateral P3am.
that ties together components from the auditory cortex with
parietal and frontal brain regions. This is what would be
expected if an error signal to a metric prediction were to have
any effects in a predictive coding framework. We hence
suggest that the P3am is an indication of a neural network that
acts on the error signal of the MMNm (Fig. 7).
In a recent study of harmonic violation, Koelsch et al.
(2000) found an early error signal, the frontally located ERAN
(peaking at around 180 ms), elicited by harmonically inappropriate chords in the authentic cadence. Trainor et al. (2002) found a frontal P3a peaking at 300–350 ms in response to violations of
melodic expectancy. We propose to extend our hierarchical
predictive coding framework also to this finding. The model
suggests that at a given level of processing, a mismatch
between prediction and actual activation generates an error
signal, which leads to an integration with higher areas. In
our experiment, the prediction error to the metric violation
originated in the auditory cortex. For more complex
patterns, according to the predictive coding model, the
processing, and hence the error signal, will move to ‘higher’
areas in the chain of auditory processing. Indeed, in the case
of harmonic expectancy the error signal appears to originate
in Broca’s area and its right hemisphere homologue (Maess
et al., 2001). Thus the ERAN identified by Koelsch et al. and the P3a identified by Trainor et al., just like the MMNm and the P3am found in our study, can be seen as prediction errors followed by an integration into a larger network, irrespective
of the different types of expectations violated. This suggests
that different anticipations set up in music, at least in terms
of metric and harmonic structures but probably also at other
levels, may interact with the brain in structurally similar
ways, by creating interplays of functional segregation and
integration.
The larger P3am and the larger MMNm on the left in
experts suggest that both the competence of the listener and
the strength of the musical violation determine whether attention is attracted to the stimulus. Interestingly, jazz musicians responded more strongly to the syncopation in sII than did non-musicians, even though they should be more familiar with such syncopations. We suggest that their expertise operates via a more precise predictive coding process, which relies both on a better top–down propagated model (the meter) of the expected stimuli and on more specific processing in the auditory cortex, particularly on the left.
At the core of the present study is a central discussion in
the theory and philosophy of music: the disagreement about
whether meter is caused by phenomenal accents (Meyer,
1956) in the musical pieces (inputs) or by mental structures
(Benjamin, 1984; Palmer and Krumhansl, 1990), not necessarily contained in the music as such. The differences
between brain responses of experts and rhythmically
unskilled non-musicians strongly indicate that the neural operations for meter processing are affected by musical competence and/or culture (Drake and Ben El, 2003). Music is thus best seen as a biocultural phenomenon (Cross, 2003), and the concept of meter is meaningful only in the interaction between music and subject, between the work as such and
the mental representation of the underlying musical structure. These mental structures are, however, far from fully
understood.
Acknowledgements
The MEG measurements were conducted at BioMag Laboratory, Helsinki Brain Research Centre, Helsinki University
Central Hospital, Finland, and the study was funded by the Danish National Research Foundation.
references
Alho K, Winkler I, Escera C, Huotilainen M, Virtanen J,
Jaaskelainen IP, et al. Processing of novel sounds and
frequency changes in the human auditory cortex:
magnetoencephalographic recordings. Psychophysiology, 35:
211–224, 1998.
Altenmuller EO. How many music centers are in the brain? Annals
of the New York Academy of Sciences, 930: 273–280, 2001.
Benjamin WE. A theory of musical meter. Music Perception, 1: 355–
413, 1984.
Bever TG and Chiarello RJ. Cerebral dominance in musicians and
nonmusicians. Science, 185: 537–539, 1974.
Bharucha JJ and Stoeckig K. Reaction time and musical
expectancy: priming of chords. Journal of Experimental
Psychology, Human Perception and Performance, 12: 403–410, 1986.
Blakemore SJ, Goodbody SJ, and Wolpert DM. Predicting the
consequences of our own actions: the role of sensorimotor
context estimation. Journal of Neuroscience, 18: 7511–7518, 1998.
Bregman AS. Auditory Scene Analysis: the Perceptual Organization of
Sound. Cambridge, Massachusetts: The MIT Press, 1990.
Cooper GW and Meyer LB. The Rhythmic Structure of Music. Chicago:
The University of Chicago Press, 1960.
Cross I. Music as a biocultural phenomenon. Annals of the New
York Academy of Sciences, 999: 106–111, 2003.
Daffner KR, Mesulam MM, Holcomb PJ, Calvo V, Acar D,
Chabrerie A, et al. Disruption of attention to novel events after
frontal lobe injury in humans. Journal of Neurology,
Neurosurgery and Psychiatry, 68: 18–24, 2000.
Dalla Bella S and Peretz I. Differentiation of classical music requires little learning but rhythm. Cognition, 96: B65–B78, 2005.
Downar J, Crawley AP, Mikulis DJ, and Davis KD. A multimodal
cortical network for the detection of changes in the sensory
environment. Nature Neuroscience, 3: 277–283, 2000.
Drake C and Ben El HJ. Synchronizing with music: intercultural
differences. Annals of the New York Academy of Sciences, 999:
429–437, 2003.
Escera C, Alho K, Schroger E, and Winkler I. Involuntary attention
and distractibility as evaluated with event-related brain
potentials. Audiology and Neuro-otology, 5: 151–166, 2000.
Escera C, Alho K, Winkler I, and Naatanen R. Neural mechanisms
of involuntary attention to acoustic novelty and change.
Journal of Cognitive Neuroscience, 10: 590–604, 1998.
Ford JM and Hillyard SA. Event-related potentials (ERPs) to
interruptions of a steady rhythm. Psychophysiology, 18: 322–
330, 1981.
Friedman D, Cycowicz YM, and Gaeta H. The novelty P3: an event-related brain potential (ERP) sign of the brain’s evaluation of novelty. Neuroscience and Biobehavioral Reviews, 25: 355–373, 2001.
Friston K. Beyond phrenology: what can neuroimaging tell us
about distributed circuitry? Annual Review of Neuroscience, 25:
221–250, 2002.
Frith C and Dolan RJ. Brain mechanisms associated with top–down
processes in perception. Philosophical Transactions of the Royal
Society of London. Series B, Biological Sciences, 352: 1221–1230, 1997.
Huron D. Music-engendered laughter: an analysis of humor
devices in PDQ Bach. In Proceedings of the Eighth International
Conference of Music Perception and Cognition, Evanston, IL, 2004.
Imada T, Fukuda K, Kawakatsu M, Masjiko T, Odaka K, Hayashi M, et al. Mismatch fields evoked by a rhythm passage. In Advances in Biomagnetism: The Ninth International Conference on Biomagnetism, Vienna, Austria, 1993: 118–119.
Janata P. ERP measures assay the degree of expectancy violation
of harmonic contexts in music. Journal of Cognitive Neuroscience,
7: 153–164, 1995.
Jones MR and Boltz M. Dynamic attending and responses to time.
Psychological Review, 96: 459–491, 1989.
Jones MR and Yee W. Sensitivity to time change: the role of
context and skill. Journal of Experimental Psychology, Human
Perception and Performance, 23: 693–709, 1997.
Jongsma ML, Desain P, and Honing H. Rhythmic context
influences the auditory evoked potentials of musicians and
nonmusicians. Biological Psychology, 66: 129–152, 2004.
Kernfeld B. The New Grove Dictionary. New York: St. Martin’s Press,
2002.
Kersten D, Mamassian P, and Yuille A. Object perception as
Bayesian inference. Annual Review of Psychology, 55: 271–304,
2004.
Knight RT, Scabini D, Woods DL, and Clayworth CC. Contributions
of temporal–parietal junction to the human auditory P3. Brain
Research, 502: 109–116, 1989.
Koelsch S, Gunter T, Friederici AD, and Schroger E. Brain indices
of music processing: ‘‘nonmusicians’’ are musical. Journal of
Cognitive Neuroscience, 12: 520–541, 2000.
Large EW and Kolen JF. Resonance and the perception of musical
meter. Connection Science, 6: 177–208, 1994.
Liegeois-Chauvel C, Peretz I, Babai M, Laguitton V, and Chauvel P.
Contribution of different cortical areas in the temporal lobes
to music processing. Brain, 121: 1853–1867, 1998.
Luria AR, Tsvetkova LS, and Futer DS. Aphasia in a composer (V.G.
Shebalin). Journal of the Neurological Sciences, 2: 288–292, 1965.
Maess B, Koelsch S, Gunter TC, and Friederici AD. Musical syntax
is processed in Broca’s area: an MEG study. Nature
Neuroscience, 4: 540–545, 2001.
Mehta MR. Neuronal dynamics of predictive coding.
Neuroscientist, 7: 490–495, 2001.
Mendez MF. Generalized auditory agnosia with spared music
recognition in a left-hander. Analysis of a case with a right
temporal stroke. Cortex, 37: 139–150, 2001.
Meyer L. Emotion and Meaning in Music. Chicago: University of
Chicago Press, 1956.
Naatanen R. Attention and Brain Function. London: Lawrence Erlbaum Associates, 1992: 102–211.
Naatanen R, Paavilainen P, Alho K, Reinikainen K, and Sams M.
The mismatch negativity to intensity changes in an auditory
stimulus sequence. Electroencephalography and Clinical
Neurophysiology, 40: 125–131, 1987.
Naatanen R, Paavilainen P, and Reinikainen K. Do event-related
potentials to infrequent decrements in duration of auditory
stimuli demonstrate a memory trace in man? Neuroscience
Letters, 107: 347–352, 1989.
Nordby H, Roth WT, and Pfefferbaum A. Event-related potentials
to time-deviant and pitch-deviant tones. Psychophysiology, 25:
249–261, 1988.
Ohnishi T, Matsuda H, Asada T, Aruga M, Hirakata M,
Nishikawa M, et al. Functional anatomy of musical perception
in musicians. Cerebral Cortex, 11: 754–760, 2001.
Oldfield RC. The assessment and analysis of handedness: the
Edinburgh inventory. Neuropsychologia, 9: 97–113, 1971.
Opitz B, Mecklinger A, Friederici AD, and Von Cramon DY. The
functional neuroanatomy of novelty processing: integrating
ERP and fMRI results. Cerebral Cortex, 9: 379–391, 1999.
Paavilainen P, Karlsson ML, Reinikainen K, and Naatanen R.
Mismatch negativity to change in spatial location of an
auditory stimulus. Electroencephalography and Clinical
Neurophysiology, 73: 129–141, 1989.
Paavilainen P, Simola J, Jaramillo M, Naatanen R, and Winkler I.
Preattentive extraction of abstract feature conjunctions from
auditory stimulation as reflected by the mismatch negativity
(MMN). Psychophysiology, 38: 359–365, 2001.
Palmer C and Krumhansl CL. Mental representations for musical
meter. Journal of Experimental Psychology, Human Perception and
Performance, 16: 728–741, 1990.
Peretz I and Coltheart M. Modularity of music processing. Nature
Neuroscience, 6: 688–691, 2003.
Peretz I, Kolinsky R, Tramo M, Labrecque R, Hublet C, Demeurisse G,
et al. Functional dissociations following bilateral lesions of
auditory cortex. Brain, 117: 1283–1301, 1994.
Rao RP and Ballard DH. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2: 79–87, 1999.
Roepstorff A and Frith C. What’s at the top in the top–down
control of action? Script-sharing and ‘top–top’ control of
action in cognitive experiments. Psychological Research, 68: 189–
198, 2004.
Russeler J, Altenmuller E, Nager W, Kohlmetz C, and Munte TF.
Event-related brain potentials to sound omissions differ in
musicians and non-musicians. Neuroscience Letters, 308: 33–36,
2001.
Sams M, Paavilainen P, Alho K, and Naatanen R. Auditory
frequency discrimination and event-related potentials.
Electroencephalography and Clinical Neurophysiology, 62: 437–448,
1985.
Schmuckler MA and Boltz MG. Harmonic and rhythmic
influences on musical expectancy. Perception and Psychophysics,
56: 313–325, 1994.
Schroger E, Giard MH, and Wolff C. Auditory distraction: event-related potential and behavioral indices. Clinical Neurophysiology, 111: 1450–1460, 2000.
Schultz W and Dickinson A. Neuronal coding of prediction errors.
Annual Review of Neuroscience, 23: 473–500, 2000.
Shepard RN. Perceptual-cognitive universals as reflections of the
world. Behavioral and Brain Sciences, 24: 581–601, 2001.
Signoret JL, Van Eeckhout P, Poncet M, and Castaigne P. Aphasia
without amusia in a blind organist. Verbal alexia-agraphia
without musical alexia-agraphia in braille. Revue Neurologique
(Paris), 143: 172–181, 1987.
Sloboda J. The Musical Mind. Oxford: Oxford University Press, 1985.
Takegata R, Syssoeva O, Winkler I, Paavilainen P, and Naatanen R.
Common neural mechanism for processing onset-to-onset
intervals and silent gaps in sound sequences. Neuroreport, 12:
1783–1787, 2001.
Tiitinen H, May P, Reinikainen K, and Naatanen R. Attentive
novelty detection in humans is governed by pre-attentive
sensory memory. Nature, 372: 90–92, 1994.
Tononi G and Edelman GM. Consciousness and the integration of
information in the brain. Advances in Neurology, 77: 245–279,
1998.
Trainor LJ, McDonald KL, and Alain C. Automatic and controlled
processing of melodic contour and interval information
measured by electrical brain activity. Journal of Cognitive
Neuroscience, 14: 430–442, 2002.
Van Zuijen TL, Sussman E, Winkler I, Naatanen R, and Tervaniemi M. Grouping of sequential sounds – an event-related potential study comparing musicians and nonmusicians. Journal of Cognitive Neuroscience, 16: 331–338, 2004.
Vuust P. Polyrhythm and Metre in Modern Jazz – a Study of the Miles Davis Quintet of the 1960s (in Danish). Aarhus, Denmark: Royal Academy of Music, 2000.
Vuust P, Pallesen KJ, Bailey C, Van Zuijen TL, Gjedde A, Roepstorff A, et al. To musicians, the message is in the meter: pre-attentive neuronal responses to incongruent rhythm are left-lateralized in musicians. Neuroimage, 24: 560–564, 2005.
Vuust P, Roepstorff A, Wallentin M, Mouridsen K, and
Ostergaard L. It don’t mean a thing. Keeping the rhythm
during polyrhythmic tension, activates language areas (BA47).
Neuroimage, 31: 832–841, 2006.
Winkler I, Karmos G, and Naatanen R. Adaptive modeling of the
unattended acoustic environment reflected in the mismatch
negativity event-related potential. Brain Research, 742: 239–252,
1996.
Wolpert DM, Doya K, and Kawato M. A unifying computational
framework for motor control and social interaction.
Philosophical Transactions of the Royal Society of London. Series B,
Biological Sciences, 358: 593–602, 2003.