© 2019 American Psychological Association 2019, Vol. 29, Nos. 2-3, 90 –99
0275-3987/19/$12.00 http://dx.doi.org/10.1037/pmu0000238
The Content and Functions of Vivid and Soothing Visual Imagery During
Music Listening: Findings From a Survey Study
Studies have suggested that visual imagery forms an important part of the listening experience and
might be one of the mechanisms by which music induces emotions in a listener. However, little is
known about the content, prevalence, and function of visual imagery during music listening. To that
end, an online survey was constructed to explore music-related visual imagery. It included 24
statements about visual imagery based on previous research and an open question regarding the
content of participants' inner images. Several standardized questionnaires (Vividness of Visual Imagery
Questionnaire and Goldsmiths Musical Sophistication Index) were included as well to investigate the
link to visual imagery in non-musical contexts and across individuals with various levels of musical
training. In total, 669 participants responded to the survey. A factor analysis of the
music-related visual imagery statements yielded a three-factor structure consisting of vivid, soothing,
and disruptive visual imagery, although the factor structure differed between the
musically trained and untrained respondents. Separate analyses of factors for musically trained and
untrained participants yielded a more parsimonious structure of visual imagery, which consisted of
vivid and soothing visual imagery. These two factors consistently exhibited different weights across
the items; for musically trained participants, vivid imagery was more related to modulating arousal
than for untrained participants. The ability to conjure up vivid visual imagery was only weakly
related to the presence of music-related visual imagery. A content analysis of the open question
revealed common themes that related to a mixture of concrete visual imagery (landscapes, images
of people, and scenes from past events) and abstract visual imagery (shapes, objects, and colors).
Implications of these findings for further studies on music-induced emotions are discussed with a
focus on a recent constructionist account of emotional meanings in music.
Keywords: music listening, visual imagery, emotion, musical training, online survey
Music is present in all cultures, and most individuals experience moments in which they are moved by music. People can experience powerful emotional responses to music triggered by a song of their favorite band, a special performance or occasion, a personal memory associated with a particular piece, or images they conjure up during listening. For instance, individuals may visualize internal images that consist of pictorial representations (e.g., landscapes, people, or past events) or embodied image-schemata (e.g., images of ascending or descending motion). Juslin and Västfjäll (2008) proposed that visual imagery is one of six mechanisms by which music induces emotions in the listener, the other mechanisms being brain stem reflex, evaluative conditioning, emotional contagion, episodic memory, and musical expectancy. The latest version of this framework also includes rhythmic entrainment and aesthetic judgment (Juslin, 2013). Juslin (2013) suggested that visual imagery generally develops during preschool years and is strongly influenced by culture and learning. Unlike other emotion-induction mechanisms such as the brain stem reflex or evaluative conditioning, which are both automatic and subliminal, visual imagery in response to music is typically experienced consciously and therefore should be able to be changed or suppressed by the individual. Before reviewing the relatively sparse literature on music and visual imagery, it is important to note that music is
Research on cross-modal mappings of music (Eitan, 2017),
graphical representations of music (Küssner, 2013; Tan & Kelly,
nected to an overarching narrative) have been shown in several studies (Boltz, Ebendorf, & Field, 2009; Goldberg, Chattopadhyay, Gorn, & Rosenblatt, 1993), and there is strong evidence that a combination of music and image elicits emotions (Eldar, Ganor, Admon, Bleich, & Hendler, 2007; Geringer, Cassidy, & Byo, 1997; Vines, Krumhansl, Wanderley, & Levitin, 2006). Remarkably, Eldar et al. (2007) were able to show that combining music and film elicited increased activity in brain regions associated with emotional processing, whereas music alone did not evoke a differential response in these areas.
Visuals, or indeed visual imagery, in combination with music have also been used in therapeutic contexts. Most research has been carried out within music therapy (Band, Quilter, & Miller,
cians (but see also Aleman, Nieuwenstein, Böcker, & de Haan, 2000). However, specific expertise in a perceptual domain does not necessarily lead to enhanced vividness of visual imagery in that domain. Sunday, McGugin, and Gauthier (2017) showed that domain-specific imagery (tested with car experts) correlates with general vividness of imagery but not with perceptual or semantic expertise. Whether or not the same lack of domain-specific imagery abilities would be found in music experts still remains to be tested.
To establish a more solid empirical foundation for such studies, we must first investigate the exact nature of visual imagery during music listening because attempts to systematically explore the contents of music-related visual imagery (MVI) have not been made. In fact, most previous research has focused on visual im-
Aims
Our study on visual imagery during music listening therefore had three main goals:
(1) To estimate the prevalence of visual imagery during music listening and collect detailed insights into the different kinds of visual imagery people experience while listening to music.
(2) To explore how visual imagery in response to music is different from visual imagery in general.
(3) To investigate how visual imagery correlates with domain-specific skills (Sunday et al., 2017).
Participants and Procedure
Participants were obtained from two sources. A convenience sample of 169 respondents was recruited via professional mailing lists and social media (e.g., Twitter). A representative sample (N = 500) of people living in the United Kingdom was obtained from Dalia Research (https://daliaresearch.com). We decided to pool the samples for the purposes of the analyses because we did not see any major differences in their background. Both samples were collected with identical sets of questions, except for Item 19, detailed later, which had a minor wording change,¹ and one additional question about the prevalence of visual imagery during music listening.² The age range of the pooled sample was 18 to 79 years (M = 29.98 years, SD = 9.58 years); 381 (56.95%) respondents were female. Ethical approval for this study was obtained from the Ethics Committee of the Faculty of Humanities and Social Sciences at Humboldt University Berlin.
Questionnaire
The questionnaire consisted of existing instruments and 24 items designed to probe MVI that were developed based on previous studies on visualizations of music (Küssner, 2013; Küssner & Leech-Wilkinson, 2014). These 24 statements are given in full in the Appendix. We also included the Vividness of Visual Imagery Questionnaire (VVIQ; Marks, 1973), as it has been a successful self-report tool for capturing differences in people’s ability to create visual imagery (Balteş & Miu, 2014; Campos, 2011). In addition, we asked the respondents about their active and passive engagement with various forms of art using questions such as attendance at dance performances/plays or art exhibitions. To capture their musical engagement and sophistication, we utilized the Goldsmiths Musical Sophistication Index (Gold-MSI; Müllensiefen, Gingras, Musil, & Stewart, 2014) with its five subscales: Active Musical Engagement, Self-Reported Perceptual Abilities, Musical Training, Self-Reported Singing Abilities, and Sophisticated Emotional Engagement With Music. We also asked the respondents for basic demographic information (age, gender, and education) and music preferences (using a list of genres). Finally, we provided an open-ended question about the content of MVI (Footnote 2).
¹ The wording of Item 19 was changed from “The images I conjure up during music listening occur spontaneously” to “The images in my mind’s eye during music listening occur spontaneously” to account for the fact that spontaneously occurring visual imagery does not involve the active process of conjuring up images.
² Because the prevalence could not be measured meaningfully with a convenience sample, we only included this question (“Have you ever experienced visual imagery (i.e. images in your mind’s eye) while listening to music?”) in the Dalia survey.
Results
Analysis Strategy
The analysis was broken down into four sections that explored the topic in an incremental fashion; the first addressed the prevalence of visual imagery in music listening using the representative sample, the second explored the structure of the visual imagery itself using the pooled sample of all participants and exploratory factor analysis (EFA) and confirmatory factor analysis (CFA), the third scrutinized the impact of musical expertise on the discovered structures of visual imagery, and the last section delved deeper into the contents of the MVI by summarizing the qualitative responses provided by the participants to an open question about the contents of MVI.
Visual imagery seems to be a common feature of music listening. Of 500 respondents from our representative sample, 77.20% (n = 386) indicated that they have experienced visual imagery during music listening before, whereas 22.80% (n = 114) reported that they have not experienced visual imagery.
Structure Within the MVI Statements
We applied robust procedures for establishing the plausible structures within the 24 items representing MVI. Factor analysis is the most commonly used technique to discover the potential structure within a correlated set of items. To maximize the robustness and generalizability of the structure to be discovered, we first carried out EFA to identify the potential structure (Fabrigar, Wegener, MacCallum, & Strahan, 1999). Subsequently, we assessed the robustness of the found solution with CFA, which computes the fit of the proposed structure derived from the EFA using a separate subset of the data (Schreiber, Nora, Stage, Barlow, & King, 2006). To apply these analysis stages to our data, we randomly divided the full sample into a training (60%) and test set (40%) and used the training set for the EFA and the test set for the CFA.
In the initial phase, the factorability of responses across the statements was assessed using the Kaiser–Meyer–Olkin measure of sampling adequacy, yielding a robust indicator of factorability (0.86), although one item fell below the recommended threshold of 0.60 and was eliminated (Item 20: “The images I conjure up during music listening only last a few moments”). The optimal number of factors to extract was determined with several methods: Velicer’s Minimum Average Partial criterion (Zwick & Velicer, 1986), components from parallel analysis (Horn, 1965), and very simple structure (Revelle & Rocklin, 1979), as suggested by Ruscio and Roche (2012). This comparison of the optimal factors offered
agreement among the methods for three components (as suggested by parallel analysis and very simple structure). A factor analysis with oblimin rotation was utilized to increase the interpretability of the loadings. This model explained 37% of the variance and obtained a decent fit to the data (root mean square error of approximation [RMSEA] = 0.074, RMSEA 90% confidence interval [CI] [0.066, 0.077], df = 187, χ² = 507.97). Factor loadings less than .45 and loadings on multiple factors served as criteria for removing items from the next round of analyses. The items removed were Items 4, 5, 6, 9, 13, 16, 19, and 21 (see Appendix for the full set of MVI items).
The obtained factor structure was applied to the test set to estimate the fit of the model. In this CFA, we utilized the lavaan package of R (Version 0.5–23.1097) with a robust maximum likelihood-based estimator, which corrects for nonnormality, and used χ² statistics and two recommended fit indices, RMSEA and comparative fit index (CFI). According to Hu and Bentler (1999), the desirable cutoff value for the RMSEA is .06 (lower values indicating better fit), and the CFI should be above .95. To improve the CFA model, we eliminated the items that achieved loadings < .50 with the covariance matrix (Ximénez, 2009).
Table 1 displays the factor structure with the loadings from EFA and CFA. The first factor refers to active and vivid visual imagery (Vivid Visual Imagery) obtained with any recorded or live music (Items 1 and 2) and their reverse variant (Item 3). The imagery is abstract (Item 14) or dynamic (Item 17), and it can happen with both eyes open (Item 23) and closed (Item 24). The second factor captures the emotional outcome of the process (Soothing Visual Imagery), namely, whether visual imagery makes the participants feel relaxed (Item 10) or calm (Item 11). The third factor could be interpreted to be a factor in which visual imagery is largely undesirable (Disruptive Visual Imagery): the participant is trying to suppress the imagery (Item 7), the imagery bothers them (Item 15), and such imagery seems to be static as well (Item 16). The loadings presented in Table 1 are not especially high for either of the models (EFA or CFA), but the magnitude of the loadings across the training and test set is consistent (an average deviation of ±0.085). The similarities between the models suggest that the CFA model is close to being acceptable, but strictly speaking, the model fails to meet the criteria for both measures of fit with CFI of 0.945 and RMSEA of 0.084 (see Table 2 for the full fit indices of the model).
Table 1
Loadings From the Exploratory Factor Analysis and Confirmatory Factor Analysis
Table 2
The Fit Indices of the Confirmatory Factor Analysis Models
Model                                         χ²      df    CFI     RMSEA [90% CI]
Overall model (n = 669): three factors        68.88   24    .945    .084 [.061, .107]
Musically untrained (n = 450): two factors    10.01    8    .989    .037 [.001, .100]
Musically trained (n = 219): two factors       7.07    4    .988    .094 [.001, .205]
Note. CFI = comparative fit index; RMSEA = root mean square error of approximation; CI = confidence interval.
Structure Within Participants: Musical Training
An alternative explanation for our results from the first model could be that the structure is masked by interindividual differences. For instance, some people might find visual imagery to be a problematic part of listening, as Factor 3 in the first analysis suggested, while others have a contrary experience and feel rather stimulated by it. Participants with musical training or those heavily engaged in music may actually want to experience visual imagery in conjunction with the listening, and hence Factors 1 and 3 might inherently create conflicts across the responses given by participants with diverse interests and preferences or different levels of musical training. To explore the role of musical training more fully, an analysis of variance was applied to the CFA factor scores to examine the impact of the Gold-MSI variables and of engagement with other arts (fiction, visual arts, dance, theater). The scores of the first factor yielded a significant main effect of Training, F(1, 657) = 33.63, p < .001, ηG² = .003; Perceptual, F(1, 657) = 16.44, p < .001, ηG² = .006; Emotions, F(1, 657) = 29.64, p < .001, ηG² = .039; and Engagement with Fiction, F(1, 657) = 9.87, p < .01, ηG² = .002. The second factor showed significant effects with Perceptual, F(1, 657) = 14.10, p < .001, ηG² = .005; Training, F(1, 657) = 37.67, p < .001, ηG² = .005; Emotions, F(1, 657) = 19.29, p < .001, ηG² = .029; and Engagement with Fiction, F(1, 657) = 5.43, p < .05, ηG² = .002. The scores for the third factor showed a similar pattern of main effects: Active, F(1, 657) = 13.28, p <
.001, ηG² = .002; Perceptual, F(1, 657) = 59.49, p < .001, ηG² = .015; Singing, F(1, 657) = 6.45, p < .001, ηG² = .000; Emotions, F(1, 657) = 27.03, p < .001, ηG² = .023; and Engagement with Theater, F(1, 657) = 12.87, p < .001, ηG² = .001. These links between musical training and factor structures suggest that the actual factors themselves may be slightly different depending on the level of musical training of the participants.
To address the dependence of factor structure on musical training, we applied cluster analysis to all normalized Gold-MSI variables. Two optimal clusters were identified by using a silhouette technique (Rousseeuw, 1987) that utilizes bootstrapping (1,000 draws) to examine the clustering quality. The first cluster (n = 450) contains the musically untrained participants (who score low on the Gold-MSI subscale Musical Training: M = 16.81, SD = 7.88) compared with the participants in the second cluster (n = 219), who seem to be musically trained (Gold-MSI score on Musical Training: M = 33.01, SD = 9.12; see Figure 1 for an illustration of the clusters and Gold-MSI variables).
Figure 1. Descriptive summary of the two clusters (1: musically untrained, 2: musically trained) based on Goldsmiths Musical Sophistication Index variables.
Having discovered the internal structure that divides the participants into two clusters, we can now reestimate the factor solutions separately for both subsets, using similar 60%/40% training and testing subsets, which leaves us with 131 musically trained participants to build the EFA model and 88 to test the model. For musically untrained participants, the EFA suggests two factors, possibly labeled as Vivid Visual Imagery and Soothing Visual Imagery, in which the vivid factor has items such as Item 18, “I see images in my mind’s eye whenever I listen to music,” Item 23, “I often conjure up images while listening to music with eyes open,” and Item 5, “The images I conjure up during music listening are one of the main reasons why I listen to music.” More importantly, this model provides a very good fit with the unseen data (RMSEA = 0.037, RMSEA 90% CI [0.001, 0.100], df = 8, χ² = 10.01). When an identical analysis is conducted for musically trained participants, this also yields a two-factor solution, in which both factors refer to emotions. The first one relates to vivid visual imagery (Item 5: “The images I conjure up during music listening are one of the main reasons why I listen to music”, Item 12: “The images I see in my mind’s eye when listening to music make me feel excited”, Item 13: “The images I see in my mind’s eye when listening to music make me feel energetic”) and the second one to soothing visual imagery (Item 10: “The images I see in my mind’s eye when listening to music make me feel relaxed”, Item 11: “The images I see in my mind’s eye when listening to music make me feel calm”) (Figure 2).
Figure 2. Confirmatory factor analysis models for musically trained and untrained participants.
In essence, the revised analysis suggests that there is a general structure within the MVI questions that relates to vivid and soothing visual imagery, despite this structure taking different forms across participants with varying amounts of musical training. Whereas vivid visual imagery is one of the main reasons for music listening in both musically trained and untrained participants, it serves the additional purpose of increasing arousal for musically trained participants in listening situations by making them feel more energetic and excited. However, for musically untrained individuals, these vivid images are always present as an integral part of the listening experience, but do not seem to modulate the listeners’ emotions explicitly. Furthermore, soothing visual imagery is more similar across both groups. It makes both musically trained and untrained participants feel more relaxed and calm, while also enabling untrained individuals to dive into a different world and detach from everyday life.
To test whether vividness of visual imagery and music-related visual imagery utilize the same latent constructs, the CFA scores for each group were correlated with the VVIQ scores as well as the scores of the five facets of Gold-MSI (Table 3). This analysis shows small albeit consistent positive correlations between the VVIQ and the MVI factors (we have reversed the original VVIQ scale to be more intuitive so that low scores in VVIQ indicate lower vividness), mainly for the MVI factor Vivid Visual Imagery in musically untrained and trained participants (r = .15 and r = .29, respectively). The latter correlation indicates about 8% overlap (r² ≈ .08) between vividness of visual imagery and active musical visual imagery constructs, which is worth noting, but does not give reason to assume that the two measures are utilizing the same ability. Musical training seems to influence the relationship between vividness of visual imagery and the active musical visual imagery. Although this suggests that the association between vividness of visual imagery and active musical visual imagery has to do with expertise rather than differences in ability, the direction of this influence cannot be determined with the present design.
Interestingly, most facets of the Gold-MSI and both MVI factors exhibit small correlations among the participants that possessed more formal musical training or exhibited otherwise moderate to high engagement with music. The MVI factor Soothing Visual Imagery seems to be associated more strongly with the Musical Training, Perceptual, and Active components in the Gold-MSI than the Vivid Visual Imagery factor is. The pattern between musical sophistication and MVI factors suggests that MVI is related to expertise, and particularly, the active emotion regulation with imagery might show this relationship.
In addition, activities with other arts were only weakly or not at all correlated with the MVI factor scores (Active Arts–Fiction, |r| < .13, ns; Active Arts–Plays and Theater, |r| < .10, ns; Active Arts–Visual Arts and Vivid Visual Imagery, r = .24, p = .001, but with the other factor, r = .02, ns). The only statistically significant correlation occurred between the Vivid Visual Imagery factor scores and the amount of activities in visual arts. This weak association merely suggests that participants engaging in visual arts have higher scores in the Vivid Visual Imagery factor, although the directionality of this relationship remains to be explored.
Content of MVI
Asking participants to rate the relevance of more than 600 expressions related to visual imagery and metaphors, Schaerlaeken, Glowinski, Rappaz, and Grandjean (2019) were able to show that the experience of visual imagery during music listening could be characterized by five factors: Flow, Movement, Force, Interior, and Wandering. Although this study delineates a potential structure underlying the (affective) experience of visual imagery during music listening, it does not provide the content or nature of the inner images themselves. We therefore asked our participants in an open-ended question to describe the images they conjure up during music listening in as much detail as possible. The most frequent type of visual image that occurs during music listening (33 out of 169) was a landscape or a scene from nature, followed
by some autobiographical scene or event from the past (28) and images of people (28). Participants also often imagined a musical performance (21), including details of the performer and the performance venue. Another theme appearing frequently is an image of oneself (17), whether as a performer, member of the audience, or character in a (fictive) narrative. Needless to say, the image of oneself being a world-class performer belongs, for the vast majority, just as much to an imaginary world as does seeing oneself flying over a landscape or into outer space.
However, many people experience distinctly abstract visual imagery. Different colors and shades (20), animated shapes (20), and geometric objects and patterns (10) were frequently reported by participants. Musically trained participants, perhaps not surprisingly, often see images related to the musical structure, such as melodic or instrument lines (five), the musical score (four), harmony (two), and tempo (one).
It is important to note that these emerging themes are not mutually exclusive; that is, individuals can experience a mix of concrete and abstract visual imagery, and several categories can be present at once or consecutively. What is more, certain musical pieces may consistently elicit the same (autobiographical) image, whereas inner images evoked by other musical excerpts may vary from one listening situation to another. These differences can occur both within and across participants. It should also be noted that the abovementioned themes of MVI occur across both Vivid Visual Imagery and Soothing Visual Imagery. Both types of visual imagery contain a multiplicity of concrete and abstract images that are shaped by highly personal experiences and associations.
In sum, visual imagery during music listening is a highly flexible and idiosyncratic phenomenon. Consistent mappings between music and images over time might occur in special cases (i.e., synaesthesia) but are rather an exception. Free associations, both concrete and abstract, that draw on experiences from everyday life are much more common. Most images are ephemeral, dynamic, evocative, and/or reflective: they allow the mind to wander from one image to the next, zoom in and out of the music (or the images), and often seem to serve the purpose of modulating emotional experience by either decreasing or increasing arousal.
Table 3
Correlations Between the Music-Related Visual Imagery Factor Scores and Vividness of Visual Imagery Questionnaire Across Musically Trained and Untrained Participants
                             Musical training
Questionnaire/Factor         Untrained    Trained
VVIQ
  MVI Vivid                   .15**        .29***
  MVI Soothing                .12*         .18**
Gold-MSI Training
  MVI Vivid                  -.06         -.34
  MVI Soothing               -.06         -.44***
Gold-MSI Perceptual
  MVI Vivid                   .08         -.18**
  MVI Soothing                .02         -.28***
Gold-MSI Emotional
  MVI Vivid                   .11          .28***
  MVI Soothing                .11          .23***
Gold-MSI Singing
  MVI Vivid                   .03         -.06
  MVI Soothing               -.08         -.08
Gold-MSI Active
  MVI Vivid                   .22**       -.13*
  MVI Soothing                .16*        -.19**
Note. VVIQ = Vividness of Visual Imagery Questionnaire; Gold-MSI = Goldsmiths Musical Sophistication Index; MVI = music-related visual imagery.
* p < .05. ** p < .01. *** p < .001.
Discussion
We present the first systematic study of the content of visual imagery during music listening in a representative sample using a new set of independent questionnaire items to study this experience. Our first aim was to find out how common visual imagery is and to investigate the different kinds of inner images that individuals commonly experience when listening to music. We established that MVI is frequently experienced; 77% of a representative sample indicated that they have seen images in their mind’s eye during music listening. Using a combination of cluster and factor analyses, we were able to further demonstrate that MVI consists of two components: Vivid Visual Imagery and Soothing Visual Imagery. The latter relates to emotional effects of imagery that usually pertain to relaxing and calm emotions, and seems to be marginally influenced by musical training. Vivid Visual Imagery, the other component, was one of the fundamental reasons for music listening across all participants. Still, vivid visual imagery does seem to be experienced slightly differently in those with extensive musical training as opposed to those without, with a link between vivid visual imagery and energizing emotional experiences being present only in the former subgroup.
Our second aim was to explore the extent to which visual imagery in response to music is different from visual imagery in general. Although there were small correlations between the vividness of visual imagery ability and MVI factors, the overlap between the two was negligible. This suggests that separate processes may govern generic visual imagery and MVI.
Our third aim was to investigate how visual imagery correlates with domain-specific skills such as musical training. Our observation that MVI is influenced by musical training, whereas the vividness of visual imagery is little affected by training (Sunday et al., 2017), indirectly supports this line of thought. In addition, the impact of musical training on visualizations of sound and music has been shown before (Küssner & Leech-Wilkinson, 2014). Whereas in that study, musically trained individuals showed a smaller range of audiovisual mapping strategies than musically untrained participants, our results indicate that both trained and untrained respondents show a great diversity of visual imagery. However, the functional uses of visual imagery may differ depending on the amount of musical training. Our results indicate that musically trained individuals seem to use inner images to modulate their level of arousal, either by increasing it via vivid visual imagery or decreasing it via soothing visual imagery. Further research should investigate the differences between voluntary and involuntary visual imagery in a musical context in order to better understand visual imagery’s utility (see also Taruffi & Küssner, 2019). Although some music may automatically trigger culturally mediated iconic imagery (e.g., a fanfare suggesting the image of a hunt), the evocation of certain images is of course highly mediated
Appendix
Twenty-Four Items of the Music-Related Visual Imagery Instrument and the Mean Ratings
Item M (SD)
10. The images I see in my mind’s eye when listening to music make me feel relaxed. 4.94 (1.14)
11. The images I see in my mind’s eye when listening to music make me feel calm. 4.89 (1.10)
12. The images I see in my mind’s eye when listening to music make me feel excited. 4.76 (1.18)
13. The images I see in my mind’s eye when listening to music make me feel energetic. 4.64 (1.24)
14. When I listen to music I see abstract figures and shapes in my mind’s eye. 3.87 (1.37)
15. I am often bothered by the images I see in my mind’s eye when listening to music. 2.77 (1.27)
16. The images I see in my mind’s eye when listening to music are static. 3.16 (1.17)
17. The images I see in my mind’s eye when listening to music are dynamic. 4.70 (1.19)
18. I see images in my mind’s eye whenever I listen to music. 4.22 (1.36)
19. The images in my mind’s eye during music listening occur spontaneously. 4.87 (1.11)
20. The images I conjure up during music listening only last a few moments. 3.99 (1.18)
21. The images I conjure up during music listening are often substituted by new images. 4.21 (1.18)
22. I often conjure up concrete scenes (e.g., landscapes, people, etc). 4.57 (1.39)
23. I often conjure up images while listening to music with eyes open. 4.63 (1.37)
24. I often conjure up images while listening to music with eyes closed. 4.80 (1.35)
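
As a rough arithmetic check on two quantities reported in the Results, the following minimal Python sketch recomputes the RMSEA values of Table 2 and the "about 8% overlap" derived from r = .29. It is an illustration under stated assumptions, not the authors' code: it uses the conventional point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))), takes N to be the 40% test subset of each group (268 and 180 are assumed from the stated split; 88 is given in the text), and ignores any scaling applied by the robust estimator, so agreement with the reported values can only be approximate. The function names are hypothetical.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Conventional RMSEA point estimate from chi-square, df, and sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def shared_variance(r: float) -> float:
    """Proportion of variance shared by two variables with correlation r."""
    return r ** 2


# (label, chi-square, df, test-subset size)
models = [
    ("Overall model, three factors", 68.88, 24, 268),       # assumed: about 40% of 669
    ("Musically untrained, two factors", 10.01, 8, 180),   # assumed: 40% of 450
    ("Musically trained, two factors", 7.07, 4, 88),       # stated in the text
]

for label, chi2, df, n in models:
    print(f"{label}: RMSEA = {rmsea(chi2, df, n):.3f}")

# Overlap between VVIQ and MVI Vivid for the trained group (r = .29)
print(f"Shared variance for r = .29: {shared_variance(0.29):.1%}")
```

Under these assumptions the sketch gives RMSEA values that round to .084, .037, and .094, in line with Table 2, and a shared variance of roughly 8% for r = .29.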