Adami Multimodality Overview
Multimodality
Elisabetta Adami
Abstract
The chapter reviews the growing field of multimodality in relation to the study of language, text and
communication and it traces the developments of multimodality as a field of research, along with the
extant theoretical approaches to multimodal analysis. The chapter further discusses and exemplifies key
notions of a social semiotic perspective to multimodal analysis and mentions potentials and limitations,
pointing to future directions of research in the field. Rather than a comprehensive review of extant
studies in multimodality, the chapter discusses selected key assumptions, topics and analytical
developments in multimodal research that are relevant to its relation with language and society.
Multimodality is a concept introduced and developed in the last two decades to account for the
different resources used in communication to express meaning. The term is used both to describe a
phenomenon of human communication and to identify a diversified and growing field of research. As a
or modes, in texts and communicative events, such as still and moving image, speech, writing, layout,
As a phenomenon of communication, the term is used not only by multimodal analysts, but also, and
increasingly so, by works in disciplines concerned with texts and meaning, such as linguistics and
communication studies, all of which however tend to devote their analytical focus to language.
Within the field of multimodal studies (O'Halloran and Smith 2011), the phenomenon of
2011), all hinging on four key assumptions (Jewitt 2014a), namely that (a) all communication is
multimodal; (b) analyses focused solely or primarily on language cannot adequately account for
meaning; (c) each mode has specific affordances arising from its materiality and from its social
histories which shape its resources to fulfill given communicative needs; and (d) modes concur
together, each with a specialized role, to meaning-making; hence relations among modes are key to
multiplicity of modes, all of which have been socially developed as resources to make meaning. Modes
such as gesture, sound, image, colour, or layout, for example, are conceived as sets of organized
resources that societies have developed each to a greater or lesser level of articulation in different
social groups to make meaning and to express and shape values, ideologies, and power relations.
When in combination with speech and/or writing, they are not a mere accompaniment of, or support to,
verbal language, as labels such as para-/extra-linguistic or non-verbal might suggest; rather, each
concurs with a specific functional load to the meaning made by the overall text and as such they
deserve attention.
All communication is, and has always been, multimodal (Kress and van Leeuwen 1996). Be it either
than one mode to make meaning. This might sound today like a commonplace; yet historically, the
dominant role attributed to verbal language, and the mode of writing especially, has overshadowed the
multiplicity of resources shaped socially to communicate. This has meant not only that societies have
developed the resources of speech and writing at a particularly high level of articulation, but also that
research and education have focused almost exclusively on the development of descriptions
and the teaching of prescriptions and conventions for the use of language. As a result, the investigation
of other modes has been restricted to specialized fields, such as musicology, the arts etc.
In recent years, the social impact of digital technologies for text production, among other factors, has
made more visible the fact that texts are multimodal and hence that language alone cannot suffice to
explain meaning made through them. Digital technologies have reduced costs for the production of
printed images and the use of colour. Their (market-led) widespread use has made available to an
unprecedented number of sign-makers forms of text production that afford modes other than speech
and writing. Online environments have provided sign-makers with platforms and easy-to-use interfaces
for publishing their multimodal texts and distributing them to diversified audiences, thus making the
The digital texts we daily engage with make meaning through the combined use of colour, writing,
sound, images, and layout, at least. It is not only the case of texts that we encounter on the web, but
also of texts that we interact with daily, to fulfill ordinary tasks in our offline environments, such as
the interfaces displayed on the screens of ATMs or those for purchasing a train ticket, for
instance. This holds also for the texts that we produce; everyday communication in digital
environments presents sign-makers with a wide range of modal options. The multimodal character of
digital texts is also redefining the use of the resources of language (van Leeuwen 2008); writing itself is
changing its functions, as lexis integrated in visual ensembles/syntagms (van Leeuwen 2004), or as
something to be acted upon rather than read, as in the case of URLs used as hyperlinks (Adami 2015);
writing is also increasingly developing resources for meaning-making, like those of font (van Leeuwen
2005b; van Leeuwen 2006), which are generally disregarded in linguistic studies. While speech is
changing its functional load in the online homologue of face-to-face interaction, i.e., video-chats (for
the phenomenon of mode-switching in video-chats, Sindoni 2013), the mode of image is being used
for new interactive functions, as in Facebook comments, for example. Such a changed semiotic
contemporary communication, and to its usefulness as a notion that can account for contemporary
meaning-making.
Over the last two decades, disciplines concerned with text, discourse, and meaning have increasingly
devoted attention to non-verbal resources. Yet the point of reference and focus of analysis has
traditionally hinged on speech, with other modes considered as playing an accessory function, and
While modes of communication other than language are, to varying degrees, being attended to in
social linguistic work, its central units of analysis are usually linguistic units (e.g. intonation
unit) or units defined in linguistic terms (e.g. a turn is defined in terms of who is speaking)
The advent of digital technologies has contributed to changes in the perception of what constitutes data
in many text-based disciplines. Digital technologies provide analysts with multimodal means of
recording, coding and transcribing data, such as videos and video annotation systems. When analysing
a video-recorded rather than a tape-recorded face-to-face interaction, the multimodal character of the
communicative event becomes more immediately manifest and what could be regarded as context or
contextual information in earlier tape-recordings (something that the researcher could neither see nor
handle from tape-recorded data) is now visible as meanings expressed by participants through gestures,
movement, and facial expressions, or through 3D objects. In this regard, Goodwin's (2001) work has
opened a tradition of studies in conversation analysis that are now approaching multimodality as a
means to account in detail for meaning made through actions and their relations to speech (for a recent
Studies in corpus linguistics, which has developed tools and compiled, tagged and parsed corpora of
(predominantly) written and (to a lesser extent) spoken language (yet transcribed in written form), are
now increasingly advocating the need to compile multimodal corpora (Adolphs and Carter 2007;
Allwood 2008; Haugh 2009). However, these tend to assume a central role of speech and writing, with
communication acknowledge the multimodal nature of digital environments, like Herring (2010), who
argues that the interpretation of visual content can benefit from methods drawn from iconography and
semiotics (2010:244). Yet, in her review and development of methodologies for the analysis of web
content (2010:233), the multimodal nature of webtexts is referred to only in terms of the presence of
images, while the main reference point and concern is still on language and language-based interaction,
as if language (and hyperlinks) were the defining resource for the understanding of Web-based
phenomena like the blogosphere (see also, more recently, Herring 2013, in which the multimodality of
Web 2.0 texts is addressed more explicitly, and hypotheses are made on whether it should/could be
nevertheless maintaining that text remains the predominant channel of communication among web
users (2013:9)). In sum, studies in linguistics and communication have increasingly acknowledged the
multimodal nature of texts, yet, with the notable exception of Goodwin's tradition, their main focus
In contrast, studies in multimodality assume that any analysis today can no longer rely only or mainly
on language, if it aims at interpreting the meanings of a text or communicative event, rather than
merely the use of (selected aspects of) speech or writing within them.
Given the increasingly manifest multimodal character of communication vis-à-vis the attention paid
historically to developing analytical labels and tools mainly to describe language, multimodality as a
field of research attends to different tasks. It aims to investigate the meaning potentials of each mode
(including speech and writing, differently conceived of, through a multimodal lens), and to provide an
account of how each mode has been shaped historically in different cultures and societies to fulfill
particular tasks. It also aims to find common labels that can describe meaning made in all modes, to be
able to treat all modal resources in a unifying and coherent account. Finally, it aims to describe and
explain meaning made through the relation among modes in multimodal ensembles, given that the
The next section traces the origins and developments of multimodal analysis, while briefly reviewing
different approaches and mentioning current work relevant to disciplines concerned with texts and
language. The following one discusses key notions of a social semiotic approach to multimodal
analysis, as a means of both looking at social phenomena through representation and communication,
mentions the potentials and limitations of the approach, opening to future research in the field.
Multimodality finds its origins in the adaptation of Halliday's framework to modes other than speech
and writing. Kress and van Leeuwen's (1996) seminal work Reading Images: The Grammar of Visual
Design adapted Halliday's (1978) ideational, interpersonal and textual meta-functions for the
description of meaning made by images and their combined use with writing. They defined and
described the resources through which visual texts can (a) represent something about the world, (b)
represent something about their authors and addressees, and (c) shape cohesion, information structure
and different truth-values toward what is represented. Earlier, O'Toole (1994) had applied the three
meta-functions to the analysis of visual art, while, in later works, the three metafunctions have been
mapped onto the resources of speech, sound and music (van Leeuwen 1999), gesture and movement
(Martinec 2000), colour (Kress and van Leeuwen 2002), the moving image, or kineikonic mode (Burn
2013; Burn and Parker 2003), and layout (Kress 2010). In more recent work, Bezemer and Kress
(2014) try to map the three metafunctions onto the meaning potential of signs made through touch, as
Since the early 2000s, the notions of mode and multimodality have become a growing focus of interest.
In Multimodal Discourse: The Modes and Media of Contemporary Communication (2001), Kress and
van Leeuwen draw the notion of mode from Halliday's (1978) distinction between speech and writing
in language and extend it to all resources for representation. Always using examples of texts mainly
combining images and writing, but abandoning frames and terminology tightly bound to linguistic
aim to explore the common principles behind multimodal communication. We move away from
the idea that the different modes in multimodal texts have strictly bounded and framed specialist
tasks [...]. Instead we move towards a view of multimodality in which common semiotic
principles operate in and across different modes, and in which it is therefore quite possible for
semiotics appropriate to contemporary semiotic practice. (Kress and van Leeuwen 2001:2).
At that time, Kress and van Leeuwen still referred to a multimodal theory of communication
(2001:111). However, in more recent years, the increased interest in multimodality has seen researchers
approaching and developing multimodal analysis through different theoretical perspectives. Two of
these stem from Halliday's theories, one drawing from his social semiotic take (i.e., on his idea that
language is a resource shaped to express and establish social roles and values; see the discussion in the
next section), the other from his systemic functional grammar framework (i.e., on his idea that
As to the former, van Leeuwen (2005a) and Kress (2010), each with a distinctive focus, have further
elaborated on Hodge and Kress's (1988) social semiotic theory. Social semiotics conceives of
sign-making as the expression of social processes; through a fine-grained qualitative analysis of usually
small samples of texts, social semiotics is interested in unveiling ideologies, social values, power roles,
and identities as expressed in texts, together with how individuals actively maintain, reinforce, contest
As to the latter, the works by O'Halloran (2008) and Baldry and Thibault (2006), among others, apply
and develop Halliday's Systemic-Functional Grammar to multimodal texts, with a special interest and
focus on modes as systems for meaning-making rather or more than as the sign-makers'
expressions of social processes; using (slightly) wider corpora, systemic functional multimodal
discourse analysis, defined by O'Halloran (2011:4) as a grammatical approach vs. social semiotics'
contextual approach, aims at developing frameworks, analytical tools, and descriptions of the
As an example of the different takes of the two approaches, given news reporting as the object of
analysis, social semiotic multimodal analysis could focus on a small sample of texts reporting on the
same news and would be interested in unveiling ideologies and discourses as differently represented
through the combined use of images and writing. This take could reveal different media outlets'
interests and positioning towards power and the parties involved in the news event (see for example the
analysis of news representation of the Palestinian conflict in van Leeuwen and Jaworski 2002). Instead,
a systemic-functional analysis would be more concerned with mapping regularities in the functional
use of images vs. writing in a usually larger news dataset, thus investigating their structural relation
(e.g., theme or focus) in shaping discursive functions (see for example the analysis of the changing
Not only can these perspectives be seen as complementing each other (O'Halloran 2011), but also, as
the field develops, boundaries between approaches become less clear-cut, while different takes arise
in-between. So, critical multimodal discourse analysis (Machin 2014; Machin 2007; Machin and Mayr
2012) combines critical discourse analysis (specifically Kress 1985; Fairclough 1989; Wodak 1989;
van Dijk 1991) and social semiotics for the investigation of naturalized ideologies as expressed through
the combined use of modes (especially in printed texts), whereas multimodal interactional analysis
(Scollon and Wong Scollon 2003; Norris 2004) focuses on how interactants make meaning through
action in different modes in face-to-face encounters. Others bring new theoretical perspectives in, such
as Forceville and Urios-Apparisi's (2009) work, which uses Lakoff and Johnson's (1980) cognitive
linguistic framework (rather than Hallidayan functional linguistics) to explore metaphors as represented
At the same time, multimodal analysis is being approached by scholars in other fields. In 2013, a
special issue of the Journal of Pragmatics was entirely devoted to Conversation Analytic Studies
effort, a special issue of Qualitative Research (Dicks et al. 2011) has discussed the potentials of the
investigation of several domains, with fields of applications spanning from museum exhibition designs
language contexts and learning has been studied in Royce (2007), Romero and Arévalo (2010) and
Pinnow (2011). Translation studies are devoting growing attention to the challenges that multimodal
texts pose to translators, especially, but not exclusively, in audio-visual translation; cf. in this regard,
(O'Sullivan and Jeffcote 2013), the work by Taylor (2004) on subtitling, and Borodo (2014) on the
translation of comics. The multimodality of corporate and business discourse has been investigated in
Maier (2008; 2011), Garzone (2009) and Campagna and Boggio (2009), among others. An edited
volume by Page (2010) explores the relations between multimodality and narrative. Within the context
of education and writing studies, Lemke's (1998) special issue on multimodality of Linguistics and
Education has initiated an increasingly rich strand of application of multimodality in the field (e.g.,
Unsworth 2008), further developed in works on literacy and communication in the classroom, as in
Jewitt (2005; 2006; 2008), on academic literacy (Archer 2006) and on writing (Archer and Breuer
2015).
Cross-cultural issues in (mainly non-Western) multimodal genres have been examined in Bowcher
(2012), while the relation between genre and multimodality has been explored in Bateman (2008),
Bateman, Delin and Henschel (2007) and Prior (2009). Multimodal works on digital texts are
particularly numerous; among these, Lemke (2002) provides a framework for the analysis of
hypermodality, Adami (2015) develops tools for the analysis of web interactivity, while a special issue
on multimodality in Text & Talk (Adami et al. 2014) discusses the redefining notion of text in digital
environments.
This brief and necessarily selective review cannot aim to provide a comprehensive account of the
increasingly numerous and diversified works in multimodality. Multimodality is an admittedly fluid
field of investigation and so are its key notions and working definitions (Jewitt 2014a). As a concept it
has attracted growing attention from different disciplines concerned with meaning, text and
diversified community, gathering scholars from increasingly different backgrounds, adding and
Together with different views come also, unavoidably, an increased complexity in the geo-politics of
the field and a need for shared terms and agreed definitions that can set the ground for dialogue, debate
and exchange of ideas and findings. Along with a series of handbooks and edited volumes bringing
together different perspectives (Jewitt 2009b; Jewitt 2014b; Ventola and Guijarro 2009; Klug and
Stöckl 2014; Norris and Maier 2014), a biennial International Conference on Multimodality (ICOM)
reached its seventh edition in 2014 (7ICOM, Hong Kong, 11-13 June); the websites of the past
conferences can serve as a further reference to grasp the increasingly wider spectrum of studies in the
field.
The next section examines key concepts and notions of a social semiotic perspective to multimodal
analysis, which, in seeing sign-making as inherently social, is particularly concerned with the entexting
of social relations and offers a lens for looking at social phenomena through multimodal representation.
Social semiotics has been developed as a theory of multimodal sign-making in the works of Hodge and
Kress (1988), van Leeuwen (2005a) and Kress (2010), who have extended Halliday's socially-framed
view of language to all semiotic resources. Social semiotics draws on Halliday's assumption that
language is a product of social processes and that the resources of a language are shaped by the
functions that it has developed to satisfy the needs of people's lives. Through their everyday acts of
sign-making, while exchanging meanings, speakers express social structure, affirm their social roles
and transmit their systems of values and knowledge. Grammar, as much as vocabulary, is a resource,
rather than a set of predefined rules, that speakers use creatively by making choices. Through choice,
speakers produce variation; variation expresses (affiliation and conflict with) social structures and
roles, along with systems of knowledge and values, i.e., power. Language in Halliday (and all semiotic
resources in Hodge and Kress 1988) is a social process in two senses. In expressing social values
and structures, language reveals them and, at the same time, it constructs them, thus establishing social
relations and systems of knowledge and values every time it is used. Through choice at all levels (or
strata), language expresses larger relations of power existing within society and constructs power roles
Hodge and Kress follow Halliday in assuming the primacy of the social dimension in understanding
behavioural and other codes, that a concentration on words alone is not enough. (1988:vii)
Therefore, no single code can be successfully studied or fully understood in isolation (1988:vii) and
thus social semiotics is conceived as a theory of all sign systems as socially constituted, and treated as
While mainstream semiotics emphasizes structures and codes, at the expense of functions and social
uses of semiotic systems (1988:1), social semiotics focuses on speakers and writers or other
participants in semiotic activity as connected and interacting in a variety of ways in concrete social
contexts (1988:1). It uses modes as analytical tools to investigate the ways in which societies have
shaped their semiotic resources, and the social meanings made by sign-makers' specific use of modes
in multimodal texts.
Rather than describing semiotic modes as though they have intrinsic characteristics and inherent
systematicities or laws, social semiotics focuses on how people regulate the use of semiotic
resources, again, in the context of specific social practices and institutions, and in different
Social semiotic multimodal analysis draws on a series of key concepts; the following sub-sections
examine four of them, namely mode as socially-shaped, the motivated sign, the meaning potential of
Social semiotics' approach to modes is socially specific. What constitutes a mode depends on the social
group who uses it and the range of meanings that the group can express through its resources. For wine
tasters, wine is a fully articulated mode, with colour, aroma and taste as its modal resources, which can
express all three meta-functions; they can represent something about the world (the type of soil where
the vine was grown, the level of maturation of the grapes, or any defects in the wine making process,
such as oxidation, for example); they can tell something about the participants (wine preferences in
aroma and flavours are associated with identity features, e.g., adjectives such as 'feminine' or
'masculine', 'gentle' or 'harsh', in the vocabulary of wine tasting); they can construct cohesion and
vary information structure in their combined use (if aroma is long but taste is short, wine is considered
unbalanced; the same holds if aroma is fruity while taste is markedly mineral, for example). One could argue
that colour, aroma and taste in wine are rather indexes, in the sense that their presence is given a certain
meaning by the interpreter while no sign-maker has intentionally produced them; however in the wine
business, wine-making experts increasingly coordinate all phases of the vine growth, grape harvest and
making of the wine to achieve the designed colour/taste/aroma. Furthermore, given that in mass-
production societies, sign-makers increasingly make meaning through selection rather than production
from scratch (as when designing the style of their homes through ready-made pieces of furniture, for
example), the meaning potentials of wine are fully in force as signs when a wine-connoisseur selects a
given wine for his/her guests. When I am offered a fruity and flowery glass of wine by a (usually male)
wine bartender, I cannot help but interpret his choice as a gendered sign which addresses me as a
woman through (stereo)typical feminine aroma/taste preferences for wine. In this case, as with all
other modes, larger relations of power are always at work in individual choices of meaning-making:
not only in the bartender's, but also in my response to his stereotypically-gendered suggestion. If I
followed his suggestion, that would reinforce power role distributions in terms of gender preferences as
expressed in the mode of wine-tasting. If, instead, I opted for a particularly dry or markedly mineral
(rather than flowery) white wine, or a tobacco-and-leather smelling (rather than fruity) red wine, these
choices would reveal my interest in disassociating myself (and my identity) from a dominant
distribution in gender roles. Also in this second option, although contesting my belonging to a certain
gendered stereotype, I cannot escape the expression of power in my sign-making through wine choices.
the social and identity values expressed through wine tasting as mode, i.e., although I am a woman, I
am a wine connoisseur because my preferences align with males' rather than women's. In this sense, a
social semiotic analysis of the resources of colour, aroma and taste in wine (as for the resources of any
other mode) can reveal (a) how societies have shaped them to express power, (b) how individuals
position themselves toward that established system of values, and (c) how systems of values might be
changing as affected by and revealed through wider changes in the individuals' modal choices.
As a modal resource of writing, font has had a wide range of meaning potentials in typography since
the advent of print. Now, with the advent of digital technologies, it is increasingly widening its
meanings for all sign-makers, at a point where it might be considered a mode (rather than a resource of
writing) among increasingly numerous social groups. Font too has social meaning potentials; font types
can shape a text as addressing children or adults, as designed to look professional or amateur,
traditional or high-tech, and so on. In so doing, in each context a resource of font is used, it
In contemporary representation, sign-makers usually need to combine different modes in the same text,
as when students format a paper including layout, font, writing and graphs or when setting up a blog
on their personal and/or professional interests. Three related consequences motivate a socially-situated
1. multimodality is increasingly the normal state of communication (Kress and van Leeuwen
2001) and, hence, language-based tools for text analysis are less adequate to its description;
2. the potentials of modes and relations between modes are being shaped in new ways by everyday
acts of sign-making, through the increasing number and diversity of so-called user-generated
content;
3. awareness of the potentials of modes and their intertwined use is increasingly needed for
meaning-makers to interpret and unveil social meanings in texts, and for sign-makers to be
effective rhetors (Kress 2010) when producing their texts, i.e., to be able to assess which
resources in each mode are most apt to express their meaning to their addressed audience in
focusing of his/her social history and position (Kress 2010). Signs are socially-shaped resources that
are newly made every time they are used. In this regard, social semiotics' take on sign-making is
influenced by Kress's (1993; 1997) concept of the motivated sign. Against a Saussurean view of signs
as an arbitrary association between a form (signifier) and a meaning (signified), Kress's motivated sign
stresses the motivation that can be traced in the relation between a sign-maker's selection of a given
In the Saussurean structuralist tradition (rather or more than in Saussure's original elaboration), the
positing of an arbitrary relation between signifier and signified has meant a focus on language as a
system (langue) and its driving forces, thus disregarding how individuals and social groups shape signs
(through individual acts of parole) and hence which systems of values and power drive their choices
in doing so. In contrast, in tracing the motivation between signifier and signified, multimodal analysts
can achieve insights into the sign-maker's social, cultural and material context at the time of producing
the sign. The motivated association existing between a form and a meaning in a sign is crucial to
If the shape of the signifier aptly suggests the shape of the signified [...], it allows an analyst,
whether in everyday interaction or in research, to hypothesize about the features which the
maker of the sign regarded as criterial about the object which she or he represented. Positing
that relation between sign and world is crucial [and] can lead to an understanding of the
sign-maker's position in their world at the moment of the making of the sign. Such a hypothesis
In this sense social semiotic multimodal analysis sees signs in a text as the material residues of the
sign-maker's interest and social position at the time of his/her making of the sign.
As an example, the photos featuring on the BBC news website in the news feature titled 'Gaza-Israel
conflict: Why are civilians on the front lines?', dated 15th July 2014 [1], deploy long-shots when portraying
explosions, destroyed houses, and Palestinians, shot as a crowd, affected by the Israeli air strikes in
Gaza. In turn, they deploy a closer shot of individual persons when portraying citizens in Israel
witnessing a rocket attack coming from Gaza. In a social semiotic perspective, distance of shot is a
motivated signifier for social distance (Kress and van Leeuwen 1996) between the reader/viewer and
the represented participants, thus inviting to a greater or lesser identification with them; the
humanising vs. anonymising effect respectively (Machin 2007, pp. 118-119). Through the resources of
shot and number of represented participants, these photos shape differently the relation with the
bipartisan 'civilians' written in the news header, humanising and inviting readers'/viewers'
identification with some while anonymising and presenting others as a more distant reality. The
motivated association between the signifier and the signified in these signs reveals the news provider's
standpoint towards that specific event, in line with other cases of media representation of the conflict in
the region (cf. the findings in van Leeuwen and Jaworski 2002).
Hence a detailed analysis of relations between motivated signs in images and writing (and any other
mode) can offer deeper insights on the meanings produced by a text, on the relations they shape with
[1] http://www.bbc.com/news/world-middle-east-28252155 (Retrieved 15th July 2014).
Semiotic resources have meaning potentials deriving from their materiality and the history of their uses
in a given society. When a semiotic resource is used in representation, a sign is newly made. Every
time it is used, it undergoes a certain degree of transformation. Two principles drive transformation,
i.e., provenance and experiential meaning potential. Provenance, closely related to Barthes's (1977)
The idea here is that we constantly import signs from other contexts (another era, social group,
culture) into the context in which we are now making a new sign, in order to signify ideas and
values which are associated with that other context by those who import the sign. (Kress and
As a banal example, the meaning of ketchup (the entity as a semiotic object and, consequently, the
word naming it) in the Italian context (the national context of the author) is endowed with the meaning
component 'American', with all the values associated with 'American' in Italian culture,
generally speaking, and by specific social groups within it, which might well differ in terms of
affect. This component is instead absent from the meaning of ketchup for a US-based sign-maker.
Whenever sign-makers use a semiotic resource to create a sign, they transform it by endowing it with
Experiential meaning potential is instead akin to Lakoff and Johnson's (1980) view of metaphor and it
condenses
the idea that signifiers have a meaning potential deriving from what it is we do when we
produce them, and from our ability to turn action into knowledge, to extend our practical
experience metaphorically, and to grasp similar extensions made by others. (Kress and van
As an example of experiential meaning potential, the sepia colour effect used, e.g., when editing
images with software tools available on Instagram, has come to have the meaning of past/old through
association with the experience we have of the particular (dis-)colouring process which printed
photographs undergo through time.
These two concepts, provenance and experiential meaning potential, are used in multimodal
analysis to derive meaning potentials of resources used in texts, by tracing the associated meanings
given to their uses in other contexts. This helps reveal manipulative uses of resources through
borrowing from previous uses in other contexts. For example, in videos, a fast-moving frame and an
unstable focus are generally associated through experiential meaning potential with amateur
commercials to give a sense of authenticity to their advertisement and, metonymically, to the promoted
features of the product. The same can be said for the changing dress code (along with gesturing,
language and so on) by political figures. Contemporary (Western) political dress code, by borrowing
resources from informal and everyday fashion, through provenance, is increasingly shaping politicians
as peer-citizens and laypersons, in the attempt to shape a more informal, familial and closer relation
with voters (for the use of provenance and experiential meaning potential, see Adami's (2014)
framework for the analysis of the aesthetic meaning potentials of layout, font, colour, images and
Being socially-situated, signs and sign-complexes embody power relations that are entexted in genres
and generic forms. Through genres, signs and sign-complexes project social positioning and identity
values onto those who design and produce them and onto those addressed by them. As an example, the
selfie, i.e., a self-portrait picture taken with a mobile device and shared online, is a recently born
digital genre that arose from the technological affordances of mobile devices (their front camera and
online connectivity). It has received so much attention in the media that celebrities in show
business are increasingly shooting selfies (as a particularly famous instance, see the selfie that a group
of celebrities collectively took during the 2014 Oscars ceremony).2 The selfie started as a practice of lay
sign-makers online, as the digital and online-shared form of the old self-portrait photograph. When made
by a celebrity, it communicates the identity values of an everyday person, who shoots photos of
him/herself and shares them online with friends, rather than those of a celebrity whose
pictures are taken by professional photographers and addressed to fans. Hence the celebrity selfie
practice can be seen as an indication of the increased social value attributed to informality and
horizontal power relations in the show business in particular and in Western societies in general
(illustrious cases of selfies involve politicians and other social elites). Revealing the social meaning
potential of a genre can offer insights into broader social dynamics at work in society and can provide
sign- and meaning-makers with tools for critical interpretation. This includes the understanding that
identity features and social relations are designed and projected by the genre rather than lived or real
ones, as an analysis of the environment where the selfie was taken can show. In this sense, the act of
taking selfies by celebrities can be seen as a performance of peer-identity features enacted in front of
the media, as in the example of Eva Longoria and Melanie Griffith taking selfies at Taormina Film
Festival in 2014 surrounded by photographers and an audience taking photos of the selfie-event.3
Here again, a social semiotic take on genre contrasts with (or can integrate) the focus of structuralist
traditions. In a social semiotic perspective, genre is never stable; rather, it is an ever-changing frame of
reference and orientation that enables sign-makers to shape and make meaning of social roles in a given
communicative event/text, social roles which are themselves also always subject to change through
agentivity.
2 It can be viewed online on The Guardian, among other websites:
http://www.theguardian.com/media/2014/mar/07/oscars-selfie-most-retweeted-ever (Retrieved 15 July 2014).
3 Photos of the event can be viewed on the website of the local newspaper La Sicilia:
http://www.lasicilia.it/gallery/melanie-griffith-e-eva-longoria-raffica-di-%E2%80%9Cselfie%E2%80%9D-al-teatro-antico (Retrieved 15 July 2014).
As one of the theoretical perspectives in multimodal analysis, social semiotics uses modes as units of
analysis to trace social values, positioning and identity features projected by a text onto its author and
to account for the social meanings of texts, providing a wider and more in-depth picture than traditional
discourse analysis focused solely on language. In focusing on how the meaning potentials of modal
resources are combined in texts and in tracing the sign-makers' interests in their motivated
making of signs, it provides tools that can reveal naturalized discourses, values and ideologies in the
Because of its unit of analysis and specific focus, the approach has certain methodological limitations,
along with lines of investigation that are still unexplored. Analysis is necessarily carried out
qualitatively on small samples of texts; it is fine-grained, informed by the research question and can be
time-consuming (for details on methods and steps of analysis, cf. Bezemer and Jewitt 2010).
Generalisations are often difficult to make and some (e.g., Bateman et al. 2004) have argued on the
As to visual texts, extant research has so far paid predominant attention to the resources of image and
to the relation between image and writing. In this regard, van Leeuwen (2008) advocates
becomes less central than the analysis of semiotic resources such as composition, movement
and colour, which are common to a range of semiotic modes including images, graphics,
typography, fashion, product design, exhibition design and architecture. (van Leeuwen
2008:130)
As to its scope, analysis of texts can reveal how certain meanings are produced; it cannot say
how readers will interpret them nor the real intentions of producers (Machin and Mayr 2012: 10). In
this, the use of methodologies drawn from other disciplines, such as ethnographic research, or studies
Interdisciplinary work is increasingly sought in social semiotic multimodal research. Its perspective
can offer other social sciences a fine-grained and empirically-based methodology for the analysis of
social meanings in multimodal texts; at the same time, it can draw from other social sciences broader
frames for the interpretations of larger social dynamics underlying the production of these meanings or
Multimodal analysis is well equipped to investigate texts, yet further work is needed to approach text-
Often oriented to finished and finite texts, multimodal analysis considers the complexity of texts
or representations as they are, and less frequently how it is that such constructs come about, or
how it is that they transmogrify as (part of larger) dynamic processes (Iedema 2003:30)
The focus on text can be limiting in another respect. In contemporary sign-making, texts and signs are
selected and recontextualized, re-used, re-purposed, and disseminated in different semiotic spaces;
looking at single texts might offer a limited point of observation. As Lemke suggests, we need to
extend the usual repertory of analytical tools for critical multimedia analysis from those which look at
single works to those which look across transmedia clusters (2009:140). Transmedia text-production
and dissemination are often driven by corporations; hence Lemke advocates a move from analyses
which focus on the formal features of the media themselves, to ones which place the experience of
media within political economy and cultural ecology of identities, markets, and values (2009:140).
Van Leeuwen (2008) also stresses the need to focus on technology, and on the power, restrictions and
ideological frames that it imposes on sign-making, especially given the increased use of
pre-designed software tools for text production offering pre-set templates and preferred options for sign-
a new emphasis on the discourses, practices and technologies that regulate the use of semiotic
resources, and on studying the take-up of semiotic resources by users in relation to these
technologies for public dissemination, can certainly be seen as a trend towards a democratization of
resources available to everyday sign-makers; however, the current multimodal landscape does not
escape broader social dynamics of power. Not only are technological development and what it affords
as preferred/dispreferred modal choices driven by the (huge) interests of corporations operating in the
field, but access to and awareness of the meaning potential of modal resources are also differently
distributed within societies, where broader power dynamics are always in place. In this sense,
multimodal analysis could combine interdisciplinarily with theories and approaches in the social
sciences to explore further the issue of access and provide broader socially-based frames for a critical
Critical interpretation is not the only concern of social semiotic multimodal analysis; in this regard,
Kress (2000; 2010) has long stressed the need for a move from critique to design. Analogously to what
critical linguistics and critical discourse analysis have analysed for language, multimodal critical
discourse analysis (Machin and Mayr 2012) intends to reveal naturalized ideologies, social values,
power interests and manipulative uses of all modal resources, in texts combining more than one mode.
Social semiotics aims at going one step further. In Kress's view, while critique was needed in a social
semiotic landscape which was stable and needed change, a fast-changing media landscape like
today's foregrounds design choices and options. In a time when social relations (and their semiotic
counterpart, i.e., genres) are fluid and texts are increasingly multimodal, when conventions are no
longer fixed and sign- and meaning-makers are faced every day with a wide range of choices for
representation, a theory aiming to describe sign-making as a social practice needs to focus on the ways
in which sign-makers design their texts and meaning-makers design their forms of engagement with
them. When representation is not only conceived as a record of society but also as contributing to shape
it, the agency of sign-makers is foregrounded not only in their creative use of resources to express
meaning, but also in the potentials of these for (social) change. Design is hence a key aspect for future
The chapter has defined the concept of multimodality as a phenomenon of communication. It has
discussed the reasons for its increased use in linguistics and disciplines interested in meaning and text,
which however do not necessarily use methods of multimodal analysis. It has then reviewed the
growing field of multimodal studies, which adopt different theoretical perspectives for the analysis of
modes and their intertwined use in texts and communicative events. A social semiotic take on
multimodal analysis has then been presented in detail, by introducing selected key notions, before
mentioning the potentials and limitations of the approach, together with some directions for future
multimodal analysis is conceived and carried out. It sees human communication as the expression of
social processes and it sees it as intrinsically multimodal. With the assumption that the social is prior to
the semiotic, social semiotics frames the interpretation of multimodal representation and
communication with a special focus on sign- and meaning-makers. It conceives of signs as socially-
shaped resources that are newly made every time by sign-makers who, according to their interests,
associate in a motivated way selected criterial aspects of a form (the signifier) to selected criterial
aspects of the meaning (the signified) which they want to express. Every resource has potentials to
make meaning, derived from its materiality and the history of its social uses. Analogously, every mode
has affordances, deriving from its materiality and social histories. Being socially-situated, signs and
sign-complexes embody power relations that are entexted in genres and generic forms. Through genres,
signs and sign-complexes project social positioning and identity values onto those who design and
A social semiotic multimodal analysis of a text asks questions such as: Which modes are at work here?
What is their relative functional load? What is the motivated association of a given form to a given
meaning? Whose interests does it reveal? What identity features are projected on the text's author and
addressees? Who is given power/freedom? (e.g., readers/addressees, in designing their own reading
path, or the author?) And what does this all indicate in terms of social relations, values and ideologies?
The use of a certain colour and colour palette or of a font type, like the selection of different modalities
in images (e.g. as photo-realistic or abstract), carries certain meanings that are socially shaped and vary
across cultures. That is, the use of all modal resources is principled and modal resources have meaning
potentials that are given by the history of their past uses. Even if they are not expressed explicitly,
unlike what has long been done for speech and writing in linguistic traditions, genre-, context-,
society- and culture-specific conventions do exist for the use of all modes. These are naturalized conventions, which stem
from regularities and variations in the past and present uses of a given modal resource.
From the overall multimodal orchestration of a webpage, its use of colour, layout of elements, fonts,
images and writing, we can intuitively tell whether it is designed to look professional or amateur,
whether it addresses children or adults, whether we are addressed as experts or as general public, or as
belonging to a specific social group, in terms of gender, age, education, profession and life-style. Yet
precisely because conventions of modal resources other than language are naturalized, as interpreters of
these texts, we lack awareness of the social values of their meaning potentials.
Hence investigating the meaning potential of modal resources, together with developing analytical
tools which make these conventions explicit, can empower meaning- and sign-makers in their everyday
activity of interpreting, critiquing and designing texts that can effectively fulfill their rhetorical aims.
A social semiotic multimodal approach, then, always combines a two-fold focus on texts: it investigates
texts and representational practices as socially and culturally shaped; and it uses the investigation of
texts and representational practices as a means to achieve insights into society and social groups, into
the ways in which they shape power relations and their cultural values.
References
Adami, E. (2014). Aesthetics in digital texts beyond writing and image: A social semiotic multimodal
framework. In A. Archer and E. Breuer (eds.) Multimodality in Writing. The state of the art in theory,
Adami, E. (2015). What's in a click. A social semiotic framework for the multimodal analysis of
Adolphs, S. and Carter, R. (2007). Beyond the word. New challenges in analysing corpora of spoken
Allwood, J. (2008). Multimodal corpora. In A. Lüdeling and M. Kytö (eds.) Corpus linguistics. An
Archer, A. and Breuer, E. (2015) (eds.). Multimodality in Writing. The state of the art in theory,
Baldry, A. and Thibault, P.J. (2006). Multimodal Transcription and Text Analysis. A Multimedia
Bateman, J. (2008). Multimodality and Genre: A Foundation for the Systematic Analysis of Multimodal
Bateman, J., Delin, J. and Henschel, R. (2004). Multimodality and empiricism: preparing for a corpus-
electronic newspapers. In T. D. Royce and W. Bowcher (eds.) New Directions in the Analysis of
Bezemer, J. and Jewitt, C. (2010). Multimodal Analysis: Key Issues. In L. Litosseliti (ed.) Research
Bezemer, J. and Kress, G. (2014). Touch: a resource for making meaning. Australian Journal of
Bowcher, W. (2012) (ed.). Multimodal Texts from Around the World. Cultural and Linguistic Insights
Burn, A. (2013). The Kineikonic Mode: Towards a Multimodal Approach to Moving Image Media
Campagna, S. and Boggio, C. (2009). Multimodal Business and Economics (Milano: LED).
Dicks, B., Flewitt, R., Lancaster, L. and Pahl, K. (2011) (eds.). Multimodality and ethnography:
Forceville, C. and Urios-Aparisi, E. (2009) (eds.). Multimodal Metaphor (Berlin: Mouton De Gruyter).
Leeuwen and C. Jewitt (eds.) The Routledge Handbook of Visual Analysis (London: Routledge): 157–182.
Halliday, M.A.K. (1978). Language as Social Semiotic: The Social Interpretation of Language and
Haugh, M. (2009). Designing a Multimodal Spoken Component of the Australian National Corpus. In
M. Haugh et al. (eds.) Selected Proceedings of the 2008 HCSNet Workshop on Designing the
Herring, S.C. (2010). Web Content Analysis: Expanding the Paradigm. In J. Hunsinger, M. Allen,
and L. Klastrup (eds.) The International Handbook of Internet Research (New York: Springer): 233–249.
Herring, S.C. (2013). Discourse in Web 2.0: Familiar, Reconfigured, and Emergent. In D. Tannen and
A. M. Trester (eds.) Georgetown University Round Table on Language and Linguistics 2011: Discourse
2.0: Language and new media (Washington, DC: Georgetown University Press). Prepublication
version: http://ella.slis.indiana.edu/~herring/GURT.2011.prepub.pdf
Jewitt, C. (2005). Multimodality, Reading, and Writing for the 21st Century. Discourse:
Jewitt, C. (2006). Technology, Literacy and Learning: A Multimodal Approach (London: Routledge).
Jewitt, C. (2009a). Different approaches to multimodality. In C. Jewitt (ed.) The Routledge Handbook
Jewitt, C. (2009b) (ed.). The Routledge Handbook of Multimodal Analysis, 1st Ed. (London:
Routledge).
Jewitt, C. (2014b) (ed.). The Routledge Handbook of Multimodal Analysis, 2nd Ed. (London:
Routledge).
Klug, N.M. and Stöckl, H. (2014) (eds.). Language in Multimodal Contexts (Berlin: Mouton De
Gruyter).
Knox, J. (2009). Punctuating the home page: image as language in an online newspaper. Discourse &
Kress, G. (1993). Against Arbitrariness: The social production of the sign as a foundational issue in
Kress, G. (1997). Before Writing: Rethinking the Paths to Literacy (London: Routledge).
Kress, G. (2000). Design and transformation: New theories of meaning. In B. Cope and M. Kalantzis
Routledge).
Kress, G. and van Leeuwen, T. (1996). Reading Images. The Grammar of Visual Design (2006 2nd
Kress, G. and van Leeuwen, T. (2001). Multimodal Discourse: The Modes and Media of Contemporary
Kress, G. and van Leeuwen, T. (2002). Colour as a semiotic mode: notes for a grammar of colour.
Lakoff, G. and Johnson, M. (1980). Metaphors We Live By (Chicago: University of Chicago Press).
Lemke, J. (1998) (ed.). Language and Other Semiotic Systems in Education. Linguistics and
Machin, D. (2014). What is multimodal critical discourse studies? Critical Discourse Studies 10(4):
347–355.
Machin, D. and Mayr, A. (2012). How to Do Critical Discourse Analysis (London: Sage).
Routledge).
Norris, S. and Maier, C.D. (2014) (eds.). Interactions, Images and Texts. A Reader in Multimodality
ideational meaning using language and visual imagery. Visual Communication 7(4): 443–475.
O'Halloran, K. (2011). Multimodal Discourse Analysis. In K. Hyland and B. Paltridge (eds.)
O'Halloran, K. and Smith, B.A. (2011). Multimodal Studies. In K. O'Halloran and B.A. Smith (eds.)
O'Toole, M. (1994). The Language of Displayed Art (London: Leicester University Press).
Page, R. (2010) (ed.). New Perspectives on Narrative and Multimodality (London/ New York:
Routledge).
Pinnow, R.J. (2011). I've got an idea: A social semiotic perspective on agency in the second
Prior, P. (2009). From Speech Genres to Mediated Multimodal Genre Systems: Bakhtin, Voloshinov,
and the Question of Writing. In C. Bazerman, A. Bonini, and D. Figueiredo (eds.) Genre in a
Changing World (West Lafayette: Parlor Press and The WAC Clearing House): 17–34.
Romero, E.D. and Arévalo, C.M. (2010). Multimodality and listening comprehension: testing and
Royce and W. Bowcher (eds.) New Directions in the Analysis of Multimodal Discourse (New Jersey:
Erlbaum): 361–390.
Scollon, R. and Wong Scollon, S. (2003). Discourse in Place: Language in the Material World
(London: Routledge).
Sindoni, M. (2013). Spoken and Written Discourse in Online Interactions. A Multimodal Approach
(London: Routledge).
Streeck, J., Goodwin, C. and LeBaron, C.D. (2011) (eds.). Embodied Interaction: Language and Body
Taylor, C. (2004). Multimodal Text Analysis and Subtitling. In E. Ventola, C. Cassily, and M.
Continuum).
Van Dijk, T.A. (1991). Racism and the Press (London: Routledge).
van Leeuwen, T. (2004). Ten Reasons Why Linguists Should Pay Attention to Visual
Communication. In P. Levine and R. Scollon (eds.) Discourse and Technology. Multimodal Discourse
van Leeuwen, T. (2006). Towards a semiotics of typography. Information Design Journal 14(2):
139–155.
van Leeuwen, T. (2008). New forms of writing, new visual competencies. Visual Studies 23(2): 130–135.
van Leeuwen, T. and Jaworski, A. (2002). The discourses of war photography: Photojournalistic
representations of the Palestinian-Israeli war. Journal of Language and Politics 1(2): 255–275.
Ventola, E. and Guijarro, A.J.M. (2009) (eds.). The World Told and The World Shown: Multisemiotic
Wodak, R. (1989). Language, Power and Ideology: Studies in Political Discourse (Amsterdam: John
Benjamins).