Payne, Elinor, Olga Maxwell, Robert Fuchs and Yizhou Wang. 2023. Lexical Stress Perception in Indian
Englishes. Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS 2023), Prague.
LEXICAL STRESS PERCEPTION IN INDIAN ENGLISHES
Elinor Payne (a), Olga Maxwell (b), Robert Fuchs (c) and Yizhou Wang (b)
(a) University of Oxford, (b) University of Melbourne, (c) University of Hamburg
elinor.payne@phon.ox.ac.uk, omaxwell@unimelb.edu.au, robert.fuchs@uni-hamburg.de,
yizhouw4@student.unimelb.edu.au
ABSTRACT
We report an experiment investigating the relative
weighting of acoustic cues (vowel quality, intensity,
duration and f0) in lexical stress perception in Indian
English (IndE), compared with Southern Standard
British English (SSBE). GLMM modelling of
responses shows both similarities (e.g. vowel quality
was by far the most important cue for both IndE and
SSBE) and differences (IndE listeners were less
sensitive to all cues except duration, and made least
use of f0).
Differentiating IndE participants according to L1
background (Indo-Aryan vs Dravidian), however,
reveals a finer-grained picture, with L1 Indo-Aryan
listeners exhibiting a cue hierarchy and degree of cue
strength that are closer to those of SSBE listeners. For L1
Dravidian listeners, while vowel quality remains the
most important cue, the strength of this cue, and that
of intensity, are significantly lower than for L1 Indo-Aryan
and SSBE listeners. At the same time, duration
ranks more highly for these listeners.
Keywords: Stress, perception, Indian English, SSBE,
acoustic cues.
1. INTRODUCTION
The perception of stress is complex and subject to
cross-linguistic variation that remains poorly
understood. English is considered to be a stress
language, with lexical stress cued by a combination
of acoustic parameters that make stressed syllables
more salient to listeners [1]. While the main acoustic
cues to stress in SSBE and General American are
reported as being higher f0, longer duration, greater
intensity and full vowel quality [e.g. 2], there is little
consensus over relative cue weighting and
interaction. Some [e.g. 3] claim relative pitch to be
dominant, while others assert its unreliability [4], or
propose duration instead [5]. Furthermore, findings
for one variety of English are not necessarily
generalisable to another, as has been found for e.g.
Welsh English [6].
Previous limited research suggests that speakers of
new (‘non-settler’) Englishes spoken in multilingual
societies, e.g. Singapore or Indian English, may lack
a robust distinction between stressed and unstressed
syllables, and/or may employ different acoustic cues,
due to the influence of other languages in sustained
historic contact during the development of these
varieties. In addition, for many speakers, even with
native-like command, English may not be the
predominant or first language, and thus there may
also be ongoing contact influence from another
language. With regard to L2 perception, extensive
research has shown that the phonological system of
the L1, acquired very early in life [7], critically shapes
the perception of sounds in L2, and the influence is
greater the later the acquisition of L2; for example,
segmental contrasts that do not exist in L1 are
difficult for adult learners to acquire [8,9].
L1-mediated perceptual bias also affects prosody.
Research has shown that adult listeners’ ability to
perceive and learn lexical stress is closely linked to
the mechanisms used to signal stress in their L1 [10].
In a study on the influence of typologically diverse
L1s (English, Russian and Mandarin) in the
perception of lexical stress in English, Chrabaszcz et
al. [11] found that while vowel quality was the
strongest cue for all groups, pitch was the second
strongest cue for English and Mandarin listeners, but
virtually disregarded by Russian listeners who, in
turn, used duration and intensity cues to a greater
extent. This is despite Russian being a stress
language, like English, and unlike Mandarin, which
is a tonal language.
1.1 Indian English
‘Indian English’ (IndE) refers to the variety(/ies) of
English used by speakers in India and by some in the
global Indian diaspora. English is one of the official
languages of India and is used in a wide range of
domains (e.g. the media, education, business,
government, literary writing), but coexists with a
large range of diverse Indian languages. IndE is
usually an L2, with the vast majority of speakers in
India also speaking at least one and often two or three
Indian languages. In the 2011 census [12], around
10% of inhabitants declared English as their L2 or L3,
and a tiny minority (0.02%) as their L1.
Lexical stress in IndE is of particular interest
because most, if not all, Indian languages lack
contrastive lexical stress [e.g. 13], and any word-level
prominences are typically determined by structural
properties such as syllable structure or position in the
word [14]. Previous research on IndE phonological
and phonetic features more generally indicates strong
evidence for the influence of indigenous languages
[e.g. 15,16,17,18], leading to identifiable accents of
IndE, either through historical influence or from
continued contact in a multilingual society. However,
pan-Indian features also arise, through serendipitous
cross-L1 commonalities (‘areal features’, see [19,20]),
or through standardisation and homogenisation
of IndE [21] due in part to its ‘self-replicating nature’,
where English is taught to Indians by Indians [22].
Hence, the question arises as to whether and how this
complex multilingual backdrop shapes the perception
of lexical stress in IndE.
A further factor in the strength of ongoing L1
influence concerns educational background, known
to play a significant role in distinguishing basilectal
and acrolectal speakers of IndE [23]. While IndE is
“primarily being transmitted through school” [17],
the Indian education system comprises different types
of schools providing varying levels of exposure to
varying kinds of English. Government schools
(especially in remote areas) are less likely to be
English-medium, and more likely to have teachers
who are either less fluent in English, or whose IndE
is more influenced by a local or state language. By
contrast, ‘public’ (private) schools are more likely to
be English-medium, to provide full immersion in
English, and to have teachers who are more proficient in English.
There has also been a rapid rise in the number of
international schools, with curricula closely modelled
on British or US educational systems, some with
British or US English-speaking teachers.
1.2 Stress/prominence in Indian L1s
The notion of ‘stress’ is an elusive phenomenon in
South Asian languages, with speakers reporting
difficulty in its auditory identification, and with
disagreement over the presence and type of acoustic
cues, and over the location of prominence [e.g. 24].
Unlike English, all Indian languages lack lexically
contrastive stress, although prominences at the word
level do exist, and these are derived from language-specific
phonological and syllable structure
considerations, and appear to serve what may be
considered a post-lexical role, viz. the existence of a
smaller, word-like prosodic constituent in Indian
languages (and the higher tonal density that results).
Auditory studies have claimed that speakers transfer
these properties onto IndE, by accenting most words
(cf. [25]), and in L1-specific ways. Investigating
phonetic cues in the production of accentual and focal
prominence by IndE speakers (with Hindi or Telugu L1),
Moon [26] showed that neither IndE
group relied on durational cues. However, while the
two groups used similar values for maximum f0, IndE
speakers with a Hindi L1 background showed greater
lowering of f0 at the beginning of the accented
syllable, possibly reflecting differences in pitch
accent type and/or alignment in their L1.
Given the controversial status of ‘stress’ in Indian
languages, and their very close contact with IndE, we
might reasonably expect L1 properties to shape
perceptual sensitivity of IndE listeners to lexical
stress cues. While we might expect this to be strongly
the case for multilingual speakers with IndE as L2, it
could also be true for those who learn IndE at a very
early age and have a ‘native’ command, since even if
the hypothesised differences historically arose
through contact, they could now be an intrinsic
property of IndE, and thus be present even in
monolingual speakers. We hypothesise lower
sensitivity to stress cues in general, compared to
SSBE listeners, and, following the results of [26], that
duration plays little to no role in stress perception
by IndE listeners. Given variable usage of acoustic
cues across L1s for post-lexical prominences, and the
scant research on this area for Dravidian languages in
particular, potential differences in relative cue
weighting are harder to predict. This paper reports a
preliminary study exploring these questions.
2. METHOD
2.1 Perceptual experiment
To investigate this, we used stimuli from Chrabaszcz
et al. [11], which were modified natural recordings of
the disyllabic nonword ‘maba’, produced by a male
speaker of North American English (see [11: 5] for a
description of how pitch, duration, intensity and
vowel quality were manipulated). Each of the four cues
was set to one of two levels on each of the two
syllables, and the eight binary manipulations were fully
crossed, resulting in 2^8 = 256 unique tokens.
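The token count follows directly from the design: four cues, each manipulated at two levels on each of two syllables, give eight binary factors and hence 2^8 = 256 combinations. The sketch below enumerates such a design. The level labels ("stressed_like", "unstressed_like") and the function name build_design are illustrative assumptions, not terms taken from [11].

```python
from itertools import product

# The four cues and two syllables come from the text; the level labels
# below are hypothetical names chosen only for this illustration.
CUES = ("pitch", "duration", "intensity", "vowel_quality")
SYLLABLES = (1, 2)
LEVELS = ("stressed_like", "unstressed_like")


def build_design():
    """Return every combination of cue level on each syllable.

    Each design cell is a dict keyed by (cue, syllable) pairs.
    """
    factors = [(cue, syllable) for cue in CUES for syllable in SYLLABLES]
    design = []
    for combo in product(LEVELS, repeat=len(factors)):
        design.append(dict(zip(factors, combo)))
    return design


design = build_design()
print(len(design))  # 8 binary factors -> 2 ** 8 = 256 cells
```

Each cell in the list corresponds to one stimulus token, so the list length can be checked against the number of audio files before an experiment is assembled.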
Figure 2.1: Screen layout for experiment in PsychoPy.
All participants performed a forced-choice
auditory identification task (implemented in PsychoPy [27] and
hosted online on Pavlovia), indicating the location of the
stressed syllable by clicking the corresponding
button on a computer screen with a mouse
(see Fig. 2.1); the button chosen also indicated degree of
confidence. Reaction times (RTs) were also recorded.
In addition, participants received 16 practice trials to
familiarise themselves with the procedure.
Participants were instructed to use headphones for
completing the task.
Participants consisted of 44 IndE speakers,
resident in India, and 29 SSBE speakers, resident in
the UK. Due to COVID restrictions, the experiment
was carried out online. Participants were aged 18-34
years and either currently in higher education (HE),
or already in possession of an HE qualification. The
SSBE cohort were all monolingual speakers of
English (with varying foreign language knowledge
from school/university). The IndE cohort were all
highly proficient in English, and chosen on the basis
of their L1s representing two major Indo-Aryan
languages, Bengali (n=13) and Hindi (n=10), and four
major Dravidian languages, Telugu (n=8),
Malayalam (n=5), Tamil (n=4), and Kannada (n=2).
We additionally collected detailed information on
the Indian participants’ sociolinguistic background,
including medium of instruction in school, onset of
exposure to English during childhood, and school
type (state, private). Space constraints prevent us
from exploring all these variables here. However,
based on previous research and our initial analysis of
the data, we focus here on L1 background and explore
the difference between IndE speakers with Dravidian
and Indo-Aryan L1s.
2.2 Analysis
After inspecting the original dataset, we removed one
IndE participant who responded randomly during the
task, and any trials with a very short (< 100 ms) or
very long (> 10,000 ms) RT. This left us with 96.2%
of the total trials. The perception data were analysed
using generalised linear mixed-effects models
(binomial GLMM) with response (first vs second
syllable stressed) as dependent variable, acoustic
cues, i.e. vowel, intensity, duration, and pitch, and
language background (Model 1: SSBE vs IndE;
Model 2: SSBE vs IndE Indo-Aryan vs IndE
Dravidian) as fixed factors. Specific trials and
individual participants were kept as random factors.
Subsequently, we interpreted the coefficients of
acoustic cues as indicators of the relative weighting
of these cues in stress perception.
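As a sketch of the model structure just described, Model 1 can be written out as below. This is one reading of the description, not a specification given in the text: the cue-by-group interaction terms, the intercept-only random effects, and all symbols are assumptions. Here i indexes participants, j indexes trials, y_{ij} = 1 denotes a first-syllable response, x_{c,j} is a numeric coding of cue c in trial j (the coding scheme is not stated), and g_i is 1 for IndE listeners and 0 for SSBE listeners.

```latex
\begin{align*}
\operatorname{logit} \Pr(y_{ij} = 1)
  &= \beta_0 + \beta_g\, g_i
   + \sum_{c \in C} \beta_c\, x_{c,j}
   + \sum_{c \in C} \gamma_c\, x_{c,j}\, g_i
   + u_i + v_j, \\
C &= \{\text{vowel}, \text{intensity}, \text{duration}, \text{pitch}\}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
v_j \sim \mathcal{N}(0, \sigma_v^2).
\end{align*}
```

Under this reading, \beta_c is the weight of cue c for SSBE listeners, \beta_c + \gamma_c is the corresponding weight for IndE listeners, and a reliably nonzero \gamma_c corresponds to a group difference in sensitivity to that cue. Model 2 would replace g_i with two indicator variables, one for each IndE L1 family.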
3. RESULTS
3.1 Indian English and British English
In the first model, we contrasted IndE speakers with
SSBE speakers. Results indicate that vowel quality is
by far the most important acoustic cue for speakers of
both dialects and that IndE listeners are significantly
less sensitive to all acoustic cues except duration,
when compared to SSBE listeners (all p < .001). As a
consequence, the cue hierarchies for stress perception
differ to some extent between the two groups,
principally due to a lower ranking of pitch for IndE
listeners. Specifically, SSBE listeners followed a cue
hierarchy of vowel > intensity > pitch > duration,
while IndE listeners followed a somewhat different
hierarchy of vowel > intensity/duration > pitch (see
Fig. 3.1 for coefficients derived from the GLMM).
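To make the step from coefficients to a hierarchy concrete, the sketch below orders cues by absolute coefficient size and joins neighbouring cues with "/" when their magnitudes are close. The function name cue_hierarchy, the tie_margin threshold and the numeric values are all illustrative assumptions: the values are invented placeholders, not the coefficients plotted in Fig. 3.1, and the text does not state what criterion was used to treat two cues as tied.

```python
def cue_hierarchy(coefs, tie_margin):
    """Order cues by |coefficient|, largest first.

    A cue is joined to the preceding group with '/' when its magnitude
    is within tie_margin of the last cue already in that group.
    """
    ranked = sorted(coefs.items(), key=lambda kv: abs(kv[1]), reverse=True)
    if not ranked:
        return ""
    groups = [[ranked[0]]]
    for name, value in ranked[1:]:
        previous_value = groups[-1][-1][1]
        if abs(abs(previous_value) - abs(value)) < tie_margin:
            groups[-1].append((name, value))
        else:
            groups.append([(name, value)])
    return " > ".join("/".join(name for name, _ in group) for group in groups)


# Placeholder magnitudes chosen only to exercise the function;
# these are NOT the fitted GLMM coefficients.
example = {"vowel": 2.0, "intensity": 0.6, "duration": 0.55, "pitch": 0.2}
print(cue_hierarchy(example, tie_margin=0.1))
```

A fixed numeric margin is only one possible tie criterion; a significance test on the difference between two coefficients would be a natural alternative.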
Figure 3.1: Cue-weighting in stress perception in IndE
and SSBE participants (with standard errors).
3.2 Indian English L1 Background
To further investigate the influence of L1
background, we differentiated the IndE participants
based on their L1 language family (Indo-Aryan vs
Dravidian) and contrasted these two groups again
with SSBE, using a GLMM. While vowel quality still
emerges as the most important cue for all groups,
results indicate that L1 language family background
influences both cue hierarchy and cue strength in
stress perception. The cue hierarchy is vowel quality
> duration > pitch/intensity for L1 speakers of
Dravidian languages, and vowel quality > intensity >
duration (and no use of pitch) for L1 speakers of Indo-Aryan
languages (see Fig. 3.2 for coefficients
derived from the GLMM).
Furthermore, differences in cue strength are
significantly influenced by background L1 family:
cue strength for both vowel quality and intensity is
significantly lower for IndE listeners with Dravidian
L1s compared to both the Indo-Aryan and the SSBE
groups (p < .001), with no significant difference
between the latter two groups. By contrast, while cue
strength for duration is somewhat higher for the IndE
Dravidian compared to the Indo-Aryan group, and the
same holds for pitch, these differences do not reach
statistical significance. In sum, analysis by
background L1 family indicates that acoustic cues to
stress perception for IndE listeners with Indo-Aryan
L1s are more similar to SSBE listeners than to IndE
listeners with Dravidian L1s.
Figure 3.2: Cue-weighting in stress perception in SSBE,
Indo-Aryan and Dravidian IndE participants (with
standard errors).
4. DISCUSSION AND CONCLUSION
The results in part confirm our hypothesis that IndE
speakers would show a different perceptual response
to acoustic cues to lexical stress in English from
speakers of SSBE. Though on first inspection, IndE
speakers as a group appeared to be less sensitive to all
cues except duration, further analysis distinguishing
IndE speakers based on L1 revealed that L1 Indo-Aryan
speakers actually behave very similarly to
SSBE speakers, with regard to vowel quality,
intensity and duration, and cue hierarchy, and differ
only for pitch. In contrast, Dravidian speakers of IndE
showed a clearly different perceptual response from
SSBE speakers, with significantly less sensitivity to
vowel quality and intensity differences. At the same time, Dravidian
speakers also showed a slightly (but not significantly)
higher sensitivity to duration, which contributes to their
different cue hierarchy.
One possible explanation for these differences
between IndE cohorts is that they are attributable to
differences in L1 structural or phonetic properties
between the two language families. The limited
research on such properties across the relevant
languages suggests varying uses of cues to mark post-lexical
prominences, but without any obvious robust
alignment according to language family. For
example, in Telugu, Malayalam and Kannada,
phonetic cues of non-lexical prominences are
reported to include vowel duration [28, 29, 30, 31].
However, [24] reports Telugu listeners to be highly
variable in their judgments, and for Malayalam, [32]
reports listeners perceived stress to fall on the first
syllable regardless of vowel duration. Furthermore,
there are reportedly no reliable durational cues to
prominence placement in Tamil [13].
The use of a particular acoustic cue for unrelated
structural properties may be a relevant factor.
Notably, only Dravidian languages have lexically
contrastive vowel duration, and thus Dravidian
speakers may already be more attuned to differences
in vowel duration. This could explain the different
positioning of duration in their cue hierarchy. With
this in mind, a further step would be to differentiate
L1 Hindi and Bengali speakers too. While neither
language has contrastive vowel duration, there is
potentially a predisposition in Hindi to associate a
shorter vowel duration and more reduced quality with
a lack of stress (at least in open syllables). In contrast,
Bengali has no schwa and no indirect weak
correlation between stress and vowel duration.
A second possibility is that these differences
reflect variation in the participant profile for the two
language family cohorts. Initial exploration of
background variables reveals a small but significant
difference in the onset of their learning English, with
the Dravidian speakers on average starting at a
slightly later age. This may explain the relative
weakness of cue sensitivity in this cohort, compared
with the Indo-Aryan speakers.
Besides these differences between the two IndE
groups, a couple of similarities are noteworthy.
Firstly, both groups clearly show much greater
sensitivity to vowel quality cues than to any other cue,
and this is also true for SSBE. Secondly, for both
groups pitch ranks as the weakest (or, for Dravidian
speakers, the joint weakest) cue, in contrast with the
SSBE cohort, for whom it is weak but not the
weakest. One possible explanatory avenue to explore
might lie in pan-L1 interference from the
preponderance of pitch accents in Indian languages
(higher tonal density, low pitch word-initially).
In addition to further exploration of background
variables, a next step will be to analyse production
data for the same participants and to investigate
whether these cue strengths and hierarchies are
replicated in their own speech (for example the use of
vowel reduction to signal prosodic prominence).
5. REFERENCES
[1] Cutler, A. 2005. Lexical stress. In: Pisoni, D., Remez,
R. (eds) The Handbook of Speech Perception. NY,
USA: Blackwell Publishing, 264-289.
[2] Lehiste, I. 1970. Suprasegmentals. Cambridge, MA: MIT
Press.
[3] Beckman, M. 1986. Stress and non-stress accent.
Dordrecht: Foris.
[4] Sluijter, A., van Heuven, V. 1996. Acoustic correlates
of linguistic stress and accent in Dutch and American
English. Proceedings of the
International Conference on Spoken Language
Processing, Philadelphia, PA, 630-633.
[5] Okobi, A. 2006. Acoustic correlates of word stress in
American English. Unpublished PhD dissertation,
MIT.
[6] Mennen, I., Kelly, N., Mayr, R., Morris, J. 2020. The
effects of home language and bilingualism on the
realization of lexical stress in Welsh and Welsh
English. Front Psychol. Jan 22. doi:
10.3389/fpsyg.2019.03038.
[7] Werker, J., Tees, R. 1984. Cross-language speech
perception: Evidence for perceptual reorganization
during the first year of life. Infant Behavior &
Development 7(1), 49–63.
[8] Best, C., Tyler, M. 2007. Nonnative and second-language
speech perception: Commonalities and
complementarities. In: Bohn, O.-S., Munro, M. (eds),
Language experience in second language speech
learning: In honour of James Emil Flege. John
Benjamins, 13–34.
[9] Flege, J., MacKay, I. 2004. Perceiving vowels in a
second language. Studies in Second Language
Acquisition 26, 1-34.
[10] Tremblay, A. 2021. The past, present, and future of
stress in second-language word production and
recognition. In: Wayland, R. (ed) Second language
speech learning: Theoretical and empirical progress.
Cambridge: Cambridge University Press, 175–192.
[11] Chrabaszcz, A., Winn, M., Lin, C. Y., Idsardi, W. J.
2014. Acoustic cues to perception of word stress by
English, Mandarin, and Russian speakers. J. Speech.
Lang. Hear. Res. 57, 1468–1479.
[12] Census of India. 2011. Registrar General and Census
Commissioner of India.
[13] Keane, E. 2006. Prominence in Tamil. Journal of the
International Phonetic Association, 36(1), 1-20.
[14] Mehrotra, R. C. 1965. Stress in Hindi. Indian
Linguistics 26, 96-105.
[15] Bansal, R.K. 1970. A phonetic analysis of English
spoken by a group of well-educated speakers from
Uttar-Pradesh. CIEFL Bulletin (Hyderabad) 8, 1-11.
[16] Wiltshire, C. 2005. The “Indian English” of Tibeto-Burman
language speakers. English World-Wide
26(3), 275–300.
[17] Wiltshire, C. 2020. Uniformity and Variability in the
Indian English Accent. (Elements in World
Englishes). Cambridge: Cambridge University Press.
[18] Maxwell, O., Fletcher, J. 2009. Acoustic and
durational properties of Indian English vowels. World
Englishes 28(1), 52-70.
[19] Masica, C. 2005. Defining a linguistic area: South
Asia. New Delhi: Chronicle Books.
[20] Khan, S. D. 2016. The intonation of South Asian
languages. FASAL-6, UMass Amherst.
[21] Sirsa, H., Redford, M. 2013. The effects of native
language on Indian English sounds and timing
patterns. Journal of Phonetics 41(6), 393-406.
[22] Wiltshire, C., Harnsberger, J. 2006. The influence of
Gujarati and Tamil L1s on Indian English: A
preliminary study. World Englishes 25(1), 91-104.
[23] Pandey, P. 2016. Indian English prosody. In: Leitner,
G., Hashim, A., Wolf, H.-G. (eds) Communicating
with Asia. Cambridge: Cambridge University Press, 56-68.
[24] Lisker, L., Krishnamurti, Bh. 1991. Lexical stress in a
'stressless' language: judgments by Telugu- and
English-speaking linguists. Proceedings of
ICPhS 1991, Université de Provence, 90–93.
[25] Gargesh, R. 2004. Indian English: Phonology. In:
Schneider, E., Burridge, K., Kortmann, B., Mesthrie,
R., Upton, C. (eds) A handbook of varieties of English:
A multimedia reference tool (Vol. 1). Berlin,
Germany: Mouton de Gruyter, 992-1002.
[26] Moon, R. 2002. A comparison of the acoustic
correlates of focus in Indian English and American
English. Unpublished Master’s thesis, University of
Florida.
[27] Peirce, J. 2009. Generating stimuli for neuroscience
using PsychoPy. Front. Neuroinform. 2, 10.
[28] Balusu, R. 2001. Acoustic correlates of stress and
accent in Telugu. 21st South Asian Languages
Analysis Roundtable, University of Konstanz, 7-19.
[29] Asher, R. E., Kumari, T. C. 1997. Malayalam.
Routledge Descriptive Grammars. London:
Routledge.
[30] Narayan, S., Mahesh, S., Veeramani, P. 2019.
Acoustic cue for stress perception in Kannada
speaking children. Research and Reviews: Journal of
Neuroscience 9(3), 16-23.
[31] Savithri, S. R. 1999. Perception of word stress.
Proceedings of the Madras India Regional Conference
of the Acoustical Society of America (2). 110–113.
[32] Terzenbach, L. 2011. Malayalam prominence and
vowel duration: listener acceptability. Unpublished
Master’s dissertation, UT, Austin.