Lexical Stress Perception in Indian Englishes

2023, Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS 2023)

We report an experiment investigating the relative weighting of acoustic cues (vowel quality, intensity, duration and f0) in lexical stress perception in Indian English (IndE), compared with Southern Standard British English (SSBE). GLMM modelling of responses shows both similarities (e.g. vowel quality was by far the most important cue for both IndE and SSBE) and differences (IndE listeners were less sensitive to all cues except duration, and made least use of f0). Differentiating IndE participants according to L1 background (Indo-Aryan vs Dravidian), however, reveals a finer-grained picture, with L1 Indo-Aryan listeners exhibiting cue hierarchy and degree of cue strength that are closer to SSBE listeners. For L1 Dravidian listeners, while vowel quality remains the most important cue, the strength of this cue, and that of intensity, are significantly lower than for L1 Indo-Aryan and SSBE listeners. At the same time, duration ranks more highly for these listeners.

Payne, Elinor, Olga Maxwell, Robert Fuchs and Yizhou Wang. 2023. Lexical Stress Perception in Indian Englishes. Proceedings of the 20th International Congress of Phonetic Sciences (ICPhS 2023), Prague.

Elinor Payne (a), Olga Maxwell (b), Robert Fuchs (c) and Yizhou Wang (b)
(a) University of Oxford, (b) University of Melbourne, (c) University of Hamburg
elinor.payne@phon.ox.ac.uk, omaxwell@unimelb.edu.au, robert.fuchs@uni-hamburg.de, yizhouw4@student.unimelb.edu.au

Keywords: Stress, perception, Indian English, SSBE, acoustic cues.

1. INTRODUCTION

The perception of stress is complex and subject to cross-linguistic variation that remains poorly understood. English is considered to be a stress language, with lexical stress cued by a combination of acoustic parameters that make stressed syllables more salient to listeners [1].
While the main acoustic cues to stress in SSBE and General American are reported as being higher f0, longer duration, greater intensity and full vowel quality [e.g. 2], there is little consensus over relative cue weighting and interaction. Some [e.g. 3] claim relative pitch to be dominant, while others assert its unreliability [4], or propose duration instead [5]. Furthermore, findings for one variety of English are not necessarily generalisable to another, as has been found, for example, for Welsh English [6].

Previous limited research suggests that speakers of new (‘non-settler’) Englishes spoken in multilingual societies, e.g. Singapore English or Indian English, may lack a robust distinction between stressed and unstressed syllables, and/or may employ different acoustic cues, due to the influence of other languages in sustained historic contact during the development of these varieties. In addition, for many speakers, even those with native-like command, English may not be the predominant or first language, and thus there may also be ongoing contact influence from another language.

With regard to L2 perception, extensive research has shown that the phonological system of the L1, acquired very early in life [7], critically shapes the perception of sounds in L2, and that the influence is greater the later the L2 is acquired; for example, segmental contrasts that do not exist in L1 are difficult for adult learners to acquire [8, 9]. L1-mediated perceptual bias also affects prosody. Research has shown that adult listeners’ ability to perceive and learn lexical stress is closely linked to the mechanisms used to signal stress in their L1 [10]. In a study on the influence of typologically diverse L1s (English, Russian and Mandarin) on the perception of lexical stress in English, Chrabaszcz et al. [11] found that while vowel quality was the strongest cue for all groups, pitch was the second strongest cue for English and Mandarin listeners, but was virtually disregarded by Russian listeners, who in turn used duration and intensity cues to a greater extent. This is despite Russian being a stress language, like English, and unlike Mandarin, which is a tonal language.

1.1 Indian English

‘Indian English’ (IndE) refers to the variety(/ies) of English used by speakers in India and by some in the global Indian diaspora. English is one of the official languages of India and is used in a wide range of domains (e.g. the media, education, business, government, literary writing), but coexists with a large range of diverse Indian languages. IndE is usually an L2, with the vast majority of speakers in India also speaking at least one and often two or three Indian languages. In the 2011 census [12], around 10% of inhabitants declared English as their L2 or L3, and a tiny minority (0.02%) as their L1.

Lexical stress in IndE is of particular interest because most, if not all, Indian languages lack contrastive lexical stress [e.g. 13], and any word-level prominences are typically determined by structural properties such as syllable structure or position in the word [14]. Previous research on IndE phonological and phonetic features more generally indicates strong evidence for the influence of indigenous languages [e.g. 15, 16, 17, 18], leading to identifiable accents of IndE, either through historical influence or from continued contact in a multilingual society. However, pan-Indian features also arise, through serendipitous cross-L1 commonalities (‘areal features’, see [19, 20]), or through standardisation and homogenisation of IndE [21] due in part to its ‘self-replicating nature’, where English is taught to Indians by Indians [22]. Hence, the question arises as to whether and how this complex multilingual backdrop shapes the perception of lexical stress in IndE.
A further factor in the strength of ongoing L1 influence concerns educational background, known to play a significant role in distinguishing basilectal and acrolectal speakers of IndE [23]. While IndE is “primarily being transmitted through school” [17], the Indian education system comprises different types of schools providing varying levels of exposure to varying kinds of English. Government schools (especially in remote areas) are less likely to be English-medium, and more likely to have teachers who are either less fluent in English, or whose IndE is more influenced by a local or state language. By contrast, ‘public’ (private) schools are more likely to be English-medium, to provide full immersion in English, and to have teachers who are more proficient in English. There has also been a rapid rise in the number of international schools, with curricula closely modelled on British or US educational systems, some with British or US English-speaking teachers.

1.2 Stress/prominence in Indian L1s

The notion of ‘stress’ is an elusive phenomenon in South Asian languages, with speakers reporting difficulty in its auditory identification, and with disagreement over the presence and type of acoustic cues and over the location of prominence [e.g. 24]. Unlike English, all Indian languages lack lexically contrastive stress, although prominences at the word level do exist. These are derived from language-specific phonological and syllable structure considerations, and appear to serve what may be considered a post-lexical role, viz. the existence of a smaller, word-like prosodic constituent in Indian languages (and the higher tonal density that results). Auditory studies have claimed that speakers transfer these properties onto IndE, by accenting most words (cf. [25]), and in L1-specific ways. Investigating phonetic cues in IndE speakers’ (with Hindi or Telugu L1) production of accentual and focal prominence, Moon [26] showed that neither IndE group relied on durational cues.
However, while the two groups used similar values for maximum f0, IndE speakers with a Hindi L1 background showed greater lowering of f0 at the beginning of the accented syllable, possibly reflecting differences in pitch accent type and/or alignment in their L1.

Given the controversial status of ‘stress’ in Indian languages, and their very close contact with IndE, we might reasonably expect L1 properties to shape the perceptual sensitivity of IndE listeners to lexical stress cues. While we might expect this to be strongly the case for multilingual speakers with IndE as L2, it could also be true for those who learn IndE at a very early age and have a ‘native’ command, since even if the hypothesised differences historically arose through contact, they could now be an intrinsic property of IndE, and thus be present even in monolingual speakers. We hypothesise lower sensitivity to stress cues in general, compared to SSBE listeners, and, following the results of [26], that duration plays little to no role in stress perception by IndE listeners. Given variable usage of acoustic cues across L1s for post-lexical prominences, and the scant research on this area for Dravidian languages in particular, potential differences in relative cue weighting are harder to predict. This paper reports a preliminary study exploring these questions.

2. METHOD

2.1 Perceptual experiment

To investigate these questions, we used stimuli from Chrabaszcz et al. [11], which were modified natural recordings of the disyllabic nonword ‘maba’, produced by a male speaker of North American English (see [11: 5] for a description of how pitch, duration, intensity and vowel quality were manipulated). Stimuli were varied according to a two-level factorial manipulation of each variable on each syllable, resulting in 256 unique tokens.

Figure 2.1: Screen layout for experiment in PsychoPy.
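
The token count follows from crossing two levels of each of the four cues on each of the two syllables (2^8 = 256). The sketch below enumerates such a design as an illustration only; the level labels and factor names are hypothetical and are not taken from the stimulus files of [11].

```python
from itertools import product

# The four manipulated cues and the two syllables of 'maba' (from the text).
CUES = ("vowel", "intensity", "duration", "pitch")
SYLLABLES = ("syll1", "syll2")
# Hypothetical level labels; the source only states that each variable
# had two levels on each syllable.
LEVELS = ("stressed_like", "unstressed_like")


def build_design():
    """Return one dict per token, covering every cue-by-syllable combination."""
    factors = [f"{cue}_{syll}" for cue in CUES for syll in SYLLABLES]
    return [
        dict(zip(factors, combo))
        for combo in product(LEVELS, repeat=len(factors))
    ]


tokens = build_design()
print(len(tokens))  # 256
```

Each dictionary describes one token, so the list can be used directly as a trial list or joined to response data by its eight factor columns.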
All participants performed a forced-choice auditory identification task (implemented in PsychoPy [27] and hosted online on Pavlovia), indicating the location of the stressed syllable by clicking on the corresponding button on a computer screen (see Fig. 2.1); the selection also indicated degree of confidence. Reaction times (RTs) were also recorded. In addition, participants received 16 practice trials to familiarise themselves with the procedure. Participants were instructed to use headphones for completing the task.

Participants consisted of 44 IndE speakers, resident in India, and 29 SSBE speakers, resident in the UK. Due to COVID restrictions, the experiment was carried out online. Participants were aged 18-34 years and either currently in higher education (HE), or already in possession of an HE qualification. The SSBE cohort were all monolingual speakers of English (with varying foreign language knowledge from school/university). The IndE cohort were all highly proficient in English, and chosen on the basis of their L1s representing two major Indo-Aryan languages, Bengali (n=13) and Hindi (n=10), and four major Dravidian languages, Telugu (n=8), Malayalam (n=5), Tamil (n=4), and Kannada (n=2).

We additionally collected detailed information on the Indian participants’ sociolinguistic background, including medium of instruction in school, onset of exposure to English during childhood, and school type (state, private). Space constraints prevent us from exploring all these variables here. However, based on previous research and our initial analysis of the data, we focus here on L1 background and explore the difference between IndE speakers with Dravidian and Indo-Aryan L1s.

2.2 Analysis

After inspecting the original dataset, we removed one IndE participant who responded randomly during the task, and any trials with a very short (< 100 ms) or very long (> 10,000 ms) RT. This left us with 96.2% of the total trials.
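
The exclusion step can be sketched as a simple filter. This is a minimal illustration, assuming one record per trial with hypothetical field names (`participant`, `rt_ms`); the toy records are invented and are not the study's data. Only the two RT thresholds come from the text.

```python
RT_MIN_MS = 100      # trials faster than this are dropped (from the text)
RT_MAX_MS = 10_000   # trials slower than this are dropped (from the text)


def filter_trials(trials, excluded_participants=()):
    """Drop trials from excluded participants and trials with out-of-range RTs."""
    excluded = set(excluded_participants)
    return [
        t for t in trials
        if t["participant"] not in excluded
        and RT_MIN_MS <= t["rt_ms"] <= RT_MAX_MS
    ]


def retention_rate(kept, total):
    """Share of trials retained, as a percentage."""
    return 100.0 * len(kept) / len(total)


# Invented toy records, for illustration only.
toy_trials = [
    {"participant": "p1", "rt_ms": 850},
    {"participant": "p1", "rt_ms": 50},      # too fast
    {"participant": "p2", "rt_ms": 12_000},  # too slow
    {"participant": "p2", "rt_ms": 640},
    {"participant": "p3", "rt_ms": 700},     # participant excluded
]
kept = filter_trials(toy_trials, excluded_participants=["p3"])
print(len(kept), retention_rate(kept, toy_trials))  # 2 40.0
```

Whether the reported retention figure counts the excluded participant's trials in its denominator is not stated in the text, so the sketch simply takes all supplied trials as the total.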
The perception data were analysed using generalised linear mixed-effects models (binomial GLMM) with response (first vs second syllable stressed) as dependent variable, and with acoustic cues (vowel, intensity, duration and pitch) and language background (Model 1: SSBE vs IndE; Model 2: SSBE vs IndE Indo-Aryan vs IndE Dravidian) as fixed factors. Specific trials and individual participants were kept as random factors. Subsequently, we interpreted the coefficients of acoustic cues as indicators of the relative weighting of these cues in stress perception.

3. RESULTS

3.1 Indian English and British English

In the first model, we contrasted IndE speakers with SSBE speakers. Results indicate that vowel quality is by far the most important acoustic cue for speakers of both dialects and that IndE listeners are significantly less sensitive to all acoustic cues except duration, when compared to SSBE listeners (all p < .001). As a consequence, the cue hierarchies for stress perception differ to some extent between the two groups, principally due to a lower ranking of pitch for IndE listeners. Specifically, SSBE listeners followed a cue hierarchy of vowel > intensity > pitch > duration, while IndE listeners followed a somewhat different hierarchy of vowel > intensity/duration > pitch (see Fig. 3.1 for coefficients derived from the GLMM).

Figure 3.1: Cue-weighting in stress perception in IndE and SSBE participants (with standard errors).

3.2 Indian English L1 Background

To further investigate the influence of L1 background, we differentiated the IndE participants based on their L1 language family (Indo-Aryan vs Dravidian) and contrasted these two groups again with SSBE, using a GLMM. While vowel quality still emerges as the most important cue for all groups, results indicate that L1 language family background influences both cue hierarchy and cue strength in stress perception.
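
One way to write out the three-group model behind this comparison is given below. It is a sketch under assumptions that the text does not confirm: cue-by-group interaction terms, SSBE as the reference level, random intercepts only, and a response coded 1 when the first syllable is judged stressed. The symbols are introduced here for illustration, and `\operatorname` requires the amsmath package.

```latex
\[
\operatorname{logit} \Pr(y_{ij} = 1)
  = \beta_0
  + \sum_{c} \beta_c \, x_{cj}
  + \sum_{g} \gamma_g \, G_{gi}
  + \sum_{c} \sum_{g} \delta_{cg} \, x_{cj} \, G_{gi}
  + u_i + v_j,
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
v_j \sim \mathcal{N}(0, \sigma_v^2)
\]
```

Here i indexes participants and j trials, c ranges over the four cues (vowel, intensity, duration, pitch), x_{cj} codes the setting of cue c in trial j, and G_{gi} indicates whether participant i belongs to group g (Indo-Aryan or Dravidian). Under this coding, the weight of cue c is \beta_c for SSBE listeners and \beta_c + \delta_{cg} for group g, and ordering these values by magnitude gives that group's cue hierarchy.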
The cue hierarchy is vowel quality > duration > pitch/intensity for L1 speakers of Dravidian languages, and vowel quality > intensity > duration (and no use of pitch) for L1 speakers of Indo-Aryan languages (see Fig. 3.2 for coefficients derived from the GLMM). Furthermore, differences in cue strength are significantly influenced by background L1 family: cue strength for both vowel quality and intensity is significantly lower for IndE listeners with Dravidian L1s compared to both the Indo-Aryan and the SSBE groups (p < .001), with no significant difference between the latter two groups. By contrast, while cue strength for duration is somewhat higher for the IndE Dravidian compared to the Indo-Aryan group, and the same holds for pitch, these differences do not reach statistical significance. In sum, analysis by background L1 family indicates that acoustic cues to stress perception for IndE listeners with Indo-Aryan L1s are more similar to SSBE listeners than to IndE listeners with Dravidian L1s.

Figure 3.2: Cue-weighting in stress perception in SSBE, Indo-Aryan and Dravidian IndE participants (with standard errors).

4. DISCUSSION AND CONCLUSION

The results in part confirm our hypothesis that IndE speakers would show a different perceptual response to acoustic cues to lexical stress in English from speakers of SSBE. Though on first inspection IndE speakers as a group appeared to be less sensitive to all cues except duration, further analysis distinguishing IndE speakers based on L1 revealed that L1 Indo-Aryan speakers actually behave very similarly to SSBE speakers with regard to vowel quality, intensity and duration, and cue hierarchy, and differ only for pitch. In contrast, Dravidian speakers of IndE showed a clearly different perceptual response from SSBE speakers, with significantly less sensitivity to vowel quality and intensity differences, and a different cue hierarchy.
At the same time, Dravidian speakers also showed a slightly (but not significantly) higher sensitivity to duration, and therefore a different cue hierarchy.

One possible explanation for these differences between IndE cohorts is that they are attributable to differences in L1 structural or phonetic properties between the two language families. The limited research on such properties across the relevant languages suggests varying uses of cues to mark post-lexical prominences, but without any obvious robust alignment according to language family. For example, in Telugu, Malayalam and Kannada, phonetic cues of non-lexical prominences are reported to include vowel duration [28, 29, 30, 31]. However, [24] reports Telugu listeners to be highly variable in their judgments, and for Malayalam, [32] reports that listeners perceived stress to fall on the first syllable regardless of vowel duration. Furthermore, there are reportedly no reliable durational cues to prominence placement in Tamil [13].

The use of a particular acoustic cue for unrelated structural properties may be a relevant factor. Notably, only Dravidian languages have lexically contrastive vowel duration, and thus Dravidian speakers may already be more attuned to differences in vowel duration. This could explain the different positioning of duration in their cue hierarchy. With this in mind, a further step would be to differentiate L1 Hindi and Bengali speakers too. While neither language has contrastive vowel duration, there is potentially a predisposition in Hindi to associate a shorter vowel duration and more reduced quality with a lack of stress (at least in open syllables). In contrast, Bengali has no schwa and no indirect weak correlation between stress and vowel duration.

A second possibility is that these differences reflect variation in the participant profile for the two language family cohorts.
Initial exploration of background variables reveals a small but significant difference in the onset of their learning English, with the Dravidian speakers on average starting at a slightly later age. This may explain the relative weakness of cue sensitivity in this cohort, compared with the Indo-Aryan speakers.

Besides these differences between the two IndE groups, a couple of similarities are noteworthy. Firstly, both groups clearly show much greater sensitivity to vowel quality cues than to any other cue, and this is also true for SSBE. Secondly, for both groups pitch ranks as the weakest (or, for Dravidian speakers, the joint weakest) cue, in contrast with the SSBE cohort, for whom it is weak but not the weakest. One possible explanatory avenue to explore might lie in pan-L1 interference from the preponderance of pitch accents in Indian languages (higher tonal density, low pitch word-initially).

In addition to further exploration of background variables, a next step will be to analyse production data for the same participants and to investigate whether these cue strengths and hierarchies are replicated in their own speech (for example the use of vowel reduction to signal prosodic prominence).

5. REFERENCES

[1] Cutler, A. 2005. Lexical stress. In: Pisoni, D., Remez, R. (eds) The Handbook of Speech Perception. NY, USA: Blackwell Publishing, 264–289.
[2] Lehiste, I. 1970. Suprasegmentals. Cambridge, MA: MIT Press.
[3] Beckman, M. 1986. Stress and non-stress accent. Dordrecht: Foris.
[4] Sluijter, A., van Heuven, V. 1996. Acoustic correlates of linguistic stress and accent in Dutch and American English. Proceedings of the International Congress of Spoken Language Processing, Philadelphia, PA, 630–633.
[5] Okobi, A. 2006. Acoustic correlates of word stress in American English. Unpublished PhD dissertation, MIT.
[6] Mennen, I., Kelly, N., Mayr, R., Morris, J. 2020. The effects of home language and bilingualism on the realization of lexical stress in Welsh and Welsh English. Front. Psychol., Jan 22. doi: 10.3389/fpsyg.2019.03038.
[7] Werker, J., Tees, R. 1984. Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior & Development 7(1), 49–63.
[8] Best, C., Tyler, M. 2007. Nonnative and second-language speech perception: Commonalities and complementarities. In: Bohn, O.-S., Munro, M. (eds) Language experience in second language speech learning: In honour of James Emil Flege. John Benjamins, 13–34.
[9] Flege, J., MacKay, I. 2004. Perceiving vowels in a second language. Studies in Second Language Acquisition 26, 1–34.
[10] Tremblay, A. 2021. The past, present, and future of stress in second-language word production and recognition. In: Wayland, R. (ed) Second language speech learning: Theoretical and empirical progress. Cambridge: Cambridge University Press, 175–192.
[11] Chrabaszcz, A., Winn, M., Lin, C. Y., Idsardi, W. J. 2014. Acoustic cues to perception of word stress by English, Mandarin, and Russian speakers. J. Speech Lang. Hear. Res. 57, 1468–1479.
[12] Census of India. 2011. Registrar General and Census Commissioner of India.
[13] Keane, E. 2006. Prominence in Tamil. Journal of the International Phonetic Association 36(1), 1–20.
[14] Mehrotra, R. C. 1965. Stress in Hindi. Indian Linguistics 26, 96–105.
[15] Bansal, R. K. 1970. A phonetic analysis of English spoken by a group of well-educated speakers from Uttar-Pradesh. CIEFL Bulletin (Hyderabad) 8, 1–11.
[16] Wiltshire, C. 2005. The “Indian English” of Tibeto-Burman language speakers. English World-Wide 26(3), 275–300.
[17] Wiltshire, C. 2020. Uniformity and Variability in the Indian English Accent. (Elements in World Englishes). Cambridge: Cambridge University Press.
[18] Maxwell, O., Fletcher, J. 2009. Acoustic and durational properties of Indian English vowels. World Englishes 28(1), 52–70.
[19] Masica, C. 2005. Defining a linguistic area: South Asia. New Delhi: Chronicle Books.
[20] Khan, S. D. 2016. The intonation of South Asian languages. FASAL-6, UMass Amherst.
[21] Sirsa, H., Redford, M. 2013. The effects of native language on Indian English sounds and timing patterns. Journal of Phonetics 41(6), 393–406.
[22] Wiltshire, C., Harnsberger, J. 2006. The influence of Gujarati and Tamil L1s on Indian English: A preliminary study. World Englishes 25(1), 91–104.
[23] Pandey, P. 2016. Indian English prosody. In: Leitner, G., Hashim, A., Wolf, H.-G. (eds) Communicating with Asia. Cambridge: CUP, 56–68.
[24] Lisker, L., Krishnamurti, Bh. 1991. Lexical stress in a 'stressless' language: judgments by Telugu- and English-speaking linguists. Proceedings of ICPhS 1991, Université de Provence, 90–93.
[25] Gargesh, R. 2004. Indian English: Phonology. In: Schneider, E., Burridge, K., Kortmann, B., Mesthrie, R., Upton, C. (eds) A handbook of varieties of English: A multimedia reference tool (Vol. 1). Berlin, Germany: Mouton de Gruyter, 992–1002.
[26] Moon, R. 2002. A comparison of the acoustic correlates of focus in Indian English and American English. Unpublished Master’s thesis, University of Florida.
[27] Peirce, J. 2009. Generating stimuli for neuroscience using PsychoPy. Front. Neuroinform. 2, 10.
[28] Balusu, R. 2001. Acoustic correlates of stress and accent in Telugu. 21st South Asian Languages Analysis Roundtable, University of Konstanz, 7–19.
[29] Asher, R. E., Kumari, T. C. 1997. Malayalam. Routledge Descriptive Grammars. London: Routledge.
[30] Narayan, S., Mahesh, S., Veeramani, P. 2019. Acoustic cue for stress perception in Kannada speaking children. Research and Reviews: Journal of Neuroscience 9(3), 16–23.
[31] Savithri, S. R. 1999. Perception of word stress. Proceedings of the Madras India Regional Conference of the Acoustical Society of America (2), 110–113.
[32] Terzenbach, L. 2011. Malayalam prominence and vowel duration: listener acceptability. Unpublished Master’s dissertation, UT Austin.