Projects by Barbara Sonnenhauser
(2023). (Dis-)entangling traditions in the Central Balkans: Performance and perception. The case of Torlak. In: Zeitschrift für Slavische Philologie. Forthcoming. Sonnenhauser, Barbara, Blerta Ismajli, and Paul Widmer (2023). Pairing peers and pears. Changing conventions of Gheg-Albanian heritage speakers. In: Language Dynamics and Change.
Linguistic diversity and multilingualism are central concepts of Swiss language policy and the Swiss linguistic landscape, and they also concern the heritage languages of migrant groups. Although Albanian-speaking communities (mostly from Kosovo and Macedonia) have been among the largest migrant groups in the German-speaking area, and in Switzerland in particular, since the 1980s, little is known about the language and linguistic behaviour of this speech community, which by now spans several generations, despite important pioneering works (Caprez-[...]). Given that social and economic integration and participation, as well as identity formation, are and always have been inextricably linked with language, with enabled and encouraged language practice, and with language awareness, the investigation of heritage-language practice and its interaction with the new majority languages is an urgent desideratum. This project develops a comprehensive picture of the linguistic practice of heritage speakers of Albanian and of the language(s) they use over time and in diverse contact situations, by combining approaches and methods from heritage-language linguistics and didactics with those of contact linguistics, sociolinguistics and variationist linguistics. Combining these approaches enables (a) actor-centred, bottom-up insights into the contact-induced specification of features (retention or change), which in turn are the prerequisite for modelling language contact in language history and language evolution, and (b) the development of application-oriented solutions for maintaining the heritage language as an instrument of integration via social participation and for strengthening the multilingualism that is indispensable in today's Europe.
Substantial preliminary work exists in all of these areas, but both heritage-language linguistics and didactics and contact linguistics lack robust studies covering longer time spans under maximally controlled sociocultural conditions. This concerns studies of Albanian as a heritage language in a German-speaking environment and of the various contact phenomena in the varieties involved, as well as the micro-perspective tracing of the sociocultural motivation and the linguistic processes of language development in specific contact configurations. The linguistic data are elicited in three generations of speakers by means of various stimuli and collected in family networks as well as through crowdsourcing; sociocultural information is obtained through biographical and narrative interviews. The texts are double-coded, by linguists and, in a crowdsourcing procedure, by heritage speakers; for the latter, language awareness and attitudes towards the heritage language can thus be recorded at the same time. The resulting sociolinguistic speaker profiles and language attitudes are correlated with the linguistic data. On the basis of changes in language structure across the speaker generations, contact effects are thus extracted and evaluated qualitatively and quantitatively under well-controllable conditions. Combining structural and sociolinguistic approaches for a comprehensive analysis of the linguistic, sociocultural and socio-political relevance of heritage-language aspects promises, in this combination and for all disciplinary perspectives mentioned, a significant gain in scientific knowledge as well as application-oriented products in the form of didactic and pedagogical materials.
Specifically, empirical data for estimating contact influence in phylogenetic models are made available to contact linguistics, and teaching materials are produced for heritage-language instruction. The results are disseminated in the form of qualification theses, scholarly publications, information brochures and websites; the collected data are deposited and made available in suitable repositories (DaSCH).
In the tradition of Balkan studies, the focus has mostly been on the converging tendencies and the convergent features shaping this area in linguistic and cultural respects. Less studied are the diverging tendencies and divergent features, which are to a large part triggered by various kinds of boundaries crossing this area. Moreover, the individual subjects actually shaping and perceiving this plurality-in-unity still await a detailed study. The present project takes up these desiderata, investigating on an empirical basis the role of boundaries – geographical, political, perceptual – in the transformation of linguistic and cultural traditions and their perception by the acting subjects. To do so, linguistic methods (from dialectology, areal typology, ethnolinguistics) will be combined with language technology (corpus linguistics, natural language processing, geographic information sciences). This combination of qualitative and quantitative data will allow conclusions to be drawn about the traditional culture of an area as expressed within the framework of a local language. The focus will be on the Torlak region, which represented a linguistic and cultural unit until 1878 but has been divided by the Serbian-Bulgarian political boundary since the end of the 19th century. In addition to collecting new authentic regional linguistic data in Serbia and Bulgaria and probing into the role of boundaries in linguistic and cultural transitions, the project aims at raising awareness among the speakers of the traditional values connected to Torlak and at contributing to the preservation of an endangered language and culture. An additional project outcome will be the development of new and/or improvement of existing tools for corpus linguistics and natural language processing that will also be applicable to cultural studies.
Documentation, preservation and awareness raising will contribute to the sustainability of the project.
The South Slavic dialect continuum is characterised by an intricate encounter of affiliations: genealogically, it is intersected by an old bundle of isoglosses differentiating West and East South Slavic; areally, parts of it – the 'ill-bred' sons, as Schleicher (1850) called them – share a number of morpho-syntactic innovations with their neighbouring non-Slavic languages. The resulting variation becomes visible most distinctly in the Torlak dialects. Being located at the peripheries of contemporary Serbian, Bulgarian and Macedonian, they are transitional between West and East South Slavic. Yet, spoken at the outskirts of the spread zones of Balkan innovations, they are transitional between Balkan Slavic (BS) and non-Balkan South Slavic as well. A linguistic description and analysis of Torlak thus calls for bringing together insights from dialectology and areal typology in order to study the interaction of innovations diffused through language contact with the inherited genealogical features. The project focuses on a set of morpho-syntactic BS innovations and their diffusion and integration into the South Slavic system from a diatopic and diachronic perspective by contrasting Torlak with the surrounding varieties and by drawing on evidence from pre-standardised vernacular sources. Systematically analysing the co-occurrence of areal innovations with inherited features will facilitate a principled description and analysis of Torlak, provide new insight into the history of Balkan Slavic and contribute to a better understanding of dialectal and areal contact and the concomitant processes of convergence and divergence.
The combination of a diachronic and diatopic perspective necessitates the comparison of non-standard data along two dimensions: (i) eastern BS with western BS varieties, and both with non-BS Serbian, in order to trace the geographical extension of the relevant structures in their formal and functional aspects; (ii) contemporary data with earlier stages displayed in 18th/19th c. literary sources, in order to gain insight into the diachronic diffusion of BS innovations. The focus will be on three features: (1) clitic doubling, (2) postponed definiteness marking, (3) existence of analytic and synthetic past tenses. These features are also found in languages outside the BS and Balkan area, which places the project in the larger context of (1') conditions on clitic doubling in neighbouring non-Slavic languages, (2') postponed definiteness markers in North Russian, (3') development of (former) perfects in Slavic. This in turn opens up a broader contact linguistic and areal-typological perspective, concerning in particular (1'') Slavic – Albanian/Romance, (2'') Slavic – Baltic – Finno-Uralic, and (3'') Slavic within the linguistic Europe. The analysis will be based on annotated corpora for each of the varieties to be compared. With more fine-grained data, it becomes possible to establish correlations between features and structures and hence to reveal usage conditions, to illustrate converging and diverging developments, in particular as concerns their functions, and to map the data in time and space by geo-referencing and visualising them with GIScience technology. To this end, existing processing tools will be improved in such a manner that they can be applied to these still under-resourced languages. This methodological aspect adds a further dimension to the project, beyond its contribution to (Slavic) dialect syntax and the linking of dialectology and areal typology.
The South Slavic languages – Slovene, Bosnian-Croatian-Serbian, Bulgarian, and Macedonian – constitute a quite heterogeneous group of languages. On the one hand, they form a dialect continuum, while on the other hand, they are part of different areal convergence zones. The intersection of dialectal and areal affiliation is further complicated by the individual standardisation histories, which may have led to differentiations on the level of the standard languages that are not necessarily observed for the spoken varieties and dialects. This makes the investigation of non-standard and historical data an urgent requirement.
One case in point illustrating this complex situation is clausal complementation (CC), the topic of the proposed workshop. The linguistic complexity of South Slavic has been impeding studies covering the unity and diversity of the CC strategies encountered for these languages in their entirety. Moreover, CC itself is still an under-defined notion. Crucially, discussing CC in South Slavic from various points of view, applying different theoretical approaches and methods of investigation, as shall be done by the contributions to this workshop, necessitates a clear elaboration of the key notions used in the individual analyses.
Against this background, the intended workshop aims at gaining a detailed picture of the diverse patterns of CC in South Slavic by bringing together researchers who focus on different aspects of this topic. Striving to arrive at a clearer concept of CC, the workshop also contributes to linguistic concept formation, which will enhance further collaborative investigations. In addition, the workshop intends to explore the possibilities of creating a unified platform of corpora and similar kinds of data necessary for empirical research in this domain.
Brozović (1988) describes contemporary Slovene as 'one of the most original and individual' Slavic languages, as a 'highly individual linguistic phenomenon'. He attributes this to a complex of factors, namely "its material base, the way it evolved, its special paths of development and […] the specific circumstances in which it was elaborated" (1988: 185). This also outlines the research programme of the present project, which uses Slovene as an example to trace the interaction of internal development, external impulses and metalinguistic factors that contributes to the functional development of linguistic structures. Particular attention is paid to the shaping of the standard variety and its metalinguistic classification within Slavic and in the wider non-Slavic context. The diachronic development is traced by means of synchronic cross-sections, which concentrate above all on the 16th century as the time when Slovene began to be used as a literary language, the second half of the 18th century as the time when regional literary-language variants emerged, the 19th century as the beginning of standardisation, and contemporary Slovene as an established standard language. Like every standard language, Slovene is to a large extent the result of an intentional selection from a set of variants. At the level of description, too, not all structures are usually taken into account; this applies in particular to the lexical-semantic domain. The standard-language restriction of structural variance and the specific descriptive focus can thus reinforce the impression that Slovene is special, whereas a look at alternative developments and distinctions not (yet) captured could put this impression into perspective.
Accordingly, the project considers not only the development in Standard Slovene and its precursors, but also the situation at the periphery of the Slovene language area and in the neighbouring South and West Slavic languages. The influence of the surrounding non-Slavic contact languages, German in particular, is also to be investigated. At the metalinguistic level, the role of contemporaneous Central European grammar writing and the influence of the linguistic background of the various grammarians and codifiers need to be examined. The question of the individuality of Slovene and of its integration within Slavic and within its area is investigated on the basis of morphosyntactic and lexico-grammatical structures that particularly characterise the specific position of Slovene within (South) Slavic, its position between West and South Slavic, and its non-Slavic contact situation: relative clause formation with ki or kateri, the use of the supine or the infinitive, the expression of the concept KNOW by vedeti or znati, and the use of znati, lahko or (ne) moči in the domain of possibility modality. By analysing object-language and metalinguistic factors and their interaction, the project traces the influence of data analysis, description and prescription on internal language development and on the linguistic classification of Slovene. In addition to a concise functional analysis of the relevant
The aim of this project is to lay the foundations for the automatic translation of Russian verbal aspect into German. This is to be done with the help of the translation program ETAP, which is already mature in many respects but so far uses only rudimentary rules in the domain of grammatical aspect. Preparing this demanding area of Russian for automatic translation thus remains a major desideratum. In the first phase of the project, for which start-up funding is being requested, the basic work steps are to be discussed and planned in intensive exchange, including in person, between the Institut für Slavische Philologie (LMU) and the Department of Computational Linguistics (Russian Academy of Sciences); in the longer term, these steps can then be worked on, carried out and implemented as rules in ETAP.
Papers by Barbara Sonnenhauser
Italian Journal of Linguistics, 2024
Loss of illocution, the presence of a clause-initial connective and word order are usually taken to be indicators of structural asymmetries in clause combining.
In this exploratory study, we aim to operationalize the relation between word order and clause type, using Russian, Polish, and Slovene as representatives of the three major Slavic branches. On the assumption that clause-initial function words may indicate subordination, we analyze the distribution of the presumably unequivocal complementizers że (Pol.), čto (Ru.) and da (Slv.), and compare them with Pol. niech, Ru. pust’ and Slv. naj. These elements function as illocutionary force indicating devices for directive speech acts, but at the same time show complementizer-like properties when introducing a clause that follows a clause containing a complement-taking predicate. Using corpus data from two different diachronic stages, we try to establish diachronic and cross-linguistic patterns that provide information on possible links between word order and the ‘complementizerhood’ of a clause-initial element. Our findings reveal that the concepts of word order and connective, i.e. the very concepts that are often used for diagnosing subordination, seem to be ill-defined and need to be reconsidered on the basis of thorough empirical research.
Research Square, Sep 14, 2022
This paper presents the Gheg Albanian Pear Stories treebank to be released in Nov. 2022, which is the first resource for Gheg in the Universal Dependencies (UD) treebank collection (Nivre et al. 2020). It also provides a special combination of spoken modality and heritage language, both of which are underrepresented in UD and in corpus resources in general. We provide a short description of the grammatical features of Gheg and of how they translate to categories in the UD annotation scheme, in contrast with the Standard Albanian resources of Kote et al. (2019) and Toska, Nivre, and Zeman (2020). Special attention is given to the challenges arising from the spoken modality and the multilingual context, such as disfluency, repair, and code-switching.
Zeitschrift für Slawistik, 2023
The traditional classificatory dichotomy of coordination vs. subordination in clause linkage has long been replaced by parametric concepts (e. g. Lehmann 1988, Weiss 1989, Raible 1992, Gast & Diessel 2012). Different parameters operate on different levels (semantic: coreference vs. modification, syntactic: degree of sententiality, pragmatic: discourse prominence, etc.) and interact with each other. The terms “integration” and “autonomy” are the overarching labels that summarize the parameters at the two ends of the scale. Complete integration of one clause into another eventually leads to the absorption of one clause and to a monoclausal structure.
Complete autonomy means the juxtaposition of two clauses each having their own information structure, accompanied by the lack of actual cross-clause syntactic relations. These approaches fit the data more appropriately because they make it possible to model the oscillation of a given structure in relation to the two ends of a scale from a synchronic point of view as well as in its diachronic dimension. In addition, parametric approaches are able to capture certain features that seemingly unconnected constructions might have in common, such as the similarities between relative clauses and NP complementation. However, the description and analysis of clause linkage still shows quite a few blank areas from both an empirical and a theoretical point of view. The most challenging task is the identification of relevant parameters and the creation of a viable integrative model of these parameters. This holds in particular for non-standard varieties and varieties in earlier stages.
Zeitschrift für Slawistik, 2023
Linguistic expressions are very often indeterminate and have several possible readings. Syntactic indeterminacy may result from different underlying structures, from polysemous elements, or from so-called oscillation. Unlike the former two, oscillation cannot be reduced to one or more discrete interpretations in a given context; it is fundamentally non-resolvable. While oscillating structures do not hamper successful communication, they present a problem for linguistic analyses and categorizations. The relevant literature does not address oscillation systematically; if anything, it is treated in connection with problems of categorization.
In this paper, oscillation is perceived as a phenomenon sui generis. We first give a definition of oscillation against the backdrop of other indeterminate structures and outline their relevance for a proper understanding of older and less formal varieties. Then we propose a tentative typology of oscillating structures and their potential triggers in selected Slavic languages.
Zeitschrift für Slavische Philologie, 2023
Torlak varieties are spoken in a geographic area where the spread of Balkan Slavic features has shaped local, genealogically West South Slavic idioms in characteristic ways. As a result, they have been recognized by dialectologists as a distinct group of dialects. The formation of this dialect complex by the diffusion of Balkan Slavic features was facilitated by a particular configuration of political and social boundaries up to the end of the 19th century. More recently, socio-political events have been changing the region and, concomitantly, the interactional spaces and communicative habits of its residents, fostering and/or inhibiting social encounters and language contact. The most far-reaching changes have been the demarcation of political boundaries and the establishment of the Serbian and Bulgarian standard languages. Both developments contributed to slowing down and eventually reversing formerly convergent processes (see Sikimić et al., this volume).
Consequently, the varieties encountered in this region can be expected to be transitional along two dimensions from a contemporary perspective: horizontally, i.e., in areal respects, by variation in the manifestation of specific structures as ‘Balkan Slavic’ or ‘West South Slavic’, and vertically, i.e., register-based, in the manifestation as ‘dialectal’ or ‘Standard Serbian/Bulgarian’. Focusing on the Serbian part of the region, the present paper aims at assessing the position of the contemporary Timok variety along the areal/horizontal and register/vertical dimensions on the basis of four representative dialect features from nominal and verbal domains: marking of indirect object and possessor, post-positive demonstratives, particle usage of dative reflexive si and auxiliary omission in the perfect tense.
Each of these features can be realized in a ‘Balkan Slavic’ (i.e., dialectal / prototypically Timok) or ‘Serbian’ (i.e., standard Serbian) form. Measuring the usage frequencies of both realizations and their respective ratios reveals the overall degree of variation. Investigating the influence of specific linguistic factors on the respective options will demonstrate whether the choice of options is functionally conditioned, i.e., whether the distributions attest to formal and/or functional differentiations. Analyzing the effect of socio-geographic factors on the distribution of options for each feature gives insight into whether the ratio between one option and the other relates to the embedding of users in particular geographic and social contexts.
In a larger perspective, the specific case of Timok is representative of the more general challenge in dialectological and areal research: identifying and discriminating the internal and external conditions triggering variation and the features affected by these conditions. As under a magnifying glass, zooming in on this rather small region (in both socio-geographic and linguistic respects) offers insight into the intricate interaction of drivers of variation and eventual change.
The paper is structured as follows: Section 2 places our approach in the tradition of research on Torlak and introduces the corpus used for the present study. Section 3 identifies the usage frequencies of the diatopic and diastratic variants possible for the four morphosyntactic features under consideration and the potential linguistic conditions underlying their distribution, while Section 4 is concerned with the impact of extra-linguistic factors. The findings are discussed in Section 5. Section 6 provides a short conclusion.
Zeitschrift für Slavische Philologie, 2023
The Torlak dialect area remains underinvestigated: there exists no grammar, linguistic atlas, dictionary, or sociolinguistic or ethnological review of it. Being simultaneously a West South Slavic and Balkan Slavic dialect, variable in space and changing in time, divided by political borders and roofed by different standard languages, it is highly challenging for dialectological, sociolinguistic, computational and corpus linguistic enterprises to make a thorough investigation of this variety relevant also beyond the narrow scope of Balkan / South Slavic linguistics. This specific example of Torlak serves to illustrate the complexities (practical, linguistic, anthropological, and methodological) involved in compiling a dialectologically relevant (reflecting the “base” dialect and/or the regiolect on all levels of the language structure) and sociolinguistically informed (reflecting the linguistic variation within a certain space and covering this variation by including a well-balanced sample of all different kinds of speakers) language corpus (including data sampling and processing) and to outline the benefits gained from that enterprise (language documentation, linguistic and anthropological analysis, and development of tools).
Language Dynamics and Change, 2023
Migration events splitting speaker communities and establishing novel contact situations are among the major drivers of language variation and change. While the precise processes that lead to change cannot usually be determined for past events with any certainty, the study of minority and heritage language usage in apparent time may provide insight into the contribution of the linguistic behavior underlying the dynamics. We capitalize on this and compare parts of speech usage in Pear Story renarrations across Gheg Albanian speakers of three generations in German-speaking environments, applying methods from information theory. The results suggest that the changing conventions in parts of speech usage across generations and places of residence can be attributed to changing linguistic behavior within the speaker community in the migration setting. These findings highlight the impact of changing sociocultural embedding and the roles of vertical and horizontal transmission in language change.
Russian Linguistics
Russia's war against Ukraine threatens its people and its existence as an independent state. Russia's war threatens Ukraine's cultural heritage by denying its history and its language. Language is being used as a weapon. In the face of this aggression, a journal such as Russian Linguistics cannot remain silent. As linguists we are committed to the principles of science, based on empirical observations. For us as Slavic linguists, it is a truism to consider Russian as one among a wealth of Slavic languages and as a member of the East Slavic branch, together with Ukrainian and Belarusian. Recognizing this diversity is what the journal's subtitle stands for: International Journal for the Study of Russian and other Slavic Languages. As editors we want to do our part to promote knowledge about the Ukrainian language and Ukrainian linguistics. This issue is therefore devoted in its entirety to Ukrainian linguistics. We are very happy to gather 10 papers in this special issue which are devoted to a range of different aspects of Ukrainian linguistics, including grammatical constructions, corpus linguistics, historical linguistics, sociolinguistics and the use of language in propaganda. Some of the papers pick up on long-standing debates and summarise the state of the art, some offer cutting-edge results from ongoing projects. Olena Pchelintseva draws on extensive dictionary and survey data to discuss Ukrainian action nominals and the extent to which they are aspectually paired. She finds considerable evidence in both data types to support the claim that they are. Ljudmila Popović provides an analysis of the various functions of pluperfect (PQP) in Ukrainian. Among other things she shows that the PQP in modern Ukrainian still performs a taxis (temporal orientation) function.
However, this relative tense also serves to indicate the result, which was relevant in the past, but lost its significance at the time of speech, as well as functions of delimitation of the temporal zone marked by PQP from the moment of speech. The study also pays particular attention to the so-called counterfactual function of the Ukrainian PQP. Jan Fellerer discusses the retention of auxiliary clitics and frequency of subject pro-drop in South-Western Ukrainian and South-Eastern "Borderland" Polish, demonstrating a convergence of patterns due to long-standing bi-directional language contact. Zaidan Lahjouji-Seppälä, Achim Rabus and Ruprecht von Waldenfels use stylometric methods on extensive corpus data from the General Regionally Annotated Corpus of
The present article deals primarily with the supine in Slovene. On the one hand, the supine construction as such is presented; on the other, its treatment over time in grammars and textbooks is traced. This is done from two perspectives: through an analysis of Slovene grammar writing, and by checking the facts described in the grammars against the available corpora, which makes it possible to compare grammar writing with language use. Of particular interest are the definition of the verb of motion, government with the genitive and accusative cases, and the question of verbal aspect in the supine.
Uploads
Projects by Barbara Sonnenhauser
One case in point illustrating this complex situation is clausal complementation (CC), the topic of the proposed workshop. The linguistic complexity of South Slavic has been impeding studies covering the unity and diversity of the CC strategies encountered for these languages in their entirety. Moreover, CC itself is still an under-defined notion. Crucially, discussing CC in South Slavic from various points of view, applying different theoretical approaches and methods of investigation, as shall be done by the contributions to this workshop, necessitates a clear elaboration of the key notions used in the individual analyses.
Against this background, the intended workshop aims at gaining a detailed picture of the diverse patterns of CC in South Slavic by bringing together researchers who focus on different aspects of this topic. Striving to arrive at a clearer concept of CC, the workshop also contributes to linguistic concept formation, which will enhance further collaborative investigations. In addition, the workshop intends to explore the possibilities of creating a unified platform of corpora and similar kinds of data necessary for empirical research in this domain.
Papers by Barbara Sonnenhauser
In this exploratory study, we aim to operationalize the relation between word order and clause type, using Russian, Polish, and Slovene as representatives of the three major Slavic branches. On the assumption that clause-initial function words may indicate subordination, we analyze the distribution of the presumably unequivocal complementizers że (Pol.), čto (Ru.) and da (Slv.), and compare them with Pol. niech, Ru. pust’ and Slv. naj. These elements function as illocutionary force indicating devices for directive speech acts, but at the same time show complementizer-like properties when introducing a clause that follows a clause containing a complement-taking predicate. Using corpus data from two different diachronic stages, we try to establish diachronic and cross-linguistic patterns that provide information on possible links between word order and the ‘complementizerhood’ of a clause-initial element. Our findings reveal that the concepts of word order and connective, i.e., the very concepts that are often used for diagnosing subordination, seem to be ill-defined and need to be reconsidered on the basis of thorough empirical research.
Complete autonomy means the juxtaposition of two clauses each having their own information structure, accompanied by the lack of actual cross-clause syntactic relations. These approaches fit the data more appropriately because they make it possible to model the oscillation of a given structure in relation to two ends of a scale, from a synchronic point of view as well as in its diachronic dimension. In addition, parametric approaches are able to capture certain features that seemingly unconnected constructions might have in common, such as the similarities between relative clauses and NP complementation. However, the description and analysis of clause linkage still show quite a few blank areas from both an empirical and a theoretical point of view. The most challenging task is the identification of relevant parameters and the creation of a viable integrative model of these parameters. This holds in particular for non-standard varieties and varieties in earlier stages.
In this paper, oscillation is perceived as a phenomenon sui generis. We first give a definition of oscillation against the backdrop of other indeterminate structures and outline their relevance for a proper understanding of older and less formal varieties. Then we propose a tentative typology of oscillating structures and their potential triggers in selected Slavic languages.
Consequently, the varieties encountered in this region can be expected to be transitional along two dimensions from a contemporary perspective: horizontally, i.e., in areal respects, by variation in the manifestation of specific structures as ‘Balkan Slavic’ or ‘West South Slavic’, and vertically, i.e., register-based, in the manifestation as ‘dialectal’ or ‘Standard Serbian/Bulgarian’. Focusing on the Serbian part of the region, the present paper aims at assessing the position of the contemporary Timok variety along the areal/horizontal and register/vertical dimensions on the basis of four representative dialect features from nominal and verbal domains: marking of indirect object and possessor, post-positive demonstratives, particle usage of dative reflexive si and auxiliary omission in the perfect tense.
Each of these features can be realized in a ‘Balkan Slavic’ (i.e., dialectal / prototypically Timok) or ‘Serbian’ (i.e., standard Serbian) form. Measuring the usage frequencies of both realizations and their respective ratios reveals the overall degree of variation. Investigating the influence of specific linguistic factors on the respective options will demonstrate whether the choice of options is functionally conditioned; i.e., whether the distributions attest to formal and/or functional differentiations. Analyzing the effect of socio-geographic factors on the distribution of options for each feature gives insight into whether the ratio between the two options relates to the embedding of users in particular geographic and social contexts.
In a larger perspective, the specific case of Timok is representative of the more general challenge in dialectological and areal research: identifying and discriminating the internal and external conditions triggering variation and the features affected by these conditions. As under a magnifying glass, zooming in on this rather small region, in both socio-geographic and linguistic respects, offers insight into the intricate interaction of drivers of variation and eventual change.
The paper is structured as follows: Section 2 situates our approach within the tradition of research on Torlak and introduces the corpus used for the present study. The usage frequencies of the diatopic and diastratic variants possible for the four morphosyntactic features under consideration and the potential linguistic conditions underlying their distribution are identified in Section 3, while Section 4 is concerned with the impact of extra-linguistic factors. The findings are discussed in Section 5. Section 6 provides a short conclusion.
Analysing data from the 16th to the early 20th century, retrieved from the Corpus of older Slovene literature (IMP), the study focuses on the degree of clausal integration as a means to assess the syntactic status of naj.
Contrary to classical historical corpora, a truly diachronic corpus should provide insight into different snapshots in time in order to reflect synchronic variability and, in addition, mirror the dynamicity of linguistic development. To that end, the data included need to be as comparable as possible. This can be achieved by designing the corpus as a parallel corpus which aligns representative versions – in terms of time and (linguistic) origin – of one and the same text. Ideally suited for the project under discussion is Patriarch Euthymius’ Life of Petka Tărnovska (Paraskeva of Epibatai). Together with its vernacular versions contained in damaskini, it represents a case in point for the literary and linguistic changes and the abundance of textual versions occurring between the medieval and early modern period of Slavonic written language in the Balkans.
In order to ensure – as far as possible – reliability and validity of the data gathered in the corpus, the inclusion of philological expertise concerning the social, historical and communicative embedding of the selected texts is of utmost importance. The necessary combination of linguistic and philological know-how can be achieved by joining two principal ways of accessing historical data: by means of a linguistic corpus and by means of a philological edition.
This integrated approach constitutes the added value of the A-MESS project, which at the same time contributes to the development of the technical means necessary for the corpus-linguistic processing of lesser-studied languages.
Both aspects rest on the explicit or implicit conception of linguistic elements as signs. This semiotic view of language, which makes it possible to capture both the relation of language to the world and the connection between form and meaning, is the starting point of the talk. In a next step, it is outlined to what extent language can serve as a model of ‘the world’ on the one hand, and to which basic models the description and explanation of its functions and structures can be traced back on the other.
Finally, different modellings are presented for selected areas. These concern, among other things, the functions of language, the position of language within human cognition, language change, the rule-based combination of words into sentences, and the meaning of linguistic units. A brief excursion into natural language processing shows to what extent models in linguistics can be used not only descriptively or explanatorily but also predictively.
In conclusion, some problems connected with building and applying models in language description are discussed.
Against this background, the tautological relative clause wer kann, der kann (roughly: ‘those who can, do’) proves an interesting test case for both approaches: having a formal and functional equivalent in Slovene (kdor zna, zna) but not in Russian, it is problematic for both radical approaches. Instead, it emerges as a prime example of the interaction of semantics and pragmatics. The present paper aims at accounting for this interaction by drawing on three aspects: 1) the relative clause structure, 2) the specific kind of tautology at hand, 3) the semantics of the modal verbs involved (znati ‘know; can’ and können ‘can’).
Since with respect to the semantics of the modal, Slovene and German turn out to be more similar to each other than to other members of their language families (such as Russian and English), wer kann, der kann is revealing not only for the semantics-pragmatics interface, but also for questions concerning areal linguistics.
Against this background, the present study proposes a reconception of subjectivity within the framework of a triadic semiotics and non-Aristotelian logic, which can be applied to language and linguistic phenomena without facing the problems indicated. A central role is played by abductive reasoning and the indexical dimension of the sign. The explanatory potential of this philosophically, semiotically and linguistically grounded conception of subjectivity is illustrated by the definite article in Macedonian, the so-called renarrative in Bulgarian, and selected delocutive particles in Russian.
Thanks to Catharina Krebs-Garić (Vienna) for establishing the corpus and to the Austrian Science Fund FWF for funding (project On the emergence of narrativity in Early Neo-Balkan Slavic; grant number M 1536-G23).
Presented at the BSSC 2022 Conference in Columbus, Ohio, on April 8th 2022.
Presented at the conference "Historical Corpora and Variation" on 4th-5th of April 2019, at the University of Cagliari.