Papers by Gloria Corpas Pastor
John Benjamins Publishing Company eBooks, Dec 8, 2021
Names: A Journal of Onomastics, Jun 8, 2023
Lecture Notes in Computer Science, 2022
Sistemas fraseológicos en contraste: enfoques computacionales y de corpus, Oct 4, 2021
This is an accepted manuscript of a chapter published by Editorial Comares in Sistemas fraseológicos en contraste: Enfoques computacionales y de corpus, available online: https://www.comares.com/libro/sistemas-fraseologicos-en-contraste_130637/ The accepted version of the publication may differ from the final published version. For re-use please see publisher's terms and conditions. Today, automatic speech recognition is beginning to emerge strongly in the field of interpreting. Recent studies point to this technology as one of the main documentation resources for interpreters, among other possible uses. In this paper we present a novel documentation methodology that involves semi-automatic compilation of comparable corpora (transcriptions of oral speeches) and automatic corpus compilation of written documents on the same topic with a view to preparing an interpreting assignment. As background, we provide a brief overview of the use of automatic speech recognition in the context of interpreting technologies. Next, we detail the protocol for designing and compiling the comparable corpora that we exploit for analysis. In the last part of the paper, we cover phraseology extraction and study some collocational patterns in both corpora. Mastering the phraseology specific to the subject matter of the assignment is one of the main stumbling blocks that interpreters face in their daily work. Our ultimate aim is to establish whether oral corpora could be of further benefit to the interpreter in the preliminary preparation phase.
Remote interpretation technology is developing extremely fast, enabling affordable and instant access to interpreting services worldwide. This paper focuses on the subjective perceptions of public service interpreters about the psychological and physical impact of using remote interpreting, and the effects on their own performance. To this end, a survey study has been conducted by means of an on-line questionnaire. Both structured and unstructured questions have been used to tap into interpreters' views on technology, elicit information about perceived effects, and identify pitfalls and prospects.
Interpreting in a Changing World: New Scenarios, Technologies, Training Challenges and Vulnerable Groups / La Interpretación Un Mundo Cambiante: Nuevos Escenarios, Tecnologías, Retos Formativos Y Grupos Vulnerables, Feb 27, 2020
This chapter deals with the use of distance interpreting technologies and their impact on public services interpreters. Remote (or distance) interpreting offers a wide range of solutions in order to successfully satisfy the pressing need for language services in both the public and private sectors. This study focuses on telephone-mediated and video-mediated interpreting, presenting their advantages and disadvantages. We have designed a survey to gather data about the psychological and physiological impact that remote interpreting technologies have on community interpreters. Our main aim is to ascertain interpreters' general view on technology, so as to detect deficiencies and suggest ways of improvement. This study is a first contribution in the direction of optimising distance interpreting technologies. Current demand reveals the enormous potential of distance interpreting, its rapid evolution and the widespread presence that this modality will have in the future. Keywords: distance interpreting, remote interpreting, interpreting technologies, psychological impact, interpreting at public services, video-mediated interpreting. 1 This work was carried out within the framework of the project VI-Sistema integrado voz-texto para intérpretes (ref. no. FFI2016-75831-P, MINECO).
Hermēneus. Revista de traducción e interpretación, 2022
Interpreting technologies have abruptly entered the profession in recent years. However, technology still remains a relatively marginal topic of academic debate, although interest in developing tailor-made solutions for interpreters has risen sharply. This paper presents the VIP system, one of the research outputs of the homonymous project VIP - Voice-text Integrated system for interPreters, and its continuation (VIP II). More specifically, a technology-based terminology workflow for simultaneous interpretation is presented.
Studies on Multilingual Lexicography, 2019
There is currently a pressing need to develop specific applications for translators as final users, with the purpose of fulfilling their particular professional requirements. Corpora and advanced lookup options bring benefits to translation and open up a wealth of opportunities in research. This paper presents Inteliterm, an innovative and integrated tool which combines corpus management tools with different types of searches and customisation options in order to enhance translation results and minimise translators' efforts when searching for terminology. Section 1 provides a brief glimpse of the project rationale. Section 2 delves into the comprehension assistants developed in the 1990s as a first step towards present-day intelligent dictionaries. Section 3 provides the theoretical foundation of the novel tool Inteliterm. Besides its proactive translation support functions, this web application also provides a TBX (TermBase eXchange) termbase editor, which allows users to create, edit or upload terminological databases in the standard TBX format (ISO 30042:2008) and to query their own databases using the Inteliterm tool. Users' responses and assessment of the tool are provided in Section 4. Finally, Section 5 includes some concluding remarks with suggestions for further improvements.
Studies on Multilingual Lexicography, 2019
©De Gruyter, 2020. Published in Studies on Multilingual Lexicography, edited by María José Domínguez Vázquez, Mónica Mirazo Balsa and Carlos Valcárcel Riveiro, available online at: https://doi.org/10.1515/9783110607659 For information on re-use, please refer to the publisher’s terms and conditions.
But now the genuine translator, who wants to bring these two quite separate persons, his writer and his reader, truly together, and to help the latter, without forcing him out of the sphere of his mother tongue, to as correct and complete an understanding and enjoyment of the former as possible: what paths can he take to this end? In my opinion there are only two. Either the translator leaves the writer in peace as much as possible and moves the reader toward him; or he leaves the reader in peace as much as possible and moves the writer toward him. (Schleiermacher, 2000: 49)
Computational and Corpus-Based Phraseology, 2019
Corpus-based Translation Studies have promoted research on the features of translated language, by focusing on the process and product of translation, from a descriptive perspective. Some of these features have been proposed by Toury (1995/2012) under the term laws of translation, namely the law of growing standardisation and the law of interference. The law of standardisation appears to be particularly at play in diatopy, and more specifically in the case of transnational languages (e.g. English, Spanish, French, German). In fact, some studies have revealed the tendency to standardise the diatopic varieties of Spanish in translated language (Corpas Pastor, 2015a, 2015b, 2017, 2018). This paper focuses on verb + noun (object) collocations in Spanish translations of The Picture of Dorian Gray by Oscar Wilde. Two different varieties have been chosen (Peninsular and Colombian Spanish). Our main aim is to establish whether the Colombian Spanish translation actually matches the variety spoken in Colombia or is closer to general or standard Spanish. For this purpose, the techniques used to translate this type of collocation in both Spanish translations will be analysed. Furthermore, the diatopic distribution of these collocations will be studied by means of large corpora.
New Perspectives on Corpus Translation Studies, 2021
This study explores the impact of register on the properties of translations. We compare sources, translations and non-translated reference texts to describe the linguistic specificity of translations common and unique between four registers. Our approach includes bottom-up identification of translationese effects that can be used to define translations in relation to contrastive properties of each register. The analysis is based on an extended set of frequency features that reflect morphological, syntactic, and text-level characteristics of translations. We also experiment with lexis-based features from n-gram language models estimated on large bodies of originally authored texts from the included registers. Our parallel corpora are built from published English-to-Russian professional translations of general domain mass-media texts, popular scientific books, fiction, and analytical texts on political and economic news. The number of observations and the data sizes for parallel and reference components are comparable within each register and range from 166 (fiction) to 525 (media) text pairs; from 300 K to 1 M tokens. Methodologically, the research relies on a series of supervised and unsupervised machine learning techniques, including those that facilitate visual data exploration. We learn a number of text classification models and study their performance to assess our hypotheses. Further on, we analyse the usefulness of the features for these classifications to detect the best translationese indicators in each register. The multivariate analysis via text classification is complemented by univariate statistical analysis which helps to explain the observed deviation of translated registers through a number of translationese effects and detect the features that contribute to them.
Our results demonstrate that each register generates a unique form of translationese that can be only partially explained by cross-linguistic factors. Translated registers differ in the amount and type of prevalent translationese. The same translationese tendencies in different registers are manifested through different features. In particular, the notorious shining-through effect is more noticeable in general media texts and news commentary and is less prominent in fiction.
TRANS. Revista de Traductología, 2020
Although interpreting has not yet benefited from technology as much as its sister field, translation, interest in developing tailor-made solutions for interpreters has risen sharply in recent years. In particular, Automatic Speech Recognition (ASR) is being used as a central component of Computer-Assisted Interpreting (CAI) tools, either bundled or standalone. This study pursues three main aims: (i) to establish the most suitable ASR application for building ad hoc corpora by comparing several ASR tools and assessing their performance; (ii) to use ASR in order to extract terminology from the transcriptions obtained from video-recorded speeches, in this case talks on climate change and adaptation; and (iii) to promote the adoption of ASR as a new documentation tool among interpreters. To the best of our knowledge, this is one of the first studies to explore the possibility of Speech-to-Text (S2T) technology for meeting the preparatory needs of interpreters as regards terminology and ...
The Oxford Handbook of Computational Linguistics 2nd edition, 2015
In today’s market, the use of technology by translators is no longer a luxury but a necessity if they are to meet rising market demands for the quick delivery of high-quality texts in many languages. This chapter describes a selection of computer-aided translation tools, resources, and applications, most commonly employed by translators to help them increase productivity while maintaining high quality in their work. This chapter also considers some of the ways in which translation technology has influenced the practice and the product of translation, as well as translators’ professional competence and their preferences with regard to tools and resources.
The Oxford Handbook of Computational Linguistics, Second edition (ed. R. Mitkov). Oxford: Oxford University Press, 2022
Computational and Corpus-Based Phraseology: Fourth International Conference, Europhras 2022, Malaga, Spain, September 28-30, 2022, Proceedings (eds. G. Corpas Pastor and R. Mitkov). Springer, 2022
Mediated Discourse at the European Parliament. “Translation and Multilingual Natural Language Processing” Series (eds. S. Bernardini et al.). Berlin: Language Science Press (ISBN 978-3-96110-393-5), 2022
This chapter aims at presenting an NLP-enhanced corpus-based analysis of the translation and interpreting shifts observed in the named entities (NEs) of PETI-MOD, an English<>Spanish intermodal corpus of written and oral mediated texts from the Committee on Petitions of the European Parliament. Our main assumption is that shifts in institutional genres mostly occur in the transfer of NEs, and that NLP techniques such as automatic Named Entity Recognition (NER) can be applied to systematically extract and compare examples of these shifts, leading to the (possible) verification of translational and/or interpretational constraints. Results show that traits like normalisation, transformation and simplification depend not only on the language direction or the mediation mode, but also on the semantic category (person, organisation, etc.) of the NE involved. Further studies are needed in order to correlate observed shifts with different NE taxonomies.
Corpora in Translation and Contrastive Research in the Digital Age: Recent advances and explorations, 2021
This chapter provides a brief outline of language technologies applied to translation and interpreting with a view to identifying challenges and research opportunities. Section 1 covers new trends within the automation of processes, the integration of language technologies in translators’ and interpreters’ workflows and industry demands. Section 2 moves on to other relevant technology-based resources for oral and written mediation, including new types of corpora for interpreting, computer-assisted interpreting tools (CAI) or cloud interpreting, among others. Section 3 delves into the concept of ‘tech-savviness’ and adoption of technology among language service providers (LSPs). Final thoughts are presented as a conclusion in Section 4.
The 31 full papers presented in this book were carefully reviewed and selected from 116 submissions. The papers in this volume cover a number of topics including general corpus-based approaches to phraseology, phraseology in translation and cross-linguistic studies, phraseology in language teaching and learning, phraseology in specialized languages, phraseology in lexicography, cognitive approaches to phraseology, the computational treatment of multiword expressions, and the development, annotation, and exploitation of corpora for phraseological studies.
This volume provides a general overview of the field with particular reference to Machine Translation and Translation Technology and focuses on languages such as English, Basque, French, Romanian, German, Dutch and Croatian, among others. The chapters of the volume illustrate a variety of topics that address this challenge, such as the use of rule-based approaches, compound splitting techniques, MWU identification methodologies in multilingual applications, and MWU alignment issues.
Gloria Corpas Pastor (Universidad de Málaga), Ruslan Mitkov (University of Wolverhampton), Johanna Monti (Università degli Studi di Sassari), and Violeta Seretan (Université de Genève). It received the support of the Advisory Board, composed of Dmitrij O. Dobrovol'skij (Russian Academy of Sciences, Moscow), Kathrin Steyer (Institut für Deutsche Sprache, Mannheim), Agata Savary (Université François Rabelais Tours), Michael Rosner (University of Malta), and Carlos Ramisch (Aix-Marseille Université).
The topic of the workshop was the integration of multi-word units in machine translation and translation technology tools. In spite of the recent progress achieved in machine translation and translation technology, the identification, interpretation and translation of multi-word units still represent open challenges, both from a theoretical and from a practical point of view. The idiosyncratic morpho-syntactic, semantic and translational properties of multi-word units pose many obstacles even to human translators, mainly because of intrinsic ambiguities, structural and lexical asymmetries between languages, and, finally, cultural differences. After a successful first edition held in Nice on 3 September 2013 as part of the Machine Translation Summit XIV, the present edition provided a forum for researchers working in the fields of Linguistics, Computational Linguistics, Translation Studies and Computational Phraseology to discuss recent advances in the area of multi-word unit processing and to coordinate research efforts across disciplines.
Keywords: Postcolonial literature; Taxonomy; Literary theory; Postcolonial criticism; Plasticity.
Although interpreting has not benefited from technological developments to the same extent as translation, we are currently witnessing a surge of interest in developing solutions tailored to interpreters' needs. In particular, Automatic Speech Recognition (ASR) is beginning to be used as part of computer-assisted interpreting tools, either as a component of such systems or as a standalone application. This study pursues three main aims: (i) to determine the most suitable automatic transcription tool for compiling ad hoc corpora, by comparing several automatic transcription systems and evaluating their performance; (ii) to use ASR to extract terminology from transcriptions of video-recorded speeches; and (iii) to promote the use of ASR as a new documentation tool in interpreting. This is one of the first studies to address the possibilities that speech-to-text technology offers for meeting interpreters' terminological and documentation needs in the preparation phase of a given assignment.
KEYWORDS: automatic transcription, computer-assisted interpreting tools, terminology extraction, ad hoc corpus, interpreting technologies.
Speech-to-Text Technology as a Documentation Tool for Interpreters: a new approach to compiling an ad hoc corpus and extracting terminology from video-recorded speeches
La tecnología habla-texto como herramienta de documentación para intérpretes: Nuevo método para compilar un corpus ad hoc y extraer terminología a partir de discursos orales en vídeo
MISCELÁNEA 263-281 TRANS.
these circumstances, literature can be transformative and the role of translation as a decolonising tool can help to create unbiased knowledge through an intentionally objective and unprejudiced interpretation of the original texts. We will analyse how those differentiating elements affect the translational process.
Drawing on poststructuralist interpretations of space and partially following Bhabha's articulation of the third space, in this article we coin the term fourth space, using this concept as a heuristic tool that addresses the need to establish a stance consistent with the analysis of the reception of postcolonial literature in a society that lacks an immediate relationship with the specific decolonisation process of the author's country. We explore the concept through the reception of African postcolonial literature in Spain. In this country, this perspective still has many undeveloped strands at a time when the representation of hybridisation is at a vital moment, since representation provides the social scaffolding for the construction of individual identity. In this context, literature becomes transformative, and the role of translation as a decolonising tool can help to create unbiased knowledge through an interpretation of the original text that is intended to be objective and free of prejudice. We will analyse how these differentiating elements affect the translational process.