Word error rate
10 Followers
Recent papers in Word error rate
The estimation of initial language models for new applications of spoken dialogue systems without large taskspecific training corpora is becoming an increasingly important issue. This paper investigates two different approaches in which... more
Despite 35 years of R&D on the problem of Optical character Recognition (OCR), the technology is not yet mature enough for the Arabic font-written script compared with Latin-based ones. There is still a wide room for enhancements as per:... more
Individual optical character recognition (OCR) engines vary in the types of errors they commit in recognizing text, particularly poor quality text. By aligning the output of multiple OCR engines and taking advantage of the differences... more
As the use of Internet broadcasting (webcasting) increases, more webcasts will be archived and accessed numerous times retrospectively. One challenge in skimming and browsing through such archives is the lack of textual transcripts of the... more
Many groups have investigated the relationship of word error rate and perplexity of language models. This issue is of central interest because perplexity optimization can be done independent of a recognizer and in most cases it is... more
We would also like to thank the members of the Program Committee for completing their reviews promptly, and for providing useful feedback for deciding on the program and preparing the final versions of the papers. Thanks also to Marie... more
Speech is the most natural form of human communication and speech processing has been one of the most inspiring expanses of signal processing. Speech recognition is the process of automatically recognizing the spoken words of person based... more
In this paper, we investigate pilot-symbol-aided parameter estimation for orthogonal frequency division multiplexing (OFDM) systems. We first derive a minimum mean-square error (MMSE) pilot-symbol-aided parameter estimator. Then, we... more
The majority of state-of-the-art speech recognition systems make use of system combination. The combination approaches adopted have traditionally been tuned to minimising Word Error Rates (WERs). In recent years there has been growing... more
An MLP classifier outputs a posterior probability for each class. With noisy data, classification becomes less certain, and the entropy of the posteriors distribution tends to increase providing a measure of classification confidence.... more
— Speech is the most natural form of human communication and speech processing has been one of the most inspiring expanses of signal processing. Speech recognition is the process of automatically recognizing the spoken words of person... more
In this paper, we investigate pilot-symbol-aided parameter estimation for orthogonal frequency division multiplexing (OFDM) systems. We first derive a minimum mean-square error (MMSE) pilot-symbol-aided parameter estimator. Then, we... more
Building an automatic speech recognition (ASR) system from scratch requires a large amount of annotated speech data, which is difficult to collect in many languages. However, there are cases where the low-resource language shares a common... more
In addition to ordinary words and names, real text contains non-standard “words" (NSWs), including numbers, abbreviations, dates, currency amounts and acronyms. Typically, one cannot find NSWs in a dictionary, nor can one find their... more
Automatic segmentation of these audio streams according to speaker identities, environmental and channel conditions has be-come an important preprocessing step for speech recognition, speaker recognition, and audio data mining [7], [8],... more
We propose grapheme-based sub-word units for spoken term detection (STD). Compared to phones, graphemes have a number of potential advantages. For out-of-vocabulary search terms, phonebased approaches must generate a pronunciation using... more
Incorporating the concept of the syllable into speech recognition may improve recognition accuracy through the integration of information over syllable-length time spans. Evidence from psychoacoustics and phonology suggests that humans... more
This paper presents our work towards developing a new speech corpus for Modern Standard Arabic (MSA), which can be used for implementing and evaluating Arabic speaker-independent, large vocabulary, automatic, and continuous speech... more
In this paper we present techniques for building multi-domain and multi-lingual recognizers within a finite-state transducer (FST) framework. The flexibility of the FST approach is also demonstrated on the task of incorporating networks... more
In this paper we present a quantitative investigation into the impact of text normalization on lexica and language models for speech recognition in French. The text normalization process defines what is considered to be a word by the... more
In this paper, we present a novel approach for morphological decomposition in large vocabulary Arabic speech recognition. It achieved low out-of-vocabulary (OOV) rate as well as high recognition accuracy in a state-of-the-art Arabic... more
This paper describes first results of our DARPA-sponsored efforts toward recognizing and browsing foreign language, more specifically, Serbo-Croatian broadcast news. For Serbo-Croatian as well as many other than the most common well... more
Debates in the European Parliament are simultaneously translated into the official languages of the Union. These interpretations are broadcast live via satellite on separate audio channels. After several months, the parliamentary... more
Several real world applications of humanoids in general will require continuous service over a long time period. A humanoid robot operating in different environments over a long period of time means that A) there will be a lot of... more
This paper describes a machine translation system that offers many deaf and hearing-impaired people the chance to access pub-lished information in Arabic by translating text into their first language, Arabic Sign Lan-guage (ArSL). The... more
In this paper we investigate the integration of a confusion network into an on-line handwritten sentence recognition system. The word posterior probabilities from the confusion network are used as confidence scored to detect potential... more
The paper investigates the integration of Heteroscedastic Linear Discriminant Analysis (HLDA) into adaptively trained speech recognizers. Two different approaches are compared: the first is a variant of CMLLR-SAT, the second is based on... more
This paper presents a set of experiments used to develop a statistical system from translating speech to sign language for deaf people. This system is composed of an Automatic Speech Recognition (ASR) system, followed by a statistical... more
Sequence recognition performance is often summarised first in terms of the number of hits (H), substitutions (S), deletions (D) and insertions (I), and then as a single statistic by the "word error rate" WER = 100(S+D+I)/(H+S+D). While in... more
Speech comprises a variety of acoustical phenomena occurring at differing rates. Fixed-rate ASR systems assume in effect a constant temporal rate of information flow via incorporating uniform statistics in proportion to a sound's... more
Face-to-face meetings usually encompass several modalities including speech, gesture, handwriting, and person identification. Recognition and integration of each of these modalities is important to create an accurate record of a meeting.... more
In this paper we report on new developments in the automatic meeting transcription task. Unlike other types of speech (such as those found in Broadcast News and Switchboard), meetings are unique in their richer dynamics of human-to-human... more
In this study, we propose an algorithm for Arabic isolated digit recognition. The algorithm is based on extracting acoustical features from the speech signal and using them as input to multi-layer perceptrons neural networks. Each word in... more
A method to automatically annotate video items with se-mantic metadata is presented. The method has been devel-oped in the context of the Papyrus project to annotate docu-mentary-like broadcast videos with a set of relevant keywords using... more
In this paper we describe the English Conversational Telephone Speech (CTS) recognition system jointly developed by BBN and LIMSI under the DARPA EARS program for the 2004 evaluation conducted by NIST. The 2004 BBN/LIMSI system achieved a... more
Cet article décrit une méthode qui combine des hypothèses graphémiques et phonétiques au niveau de la phrase, à l’aide d’une réprésentation en automates à états finis et d’un modèle de langage, pour la réécriture de phrases tapées au... more
In ths paper, we present our research on dialog dependent language modeling. In accordance with a speech (or sentence) production model in a discourse we split language modeling into two components; namely, dialog dependent concept... more
In this paper, a task of human-machine interaction based on speech is presented. The specific task consists on the use and control of a set of home appliances through a turnbased dialogue system. This work focuses on the first part of the... more
In this paper, a novel speaker normalization method is presented and compared to a well known vocal tract length normaliza- tion method. With this method, acoustic observations of train- ing and testing speakers are mapped into a... more
This paper presents some recent improvements in automatic transcription of Italian broadcast news obtained at ITCirst.
In this paper we present a number of improvements that were recently made to the template based speech recognition system developed at ESAT. Combining these improvements resulted in a decrease in word error rate from 9.6% to 8.2% on the... more
The paper deals with the development of acoustic models of foreign words for a German speech recognizer. The recognition quality of foreign words is crucial for the overall performance of a system in application fields like spoken... more
In this paper, pronunciation variability between native and non-native speakers is investigated, and a novel acoustic model adaptation method is proposed based on pronunciation variability analysis in order to improve the performance of a... more