The estimation of initial language models for new applications of spoken dialogue systems without large task-specific training corpora is becoming an increasingly important issue. This paper investigates two different approaches in which...
      Educational Technology, Speech Recognition, Linguistics, Stochastic processes
Despite 35 years of R&D on the problem of Optical Character Recognition (OCR), the technology is not yet mature enough for the Arabic font-written script compared with Latin-based ones. There is still wide room for enhancements as per:...
      Engineering, Pattern Recognition, Automatic Speech Recognition, Feature Extraction
Individual optical character recognition (OCR) engines vary in the types of errors they commit in recognizing text, particularly poor-quality text. By aligning the output of multiple OCR engines and taking advantage of the differences...
      State Space, Bit Error Rate, Optical Character Recognition, Lower Bound
As the use of Internet broadcasting (webcasting) increases, more webcasts will be archived and accessed numerous times retrospectively. One challenge in skimming and browsing through such archives is the lack of textual transcripts of the...
      Computer Science, Automatic Speech Recognition, Word error rate
Many groups have investigated the relationship between word error rate and perplexity of language models. This issue is of central interest because perplexity optimization can be done independently of a recognizer and in most cases it is...
      Cognitive Science, Linguistics, Speech Communication, Close relationships
We would also like to thank the members of the Program Committee for completing their reviews promptly, and for providing useful feedback for deciding on the program and preparing the final versions of the papers. Thanks also to Marie...
      Machine Translation, Bit Error Rate, Word error rate
Speech is the most natural form of human communication, and speech processing has been one of the most inspiring areas of signal processing. Speech recognition is the process of automatically recognizing the spoken words of a person based...
      Artificial Intelligence, Speech Recognition, MFCC, Feature Extraction
In this paper, we investigate pilot-symbol-aided parameter estimation for orthogonal frequency division multiplexing (OFDM) systems. We first derive a minimum mean-square error (MMSE) pilot-symbol-aided parameter estimator. Then, we...
      Engineering, Technology, Wireless Systems, Parameter estimation
      Information Retrieval, Information Theory, Proper Names, Natural language
The majority of state-of-the-art speech recognition systems make use of system combination. The combination approaches adopted have traditionally been tuned to minimising Word Error Rates (WERs). In recent years there has been growing...
      Computer Science, English language, Machine Translation, Speech Acoustics
      Computer Science, Speech Recognition, Parameter estimation, Training data
An MLP classifier outputs a posterior probability for each class. With noisy data, classification becomes less certain, and the entropy of the posterior distribution tends to increase, providing a measure of classification confidence....
      Posterior distribution, Confusion Matrix, Noisy Data, Word error rate
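The entropy-as-confidence idea in the abstract above can be illustrated with a minimal sketch. The abstract is truncated, so this is not the authors' method; the function name `posterior_entropy` and the example posterior vectors are hypothetical, chosen only to show that a flatter (less certain) posterior has higher Shannon entropy.

```python
import math


def posterior_entropy(posteriors):
    """Shannon entropy in bits of a discrete posterior distribution.

    `posteriors` is a sequence of non-negative class scores; it is
    renormalised here so that small rounding errors do not matter.
    """
    total = sum(posteriors)
    if total <= 0:
        raise ValueError("posteriors must contain a positive value")
    # Terms with p == 0 contribute nothing (limit of p*log p is 0).
    return -sum((p / total) * math.log2(p / total)
                for p in posteriors if p > 0)


# Hypothetical posteriors: a confident frame and a noisier one.
confident = [0.90, 0.05, 0.03, 0.02]
noisy = [0.40, 0.30, 0.20, 0.10]

# Lower entropy suggests higher classification confidence.
print(posterior_entropy(confident) < posterior_entropy(noisy))
```

For a four-class problem the value ranges from 0 bits (one class has all the mass) to 2 bits (uniform posterior).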
      Computer Science, Automatic Speech Recognition, Speech Recognition, Linguistics
Building an automatic speech recognition (ASR) system from scratch requires a large amount of annotated speech data, which is difficult to collect in many languages. However, there are cases where the low-resource language shares a common...
      Machine Learning, Acoustic Modelling, Sanskrit, Automatic Speech Recognition
In addition to ordinary words and names, real text contains non-standard “words” (NSWs), including numbers, abbreviations, dates, currency amounts and acronyms. Typically, one cannot find NSWs in a dictionary, nor can one find their...
      Real Estate, Cognitive Science, Computer Science, Speech Recognition
Automatic segmentation of these audio streams according to speaker identities, environmental and channel conditions has become an important preprocessing step for speech recognition, speaker recognition, and audio data mining [7], [8],...
      Data Mining, Speaker Recognition, Audio Signal Processing, Speech Recognition
We propose grapheme-based sub-word units for spoken term detection (STD). Compared to phones, graphemes have a number of potential advantages. For out-of-vocabulary search terms, phone-based approaches must generate a pronunciation using...
      Statistical Analysis, Speech Acoustics, Speech Processing, Spoken Term Detection
Incorporating the concept of the syllable into speech recognition may improve recognition accuracy through the integration of information over syllable-length time spans. Evidence from psychoacoustics and phonology suggests that humans...
      Psychology, Computer Science, Psychoacoustics, Psycholinguistics
This paper presents our work towards developing a new speech corpus for Modern Standard Arabic (MSA), which can be used for implementing and evaluating Arabic speaker-independent, large vocabulary, automatic, and continuous speech...
      Applied Mathematics, Computer Science, Computational Linguistics, Automatic Speech Recognition
In this paper we present techniques for building multi-domain and multi-lingual recognizers within a finite-state transducer (FST) framework. The flexibility of the FST approach is also demonstrated on the task of incorporating networks...
      Network Model, Word error rate
In this paper we present a quantitative investigation into the impact of text normalization on lexica and language models for speech recognition in French. The text normalization process defines what is considered to be a word by the...
      Training, Speech Recognition, Language Model, Word error rate
In this paper, we present a novel approach for morphological decomposition in large vocabulary Arabic speech recognition. It achieved a low out-of-vocabulary (OOV) rate as well as high recognition accuracy in a state-of-the-art Arabic...
      Computer Science, Acoustics, Data Mining, Automatic Speech Recognition
      Computer Science, Speech Recognition, Entropy, Hidden Markov Models
This paper describes the first results of our DARPA-sponsored efforts toward recognizing and browsing foreign-language (more specifically, Serbo-Croatian) broadcast news. For Serbo-Croatian as well as many other than the most common well...
      Foreign Language, Speech Segmentation, Data Collection, Morphological Variation
Debates in the European Parliament are simultaneously translated into the official languages of the Union. These interpretations are broadcast live via satellite on separate audio channels. After several months, the parliamentary...
      Computer Science, Speech Recognition, European Parliament, Word error rate
Several real-world applications of humanoids in general will require continuous service over a long time period. A humanoid robot operating in different environments over a long period of time means that A) there will be a lot of...
      Computer Science, Speech Recognition, Social Integration, Bit Error Rate
This paper describes a machine translation system that offers many deaf and hearing-impaired people the chance to access published information in Arabic by translating text into their first language, Arabic Sign Language (ArSL). The...
      Machine Translation, Bit Error Rate, Word error rate
In this paper we investigate the integration of a confusion network into an on-line handwritten sentence recognition system. The word posterior probabilities from the confusion network are used as confidence scores to detect potential...
      Graph Theory, Support Vector Machines, Probability, Image recognition
      Iterative Methods, Automatic Speech Recognition, Speech Recognition, Speech
The paper investigates the integration of Heteroscedastic Linear Discriminant Analysis (HLDA) into adaptively trained speech recognizers. Two different approaches are compared: the first is a variant of CMLLR-SAT, the second is based on...
      Speaker Recognition, Speech Recognition, Broadcasting, Linear Discriminant Analysis
This paper presents a set of experiments used to develop a statistical system for translating speech to sign language for deaf people. This system is composed of an Automatic Speech Recognition (ASR) system, followed by a statistical...
      Computer Science, Sign Language, Automatic Speech Recognition, Statistical Machine Translation
Sequence recognition performance is often summarised first in terms of the number of hits (H), substitutions (S), deletions (D) and insertions (I), and then as a single statistic by the "word error rate" WER = 100(S+D+I)/(H+S+D). While in...
      Mathematics, Computer Science, Speech, Mutual Information
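The WER formula quoted in the abstract above, WER = 100(S+D+I)/(H+S+D), can be worked through with a minimal sketch. It assumes the counts come from a standard minimum-edit-distance (Levenshtein) word alignment with unit costs, which is a common convention but is not stated in the truncated abstract; the names `align_counts` and `wer` are hypothetical.

```python
def align_counts(ref, hyp):
    """Return (H, S, D, I) for one minimum-edit-distance alignment.

    `ref` and `hyp` are lists of words. Each table cell stores
    (edits, hits, substitutions, deletions, insertions) for the best
    alignment of ref[:i] with hyp[:j]. When several alignments tie,
    the split between S, D and I may differ, but S+D+I does not.
    """
    rows, cols = len(ref) + 1, len(hyp) + 1
    table = [[None] * cols for _ in range(rows)]
    table[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, rows):
        table[i][0] = (i, 0, 0, i, 0)      # only deletions
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, 0, j)      # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            e, h, s, d, n = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, h + 1, s, d, n)          # hit
            else:
                diag = (e + 1, h, s + 1, d, n)      # substitution
            e, h, s, d, n = table[i - 1][j]
            deletion = (e + 1, h, s, d + 1, n)
            e, h, s, d, n = table[i][j - 1]
            insertion = (e + 1, h, s, d, n + 1)
            table[i][j] = min(diag, deletion, insertion,
                              key=lambda cell: cell[0])
    return table[-1][-1][1:]


def wer(ref, hyp):
    """Word error rate in percent: 100 * (S + D + I) / (H + S + D)."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    h, s, d, i = align_counts(ref, hyp)
    return 100.0 * (s + d + i) / (h + s + d)


reference = "the cat sat on the mat".split()
hypothesis = "the cat sit on mat today".split()
print(wer(reference, hypothesis))
```

Note that H+S+D always equals the number of reference words, so insertions can push WER above 100.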
Speech comprises a variety of acoustical phenomena occurring at differing rates. Fixed-rate ASR systems assume in effect a constant temporal rate of information flow via incorporating uniform statistics in proportion to a sound's...
      Computer Science, Speech Recognition, Speech Processing, Time-Frequency Analysis
Face-to-face meetings usually encompass several modalities including speech, gesture, handwriting, and person identification. Recognition and integration of each of these modalities is important to create an accurate record of a meeting....
      Computer Science, Speech Recognition, Proceedings, Face to Face
In this paper we report on new developments in the automatic meeting transcription task. Unlike other types of speech (such as those found in Broadcast News and Switchboard), meetings are unique in their richer dynamics of human-to-human...
      Computer Science, Proper Names, Human Interaction, Language Model
In this study, we propose an algorithm for Arabic isolated digit recognition. The algorithm is based on extracting acoustical features from the speech signal and using them as input to multi-layer perceptron neural networks. Each word in...
      Computer Science, Speech Recognition, Neural Network, Word Recognition
A method to automatically annotate video items with semantic metadata is presented. The method has been developed in the context of the Papyrus project to annotate documentary-like broadcast videos with a set of relevant keywords using...
      Computer Science, Electronic publishing, Semantics, Audio Signal Processing
In this paper we describe the English Conversational Telephone Speech (CTS) recognition system jointly developed by BBN and LIMSI under the DARPA EARS program for the 2004 evaluation conducted by NIST. The 2004 BBN/LIMSI system achieved a...
      Computer Science, System Architecture, Real Time, Word error rate
This article describes a method that combines graphemic and phonetic hypotheses at the sentence level, using a finite-state automaton representation and a language model, for rewriting sentences typed on the...
      Computer Science, Information Retrieval, Art, User needs
In this paper, we present our research on dialog-dependent language modeling. In accordance with a speech (or sentence) production model in a discourse, we split language modeling into two components; namely, dialog-dependent concept...
      Production, Speech Recognition, Statistical Modeling, Probability
In this paper, a task of human-machine interaction based on speech is presented. The specific task consists of the use and control of a set of home appliances through a turn-based dialogue system. This work focuses on the first part of the...
      Multimedia, Automatic Speech Recognition, Home automation, Dialogue System
      Speech Recognition, Control Systems, Speech Processing, Signal Analysis
In this paper, a novel speaker normalization method is presented and compared to a well-known vocal tract length normalization method. With this method, acoustic observations of training and testing speakers are mapped into a...
      hidden Markov model, Vocal Tract Length, Affine Transformation, Word error rate
This paper presents some recent improvements in automatic transcription of Italian broadcast news obtained at ITC-irst.
      Speech Segmentation, System performance, Broadcast news, Acoustic Modeling
In this paper we present a number of improvements that were recently made to the template-based speech recognition system developed at ESAT. Combining these improvements resulted in a decrease in word error rate from 9.6% to 8.2% on the...
      Speech Acoustics, Automatic Speech Recognition, Speech Recognition, Databases
The paper deals with the development of acoustic models of foreign words for a German speech recognizer. The recognition quality of foreign words is crucial for the overall performance of a system in application fields like spoken...
      Computer Science, Speech Recognition, Proper Names, German
In this paper, pronunciation variability between native and non-native speakers is investigated, and a novel acoustic model adaptation method is proposed based on pronunciation variability analysis in order to improve the performance of a...
      Cognitive Science, Computer Science, Phonetics, English
      Computer Science, Artificial Intelligence, Speech Recognition, Vietnam