1998
In this paper the Philips Broadcast News transcription system is described. The Broadcast News task aims at the recognition of "found" speech in radio and television broadcasts without any additional side information, e.g. speaking style or background conditions. The system was derived from the Philips continuous mixture density crossword HMM system, using MFCC features and Laplacian densities. A segmentation was performed to obtain sentence-like partitions of the broadcasts. Using data-driven clustering, the obtained segments were grouped into clusters with similar acoustic conditions for adaptation purposes. Gender-independent word-internal and crossword triphone models were trained on 70 hours of the HUB4 training data. No focus-condition-specific training was applied. Channel and speaker normalization was done by mean and variance normalization as well as VTN and MLLR. The transcription was produced by an adaptive multiple-pass decoder starting with phrase-bigram decoding using word-internal triphones and finishing with a phrase-trigram decoding using MLLR-adapted crossword models.
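The mean and variance normalization mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the normalization is computed per segment over a list of feature frames; the function name `mean_variance_normalize` and the variance floor `eps` are hypothetical and not taken from the Philips system, whose exact procedure the abstract does not specify.

```python
import math


def mean_variance_normalize(frames, eps=1e-8):
    """Scale each feature dimension to zero mean and unit variance.

    frames: list of equal-length lists of floats, one list per frame
            (e.g. the MFCC vectors of one segment or cluster).
    eps:    assumed floor; dimensions with variance at or below it are
            only mean-centred, to avoid dividing by zero.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Per-dimension statistics over all frames of the segment.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dim)
    ]
    stds = [math.sqrt(v) if v > eps else 1.0 for v in variances]
    return [[(f[d] - means[d]) / stds[d] for d in range(dim)] for f in frames]


if __name__ == "__main__":
    segment = [[1.0, 10.0], [3.0, 10.0]]
    print(mean_variance_normalize(segment))
```

Computing the statistics per segment or per cluster, rather than over the whole broadcast, is one plausible way to compensate for changing channel conditions; which granularity the described system used is not stated in the abstract.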
1998
Radio and television broadcasts consist of a continuous stream of data comprised of segments of different linguistic and acoustic natures, which poses challenges for transcription. In this paper we report on our recent work in transcribing broadcast news data, including the problem of partitioning the data into homogeneous segments prior to word recognition. Gaussian mixture models are used to identify speech and non-speech segments. A maximum-likelihood segmentation/clustering process is then applied to the speech segments using GMMs and an agglomerative clustering algorithm. The clustered segments are then labeled according to bandwidth and gender. The recognizer is a continuous mixture density, tied-state cross-word context-dependent HMM system with a 65k trigram language model. Decoding is carried out in three passes, with a final pass incorporating cluster-based test-set MLLR adaptation. The overall word transcription error on the Nov'97 unpartitioned evaluation test data was 18.5%.
1998
This paper describes extensions and improvements to IBM's large vocabulary continuous speech recognition (LVCSR) system for transcription of broadcast news. The recognizer uses an additional 35 hours of training data over the one used in the 1996 Hub4 evaluation [?]. It includes a number of new features: optimal feature space for acoustic modeling (in training and/or testing), filler-word modeling, Bayesian Information Criterion (BIC) based segment clustering, an improved implementation of iterative MLLR and 4-gram language models. Results using the 1996 DARPA Hub4 evaluation data set are presented.
1998
This paper describes IBM's large vocabulary continuous speech recognition (LVCSR) system used in the 1997 Hub4 English evaluation. It focusses on extensions and improvements to the system used in the 1996 evaluation. The recognizer uses an additional 35 hours of training data over the one used in the 1996 Hub4 evaluation. It includes a number of new features: optimal feature space for acoustic modeling (in training and/or testing), filler-word modeling, Bayesian Information Criterion (BIC) based segmentation and segment clustering, an improved implementation of iterative MLLR, variance adaptation, and 4-gram language models. Results using the 1996 and 1997 DARPA Hub4 evaluation data sets are presented.
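As a rough illustration of a BIC-based decision between two segments, the sketch below uses a common single-Gaussian formulation of the criterion, simplified to one-dimensional features. It is an assumption for illustration, not the implementation described in the abstract; the names `delta_bic` and `penalty_weight` and the variance floor are hypothetical.

```python
import math


def _log_var(xs, floor=1e-6):
    """Log of the maximum-likelihood variance, floored for stability."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return math.log(max(var, floor))


def delta_bic(seg_a, seg_b, penalty_weight=1.0):
    """BIC difference between modelling two segments separately or jointly.

    Each segment is a non-empty list of scalar feature values, modelled
    by a single 1-D Gaussian. A positive result favours two separate
    models (a change point, or "do not merge"); a negative result
    favours one shared model ("merge").
    """
    n_a, n_b = len(seg_a), len(seg_b)
    n = n_a + n_b
    merged = list(seg_a) + list(seg_b)
    # Log-likelihood gain from using two Gaussians instead of one.
    gain = 0.5 * (
        n * _log_var(merged) - n_a * _log_var(seg_a) - n_b * _log_var(seg_b)
    )
    # The extra Gaussian adds 2 free parameters (mean and variance) in 1-D.
    penalty = penalty_weight * 0.5 * 2 * math.log(n)
    return gain - penalty


if __name__ == "__main__":
    quiet = [0.0, 0.1, -0.1, 0.05, -0.05] * 4
    shifted = [x + 10.0 for x in quiet]
    print(delta_bic(quiet, shifted) > 0)  # segments differ
    print(delta_bic(quiet, quiet) > 0)    # segments match
```

For real cepstral features the variances would be replaced by log-determinants of full covariance matrices and the penalty would scale with the feature dimension; the scalar case is used here only to keep the sketch dependency-free.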
Speech Communication, 2002
Over the last few years, the DARPA-sponsored Hub-4 continuous speech recognition evaluations have advanced speech recognition technology for automatic transcription of broadcast news. In this paper, we report on our research and progress in this domain, with an emphasis on efficient modeling with significantly fewer parameters for faster and more accurate recognition. In the acoustic modeling area, this was achieved through new parameter tying, Gaussian clustering, and mixture weight thresholding schemes. The effectiveness of acoustic adaptation is greatly increased through unsupervised clustering of test data. In language modeling, we explored the use of non-broadcast-news training data, as well as adaptation to topic and speaking styles. We developed an effective and efficient parameter pruning technique for backoff language models that allowed us to cope with ever increasing amounts of training data and expanded N-gram scopes. Finally, we improved our progressive search architecture with more efficient algorithms for lattice generation, compaction, and incorporation of higher-order language models.
Interspeech 2005
This paper describes the BBN English Broadcast News transcription system developed for the EARS Rich Transcription 2004 (RT04) evaluation. In comparison to the BBN RT03 system, we achieved around 22% relative reduction in word error rate for all EARS BN development test sets. The use of additional acoustic training data acquired through Light Supervision based on thousands of hours of found data made the biggest contribution to the improvement. Better audio segmentation, through the use of an online speaker clustering algorithm and chopping speaker turns into moderately long utterances, also contributed substantially to the improvement. Other contributions, even of modest size but adding up nicely, include using discriminative training for all acoustic models, using word duration as an additional knowledge source during N-best rescoring, and using updated lexicon and language models.
1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1997
While significant improvements have been made over the last 5 years in large vocabulary continuous speech recognition of large read-speech corpora such as the ARPA Wall Street Journal-based CSR corpus (WSJ) for American English and the BREF corpus for French, these tasks remain relatively artificial. In this paper we report on our development work in moving from laboratory read speech data to real-world speech data in order to build a system for the new ARPA broadcast news transcription task.
Darshanim. Interpretazione: reti di relazioni generate da un’opera d’arte, vol. 2, 2022
In this essay (published in P.A. Porceddu Cilione (ed.), Darshanim. Interpretazione: reti di relazioni generate da un’opera d’arte, vol. 2, Mimesis, Milano-Udine 2022, pp. 45-66) I develop an extended argument that leads from disenchantment to resonance, by way of music's singular capacity to let us experience a sui generis variety of causality. To begin, I invite the reader to dig into their own memory and dust off a personal experience of enchantment (whether in nature, in sport, within a ritual or in a relationship) to serve as context for the interpretive effort that follows. I am especially interested in the process that regularly leads from the moment in which things appear to us in a new light, endowed with potential that would have seemed unrealistic only a few hours earlier, to the subsequent disillusionment. At first sight, it might seem that this is simply the correct interpretation of the situation. In the essay, however, I try to probe the soundness of the intuitions on which this pessimistic judgment rests by investigating three interconnected questions. First, I try to understand better why experiences of enchantment must always bow their heads and give way to disenchantment: is this perhaps an intrinsic property of the human condition and, if so, what kind of inescapability are we dealing with here? Second, given that episodes of enchantment do occur anyway, I would like to understand what kind of potential they embody, using music as a paradigmatic example. Finally, taking my cue from the Weberian metaphor of «religiöse Musikalität» (having an ear for religion), I will try to establish whether the experience of resonance contains the preconditions for changing the balance between enchantment and disenchantment and the respective weight that these two opposite ways of relating to the world have in the lives of modern individuals.
Management theories are implemented to help increase organizational productivity and service quality. Few managers rely on a single theory or concept when implementing strategies in the workplace; they commonly combine several theories, depending on the workplace, purpose and workforce. Contingency theory, chaos theory and systems theory are popular management theories. Theory X and Theory Y, which address management strategies for workforce motivation, are also implemented to help increase worker productivity.
Applied Economics Letters, 2022
Democracies are expected to implement expansionary fiscal policies to accelerate economic recovery in crisis times. However, as experienced in the 2008 global financial crisis, democratic countries sometimes tend to adopt austerity measures during crises. This paper examines the relationship between democracy and fiscal support during the COVID-19 pandemic. The empirical findings confirm the positive impact of democracy on fiscal support. As we unbundle the effects, democracy has a significant effect for the non-health sector, but the effect is insignificant for the health sector. https://www.tandfonline.com/doi/full/10.1080/13504851.2022.2120950
Philósophos - Revista de Filosofia
Anuario Colombiano de Historia Social y de la Cultura, 2007
Journal de Physique IV (Proceedings), 2005
2004
Journal of Business Diversity, 2024
Neuropsychopharmacologia Hungarica : a Magyar Pszichofarmakológiai Egyesület lapja = official journal of the Hungarian Association of Psychopharmacology, 2016
Caries Research, 2011
Cognitive Semantics
Asian Journal of Animal and Veterinary Advances, 2015
bioRxiv (Cold Spring Harbor Laboratory), 2021
Österreichische Wasser- und Abfallwirtschaft, 2017
Journal of Tribology, 1992
Journal of Vacuum Science and Technology
Proceedings of the National Academy of Sciences of the United States of America, 2017