Publication Details

Front-End Compensation Methods for LVCSR Under Lombard Effect

BOŘIL Hynek, GRÉZL František and HANSEN John H. Front-End Compensation Methods for LVCSR Under Lombard Effect. In: Proceedings of Interspeech 2011. Florence: International Speech Communication Association, 2011, pp. 1257-1260. ISBN 978-1-61839-270-1. ISSN 1990-9772.
Czech title
Kompenzační techniky Front-Endu pro LVCSR řeči ovlivněné Lombardovým efektem
Type
conference paper
Language
English
Authors
Bořil Hynek (UTDALLAS)
Grézl František, Ing., Ph.D. (DCGM FIT BUT)
Hansen John H. (UTDALLAS)
Keywords

speech recognition, Lombard effect, UT-Scope database, bottleneck features, quantile-based cepstral distribution normalization, histogram equalization

Abstract

This paper describes front-end compensation methods for large vocabulary continuous speech recognition (LVCSR) under Lombard effect.

Annotation

This study analyzes the impact of noisy background variations and Lombard effect (LE) on large vocabulary continuous speech recognition (LVCSR). The robustness of several front-end feature extraction strategies, combined with state-of-the-art feature distribution normalizations, is tested on neutral and Lombard speech from the UT-Scope database presented in two types of background noise at various SNR levels. An extension of a bottleneck (BN) front-end that normalizes both critical band energies (CRBE) and BN outputs is proposed and shown to provide performance competitive with the best MFCC-based system. A novel MFCC-based BN front-end is introduced and shown to outperform all other systems in all conditions considered (average 4.1% absolute WER reduction over the second-best system). Additionally, two phenomena are observed: (i) the combination of cepstral mean subtraction and the recently established RASTALP filtering significantly reduces transient effects of RASTA band-pass filtering and increases ASR robustness to noise and LE; (ii) histogram equalization may benefit from utilizing reference distributions derived from pre-normalized rather than raw training features, and also from adopting distributions from different front-ends.
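
As a rough illustration of two of the normalizations named in the annotation, the Python sketch below applies per-utterance cepstral mean subtraction and a simple rank-based histogram equalization to a matrix of feature frames. It is not the authors' implementation: the function names, the use of NumPy, the mid-rank CDF estimate, and the choice of an empirical reference sample are all assumptions made for illustration.

```python
import numpy as np


def cepstral_mean_subtraction(feats):
    """Subtract the per-utterance mean from every feature dimension.

    feats: array of shape (n_frames, n_dims), one row per frame.
    """
    return feats - feats.mean(axis=0, keepdims=True)


def histogram_equalization(feats, reference):
    """Map each dimension of `feats` onto the empirical distribution
    of the matching dimension of `reference` (hypothetical helper).

    feats:     (n_frames, n_dims) features of one utterance.
    reference: (n_ref, n_dims) samples defining the target distribution,
               for example pooled training features.
    """
    n_frames, n_dims = feats.shape
    out = np.empty((n_frames, n_dims), dtype=float)
    for d in range(n_dims):
        # Rank of every frame within the utterance (0 = smallest value).
        ranks = np.argsort(np.argsort(feats[:, d]))
        # Mid-rank CDF estimate, kept strictly inside (0, 1).
        cdf = (ranks + 0.5) / n_frames
        # Inverse CDF of the reference, taken from its empirical quantiles.
        out[:, d] = np.quantile(reference[:, d], cdf)
    return out
```

In the spirit of observation (ii), one could pass training features that were already mean-normalized as `reference` instead of raw ones; whether that helps in a given setup would have to be checked experimentally.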

Published
2011
Pages
1257-1260
Journal
Proceedings of Interspeech, vol. 2011, no. 8, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2011
Conference
Interspeech 2011, Florence, Italy
ISBN
978-1-61839-270-1
Publisher
International Speech Communication Association
Place
Florence, IT
BibTeX
@INPROCEEDINGS{FITPUB9756,
   author = "Hynek Bo\v{r}il and Franti\v{s}ek Gr\'{e}zl and John H. Hansen",
   title = "Front-End Compensation Methods for LVCSR Under Lombard Effect",
   pages = "1257--1260",
   booktitle = "Proceedings of Interspeech 2011",
   journal = "Proceedings of Interspeech",
   volume = 2011,
   number = 8,
   year = 2011,
   location = "Florence, IT",
   publisher = "International Speech Communication Association",
   ISBN = "978-1-61839-270-1",
   ISSN = "1990-9772",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/9756"
}