Result Details
Front-End Compensation Methods for LVCSR Under Lombard Effect
This paper describes front-end compensation methods for large vocabulary continuous speech recognition (LVCSR) under Lombard effect.
speech recognition, Lombard effect, UT-Scope database, bottleneck features, quantile-based cepstral distribution normalization, histogram equalization
This study analyzes the impact of background noise variations and Lombard effect (LE) on large vocabulary continuous speech recognition (LVCSR). The robustness of several front-end feature extraction strategies, combined with state-of-the-art feature distribution normalizations, is tested on neutral and Lombard speech from the UT-Scope database, presented in two types of background noise at various SNR levels. An extension of a bottleneck (BN) front-end that normalizes both critical band energies (CRBE) and BN outputs is proposed and shown to perform competitively with the best MFCC-based system. A novel MFCC-based BN front-end is introduced and shown to outperform all other systems in all conditions considered (average 4.1% absolute WER reduction over the second-best system). Additionally, two phenomena are observed: (i) combining cepstral mean subtraction with the recently established RASTALP filtering significantly reduces the transient effects of RASTA band-pass filtering and increases ASR robustness to noise and LE; (ii) histogram equalization may benefit from reference distributions derived from pre-normalized rather than raw training features, and also from adopting distributions from different front-ends.
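As a rough illustration of two of the normalizations named in the abstract, the sketch below applies cepstral mean subtraction and a simple quantile-based histogram equalization to a feature matrix. It is not the authors' implementation: the function names, the number of quantiles, and the synthetic data are assumptions made for this example, and features are assumed to be stored as a NumPy array with one frame per row.

```python
import numpy as np


def cepstral_mean_subtraction(feats):
    """Subtract the per-coefficient mean computed over all frames (rows)."""
    return feats - feats.mean(axis=0, keepdims=True)


def histogram_equalization(feats, reference, n_quantiles=100):
    """Map each feature dimension onto the empirical distribution of `reference`.

    Hypothetical helper: matches quantiles of `feats` to quantiles of
    `reference` and interpolates linearly between them, one dimension at a time.
    """
    probs = np.linspace(0.0, 1.0, n_quantiles)
    out = np.empty(feats.shape, dtype=float)
    for d in range(feats.shape[1]):
        src_q = np.quantile(feats[:, d], probs)      # quantiles of the test features
        ref_q = np.quantile(reference[:, d], probs)  # quantiles of the reference
        out[:, d] = np.interp(feats[:, d], src_q, ref_q)
    return out


# Synthetic stand-ins for training and test cepstra (assumed shapes and values).
rng = np.random.default_rng(0)
train = rng.normal(1.0, 1.0, size=(2000, 13))
test = rng.normal(3.0, 2.0, size=(500, 13))

# Reference distributions taken from pre-normalized training features,
# in the spirit of observation (ii) in the abstract.
reference = cepstral_mean_subtraction(train)
equalized = histogram_equalization(cepstral_mean_subtraction(test), reference)
```

Building the reference from mean-normalized training features, rather than raw ones, mirrors the pre-normalization idea mentioned in observation (ii). The paper's actual reference distributions and quantile settings may differ.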
@inproceedings{BUT76449,
author="Hynek {Bořil} and František {Grézl} and John {Hansen}",
title="Front-End Compensation Methods for LVCSR Under Lombard Effect",
booktitle="Proceedings of Interspeech 2011",
year="2011",
journal="Proceedings of Interspeech",
volume="2011",
number="8",
pages="1257--1260",
publisher="International Speech Communication Association",
address="Florence",
isbn="978-1-61839-270-1",
issn="1990-9772",
url="http://www.fit.vutbr.cz/research/groups/speech/publi/2011/boril_interspeech2011_221.pdf"
}
Speech Recognition under Real-World Conditions, GACR, Standard projects, GA102/08/0707, start: 2008-01-01, end: 2011-12-31, completed
Theory and applications of phoneme posterior estimation in speech processing, GACR, Doctoral grants, GP102/09/P635, start: 2009-01-01, end: 2011-12-31, completed