Publication Details

Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition

BHATTACHARJEE, M.; NIGMATULINA, I.; PRASAD, A.; RANGAPPA, P.; MADIKERI, S.; MOTLÍČEK, P.; HELMKE, H.; KLEINERT, M. Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 12652-12656. ISBN: 979-8-3503-4485-1.
Czech title
Metody kontextového ovlivnění pro zlepšení detekce neobvyklých slov v automatickém rozpoznávání řeči
Type
conference paper
Language
English
Authors
BHATTACHARJEE, M.
NIGMATULINA, I.
Prasad Amrutha (DCGM)
RANGAPPA, P.
Madikeri Srikanth
Motlíček Petr, doc. Ing., Ph.D. (DCGM)
HELMKE, H.
KLEINERT, M.
URL
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10447465
Keywords

Automatic speech recognition, air traffic control, domain adaptation, contextual biasing, rare word recognition

Abstract

In specialized domains like Air Traffic Control (ATC), a notable challenge in porting a deployed Automatic Speech Recognition (ASR) system from one airport to another is the alteration in the set of crucial words that must be accurately detected in the new environment. Typically, such words have limited occurrences in training data, making it impractical to retrain the ASR system. This paper explores innovative word-boosting techniques to improve the detection rate of such rare words in the ASR hypotheses for the ATC domain. Two acoustic models are investigated: a hybrid CNN-TDNNF model trained from scratch and a pre-trained wav2vec2-based XLSR model fine-tuned on a common ATC dataset. The word boosting is done in three ways. First, an out-of-vocabulary word addition method is explored. Second, G-boosting is explored, which amends the language model before building the decoding graph. Third, the boosting is performed on the fly during decoding using lattice re-scoring. The results indicate that the G-boosting method performs best and provides an approximately 30-43% relative improvement in recall of the boosted words. Moreover, a relative improvement of up to 48% is obtained upon combining G-boosting and lattice re-scoring.
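As a rough illustration of the two score-based boosting ideas named in the abstract, the sketch below uses a toy setting and is not the authors' implementation. It assumes a unigram table of log-probabilities standing in for the language model, and an n-best list standing in for a lattice; the function names, the fixed `bonus` value, the example words, and the recall figures are all hypothetical.

```python
def boost_unigrams(logprobs, boost_words, bonus):
    """Toy stand-in for amending the LM before graph building:
    add a fixed log-score bonus to each boosted word.
    Note: the result is no longer a normalized distribution."""
    return {
        word: (lp + bonus) if word in boost_words else lp
        for word, lp in logprobs.items()
    }


def rescore_nbest(nbest, boost_words, bonus):
    """Toy stand-in for on-the-fly rescoring: an n-best list replaces
    the lattice. Each (text, score) pair gains `bonus` per boosted
    word occurrence, then the list is re-ranked by score."""
    rescored = []
    for text, score in nbest:
        hits = sum(1 for token in text.split() if token in boost_words)
        rescored.append((text, score + bonus * hits))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def relative_improvement(baseline, boosted):
    """Relative change of a metric such as recall: (new - old) / old."""
    return (boosted - baseline) / baseline


# Hypothetical rare word and scores, for illustration only.
rare_words = {"balad"}
lm = boost_unigrams({"bolo": -3.0, "balad": -9.0}, rare_words, bonus=2.0)
ranked = rescore_nbest(
    [("proceed direct to bolo", -12.0), ("proceed direct to balad", -12.5)],
    rare_words,
    bonus=1.0,
)
gain = relative_improvement(baseline=0.50, boosted=0.65)  # made-up recalls
```

In this toy run the hypothesis containing the boosted word overtakes the originally top-ranked one, and a made-up recall change from 0.50 to 0.65 corresponds to a 30% relative improvement, which is how percentages of the kind quoted in the abstract are computed.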

Published
2024
Pages
12652–12656
Proceedings
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Conference
2024 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Seoul, KR
ISBN
979-8-3503-4485-1
Publisher
IEEE Signal Processing Society
Place
Seoul
DOI
10.1109/ICASSP48485.2024.10447465
BibTeX
@inproceedings{BUT193355,
  author="BHATTACHARJEE, M. and NIGMATULINA, I. and PRASAD, A. and RANGAPPA, P. and MADIKERI, S. and MOTLÍČEK, P. and HELMKE, H. and KLEINERT, M.",
  title="Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition",
  booktitle="ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
  year="2024",
  pages="12652--12656",
  publisher="IEEE Signal Processing Society",
  address="Seoul",
  doi="10.1109/ICASSP48485.2024.10447465",
  isbn="979-8-3503-4485-1",
  url="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10447465"
}