Result Details

Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition

BHATTACHARJEE, M.; NIGMATULINA, I.; PRASAD, A.; RANGAPPA, P.; MADIKERI, S.; MOTLÍČEK, P.; HELMKE, H.; KLEINERT, M. Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024. p. 12652-12656. ISBN: 979-8-3503-4485-1.
Type
conference paper
Language
English
Authors
BHATTACHARJEE, M.
NIGMATULINA, I.
Prasad Amrutha
RANGAPPA, P.
Madikeri Srikanth, FIT (FIT)
Motlíček Petr, doc. Ing., Ph.D., DCGM (FIT)
HELMKE, H.
KLEINERT, M.
Abstract

In specialized domains like Air Traffic Control (ATC), a notable challenge in porting a deployed Automatic Speech Recognition (ASR) system from one airport to another is the alteration in the set of crucial words that must be accurately detected in the new environment. Typically, such words have limited occurrences in training data, making it impractical to retrain the ASR system. This paper explores innovative word-boosting techniques to improve the detection rate of such rare words in the ASR hypotheses for the ATC domain. Two acoustic models are investigated: a hybrid CNN-TDNNF model trained from scratch and a pre-trained wav2vec2-based XLSR model fine-tuned on a common ATC dataset. The word boosting is done in three ways. First, an out-of-vocabulary word addition method is explored. Second, G-boosting is explored, which amends the language model before building the decoding graph. Third, the boosting is performed on the fly during decoding using lattice rescoring. The results indicate that the G-boosting method performs best and provides an approximately 30-43% relative improvement in recall of the boosted words. Moreover, a relative improvement of up to 48% is obtained upon combining G-boosting and lattice rescoring.
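
As a rough illustration of the idea of on-the-fly boosting and of how a relative improvement in recall is computed, the following is a minimal sketch and not the paper's implementation. It rescores a plain n-best list rather than a lattice, and the function names, the bonus value, the toy scores, and the waypoint word "gotil" are all hypothetical assumptions made for this example.

```python
from typing import List, Set, Tuple

Hypothesis = Tuple[str, float]  # (transcript, log-score); higher is better


def rescore_nbest(nbest: List[Hypothesis], boost_words: Set[str],
                  bonus: float = 2.0) -> List[Hypothesis]:
    """Add `bonus` to a hypothesis score for every boosted word it contains,
    then return the hypotheses sorted best-first.
    (Assumption: a simplified n-best stand-in for lattice rescoring.)"""
    rescored = []
    for text, score in nbest:
        hits = sum(1 for word in text.split() if word in boost_words)
        rescored.append((text, score + bonus * hits))
    return sorted(rescored, key=lambda h: h[1], reverse=True)


def boosted_word_recall(references: List[str], hypotheses: List[str],
                        boost_words: Set[str]) -> float:
    """Fraction of boosted-word occurrences in the references that also
    appear in the corresponding hypothesis."""
    found = total = 0
    for ref, hyp in zip(references, hypotheses):
        hyp_words = set(hyp.split())
        for word in ref.split():
            if word in boost_words:
                total += 1
                found += word in hyp_words
    return found / total if total else 0.0


def relative_improvement(baseline: float, boosted: float) -> float:
    """Relative change of `boosted` with respect to a non-zero `baseline`."""
    return (boosted - baseline) / baseline


# Toy data (hypothetical): two utterances, each with a 2-best list.
boost_words = {"gotil"}
references = ["proceed direct to gotil", "direct gotil climb"]
nbest_lists = [
    [("proceed direct to go till", -4.0), ("proceed direct to gotil", -5.0)],
    [("direct gotil climb", -3.0), ("direct go till climb", -3.5)],
]

# Baseline: take the top-scoring hypothesis as is.
baseline_hyps = [max(nb, key=lambda h: h[1])[0] for nb in nbest_lists]
# Boosted: take the top hypothesis after adding the boosted-word bonus.
boosted_hyps = [rescore_nbest(nb, boost_words)[0][0] for nb in nbest_lists]

baseline_recall = boosted_word_recall(references, baseline_hyps, boost_words)
boosted_recall = boosted_word_recall(references, boosted_hyps, boost_words)
gain = relative_improvement(baseline_recall, boosted_recall)
```

In this toy setup the bonus promotes the hypothesis containing the rare word in the first utterance and leaves the already correct second utterance unchanged. The resulting figures describe only the toy data and say nothing about the results reported in the paper.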

Keywords

Automatic speech recognition, air traffic control, domain adaptation, contextual biasing, rare word recognition

URL
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10447465
Published
2024
Pages
12652–12656
Proceedings
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Conference
2024 IEEE International Conference on Acoustics, Speech and Signal Processing IEEE
ISBN
979-8-3503-4485-1
Publisher
IEEE Signal Processing Society
Place
Seoul
DOI
10.1109/ICASSP48485.2024.10447465
BibTeX
@inproceedings{BUT193355,
  author="BHATTACHARJEE, M. and NIGMATULINA, I. and PRASAD, A. and RANGAPPA, P. and MADIKERI, S. and MOTLÍČEK, P. and HELMKE, H. and KLEINERT, M.",
  title="Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition",
  booktitle="ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
  year="2024",
  pages="12652--12656",
  publisher="IEEE Signal Processing Society",
  address="Seoul",
  doi="10.1109/ICASSP48485.2024.10447465",
  isbn="979-8-3503-4485-1",
  url="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10447465"
}
Projects
Contemporary Methods for Processing, Analysis, and Visualization of Multimedia and 3D Data, BUT, BUT Internal Projects, FIT-S-23-8278, start: 2023-03-01, end: 2026-02-28, running