Publication Details

Implementing contextual biasing in GPU decoder for online ASR

NIGMATULINA Iuliia, MADIKERI Srikanth, VILLATORO-TELLO Esaú, MOTLÍČEK Petr, ZULUAGA-GOMEZ Juan, PANDIA Karthick and GANAPATHIRAJU Aravind. Implementing contextual biasing in GPU decoder for online ASR. In: Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH. Dublin: International Speech Communication Association, 2023, pp. 4494-4498. ISSN 1990-9772. Available from: https://www.isca-archive.org/interspeech_2023/nigmatulina23_interspeech.html
Czech title
Implementace kontextové předpojatosti (biasu) v GPU dekodéru pro online ASR
Type
conference paper
Language
English
Authors
Nigmatulina Iuliia (IDIAP)
Madikeri Srikanth (IDIAP)
Villatoro-Tello Esaú (IDIAP)
Motlíček Petr, doc. Ing., Ph.D. (DCGM FIT BUT)
Zuluaga-Gomez Juan (IDIAP)
Pandia Karthick
Ganapathiraju Aravind
URL
Keywords

real-time speech recognition, contextual adaptation, GPU decoding, finite-state transducers

Abstract

GPU decoding significantly accelerates the output of ASR predictions. While GPUs are already being used for online ASR decoding, post-processing and rescoring on GPUs have not yet been properly investigated. Rescoring with available contextual information can considerably improve ASR predictions. Previous studies have demonstrated the viability of lattice rescoring in decoding and of biasing language model (LM) weights in offline and online CPU scenarios. In real-time GPU decoding, partial recognition hypotheses are produced without lattice generation, which makes the implementation of biasing more complex. The paper proposes and describes an approach to integrate contextual biasing in real-time GPU decoding while exploiting the standard Kaldi GPU decoder. Besides biasing partial ASR predictions, our approach also permits dynamic context switching, allowing flexible rescoring for each speech segment directly on the GPU. The code is publicly released and tested with open-sourced test sets.

Published
2023
Pages
4494-4498
Journal
Proceedings of Interspeech - on-line, vol. 2023, no. 8, ISSN 1990-9772
Proceedings
Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH
Conference
Interspeech Conference, Dublin, IE
Publisher
International Speech Communication Association
Place
Dublin, IE
DOI
10.21437/Interspeech.2023-2449
EID Scopus
BibTeX
@INPROCEEDINGS{FITPUB13155,
   author = "Iuliia Nigmatulina and Srikanth Madikeri and Esa\'{u} Villatoro-Tello and Petr Motl\'{i}\v{c}ek and Juan Zuluaga-Gomez and Karthick Pandia and Aravind Ganapathiraju",
   title = "Implementing contextual biasing in GPU decoder for online ASR",
   pages = "4494--4498",
   booktitle = "Proceedings of the Annual Conference of International Speech Communication Association, INTERSPEECH",
   journal = "Proceedings of Interspeech - on-line",
   volume = 2023,
   number = 8,
   year = 2023,
   location = "Dublin, IE",
   publisher = "International Speech Communication Association",
   ISSN = "1990-9772",
   doi = "10.21437/Interspeech.2023-2449",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13155"
}