Result Details

Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering

RANGAPPA, P.; CAROFILIS, A.; PRAKASH, J.; KUMAR, S.; BURDISSO, S.; MADIKERI, S.; VILLATORO-TELLO, E.; SHARMA, B.; MOTLÍČEK, P.; HACIOGLU, K.; VENKATESAN, S.; VYAS, S.; STOLCKE, A. Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering. In Interspeech. Rotterdam, The Netherlands: Isca-Int Speech Communication Assoc, 2025. p. 4928-4932.
Type
conference paper
Language
English
Authors
Rangappa Pradeep
Carofilis Andres
Prakash Jeena
Kumar Shashi
Burdisso Sergio
Madikeri Srikanth
Villatoro-Tello Esau
Sharma Bidisha
Motlíček Petr, doc. Ing., Ph.D., DCGM (FIT)
Hacioglu Kadri
Venkatesan Shankar
Vyas Saurabh
Stolcke Andreas
Abstract

Fine-tuning pretrained ASR models for specific domains is challenging for small organizations with limited labeled data and computational resources. Here we explore different data selection pipelines and propose a robust approach that improves ASR adaptation by filtering pseudo-labels generated using Whisper (encoder-decoder) and Zipformer (transducer) models. Our approach integrates multiple selection strategies, including word error rate (WER) prediction, named entity recognition (NER), and character error rate (CER) analysis, to extract high-quality training segments. We evaluate our method on Whisper and Zipformer using a 7500-hour baseline, comparing it to a CER-based approach relying on hypotheses from three ASR systems. Fine-tuning on 7500 hours of pseudo-labeled call center data achieves 12.3% WER, while our filtering reduces the dataset to 100 hours (1.4%) with similar performance; a similar trend is observed on Fisher English.
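
As a rough illustration of the CER-based agreement idea mentioned in the abstract, the following minimal sketch keeps only segments whose two pseudo-label hypotheses nearly agree at the character level. It is not the authors' pipeline: the field names `hyp_a` and `hyp_b`, the function names, and the 0.1 threshold are assumptions made for the example, and the WER-prediction and NER stages are not modelled.

```python
from typing import Dict, List


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of hyp measured against ref.

    Here ref is simply one of the two hypotheses, not a human transcript.
    """
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def select_segments(segments: List[Dict[str, str]],
                    max_cer: float = 0.1) -> List[Dict[str, str]]:
    """Keep segments whose two hypotheses disagree by at most max_cer.

    max_cer = 0.1 is an illustrative value, not one taken from the paper.
    """
    return [s for s in segments if cer(s["hyp_a"], s["hyp_b"]) <= max_cer]


if __name__ == "__main__":
    # Hypothetical pseudo-labels from two ASR systems for the same audio.
    demo = [
        {"id": "seg1", "hyp_a": "please hold the line",
         "hyp_b": "please hold the line"},
        {"id": "seg2", "hyp_a": "your account number",
         "hyp_b": "you're a count under"},
    ]
    for seg in select_segments(demo):
        print(seg["id"])
```

Agreement between independent systems is used here as a proxy for pseudo-label quality; a stricter threshold keeps fewer, cleaner segments.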

Keywords

speech recognition, data selection, whisper, zip-formers

Published
2025
Pages
4928–4932
Journal
Interspeech, ISSN
Proceedings
Interspeech
Conference
Interspeech Conference
Publisher
Isca-Int Speech Communication Assoc
Place
Rotterdam, The Netherlands
DOI
10.21437/Interspeech.2025-2580
UT WoS
001613931400410
BibTeX
@inproceedings{BUT201433,
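  author="Pradeep {Rangappa} and Andres {Carofilis} and Jeena {Prakash} and Shashi {Kumar} and Sergio {Burdisso} and Srikanth {Madikeri} and Esau {Villatoro-Tello} and Bidisha {Sharma} and Petr {Motlíček} and Kadri {Hacioglu} and Shankar {Venkatesan} and Saurabh {Vyas} and Andreas {Stolcke}",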
  title="Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering",
  booktitle="Interspeech",
  year="2025",
  journal="Interspeech",
  pages="4928--4932",
  publisher="Isca-Int Speech Communication Assoc",
  address="Rotterdam, The Netherlands",
  doi="10.21437/Interspeech.2025-2580",
  url="https://www.fit.vut.cz/research/group/speech/public/publi/2025/rangappa_INTERSPEECH_2025_co-author_Motlicek.pdf"
}