Result detail

Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units

YUSUF, B.; ČERNOCKÝ, J.; SARAÇLAR, M. Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. no. 9, p. 5068-5072. ISSN: 1990-9772.
Type
conference proceedings paper
Language
English
Authors
Yusuf Bolaji, UPGM (FIT)
Černocký Jan, prof. Dr. Ing., UPGM (FIT)
SARAÇLAR, M.
Abstract

End-to-end (E2E) keyword search (KWS) has emerged as an alternative and complementary approach to conventional keyword search which depends on the output of automatic speech recognition (ASR) systems. While E2E methods greatly simplify the KWS pipeline, they generally have worse performance than their ASR-based counterparts, which can benefit from pretraining with untranscribed data. In this work, we propose a method for pretraining E2E KWS systems with untranscribed data, which involves using acoustic unit discovery (AUD) to obtain discrete units for untranscribed data and then learning to locate sequences of such units in the speech. We conduct experiments across languages and AUD systems: we show that finetuning such a model significantly outperforms a model trained from scratch, and the performance improvements are generally correlated with the quality of the AUD system used for pretraining.

Keywords

keyword search, spoken term detection, acoustic unit discovery

URL
https://www.isca-archive.org/interspeech_2024/yusuf24b_interspeech.pdf
Year
2024
Pages
5068–5072
Journal
Proceedings of Interspeech, vol. 2024, no. 9, ISSN 1990-9772
Proceedings
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Conference
Interspeech Conference
Publisher
International Speech Communication Association
Place
Kos
DOI
10.21437/Interspeech.2024-1713
BibTeX
@inproceedings{BUT193671,
  author="YUSUF, B. and ČERNOCKÝ, J. and SARAÇLAR, M.",
  title="Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="5068--5072",
  publisher="International Speech Communication Association",
  address="Kos",
  doi="10.21437/Interspeech.2024-1713",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/yusuf24b_interspeech.pdf"
}
Projects
Multilingual and cross-cultural interaction in dialogue systems for safety-critical, context-dependent applications with bias control, EU, HORIZON EUROPE, start: 2024-01-01, end: 2026-12-31, ongoing
Robust processing of recordings for operations and security, MV, Programme for Strategic Support of the Development of Security Research in the Czech Republic 2019-2025 (IMPAKT 1), Subprogramme 1: Joint Research Projects (BV IMP1/1VS), VJ01010108, start: 2020-10-01, end: 2025-09-30, completed