Result detail

Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery

ONDEL YANG, L.; YUSUF, B.; BURGET, L.; SARAÇLAR, M. Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery. IEEE-ACM Transactions on Audio Speech and Language Processing, 2022, vol. 30, no. 5, p. 1902-1917. ISSN: 2329-9290.
Type
journal article
Language
English
Authors
ONDEL YANG, L.
Yusuf Bolaji, UPGM (FIT)
Burget Lukáš, doc. Ing., Ph.D., UPGM (FIT)
SARAÇLAR, M.
Abstract

This work investigates subspace non-parametric models for the task of learning a set of acoustic units from unlabeled speech recordings. We constrain the base-measure of a Dirichlet-Process mixture with a phonetic subspace estimated from other source languages to build an educated prior, thereby forcing the learned acoustic units to resemble phones of known source languages. Two types of models are proposed: (i) the Subspace HMM (SHMM) which assumes that the phonetic subspace is the same for every language, (ii) the Hierarchical-Subspace HMM (H-SHMM) which relaxes this assumption and allows to have a language-specific subspace estimated on the unlabeled target data. These models are applied on 3 languages: English, Yoruba and Mboshi and they are compared with various competitive acoustic units discovery baselines. Experimental results show that both subspace models outperform other systems in terms of clustering quality and segmentation accuracy. Moreover, we observe that the H-SHMM provides results superior to the SHMM supporting the idea that language-specific priors are preferable to language-agnostic priors for acoustic unit discovery.

Keywords

Unsupervised learning, non-parametric Bayesian models, acoustic unit discovery

URL
https://ieeexplore.ieee.org/document/9767690
Year
2022
Pages
1902–1917
Journal
IEEE-ACM Transactions on Audio Speech and Language Processing, vol. 30, no. 5, ISSN 2329-9290
DOI
10.1109/TASLP.2022.3171975
UT WoS
000811572000001
BibTeX
@article{BUT178412,
  author="ONDEL YANG, L. and YUSUF, B. and BURGET, L. and SARAÇLAR, M.",
  title="Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery",
  journal="IEEE-ACM Transactions on Audio Speech and Language Processing",
  year="2022",
  volume="30",
  number="5",
  pages="1902--1917",
  doi="10.1109/TASLP.2022.3171975",
  issn="2329-9290",
  url="https://ieeexplore.ieee.org/document/9767690"
}
Projects
Neural Representations in Multi-modal and Multi-lingual Modeling, GAČR, Grant projects of excellence in basic research EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals in the EU, EU, Horizon 2020, start: 2020-02-01, end: 2023-04-30, completed