Region Dependent Linear Transforms in Multilingual Speech Recognition

KARAFIÁT, M.; JANDA, M.; ČERNOCKÝ, J.; BURGET, L. Region Dependent Linear Transforms in Multilingual Speech Recognition. In Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012. p. 4885-4888. ISBN: 978-1-4673-0044-5.

Type

conference paper

Language

English

Authors

Karafiát Martin, Ing., Ph.D., DCGM (FIT)
Janda Miloš, Ing., FIT (FIT), DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)

Abstract

In today's speech recognition systems, linear or nonlinear transformations are usually applied to post-process speech features forming input to HMM based acoustic models. In this work, we experiment with three popular transforms: HLDA, MPE-HLDA and Region Dependent Linear Transforms (RDLT), which are trained jointly with the acoustic model to extract maximum of the discriminative information from the raw features and to represent it in a form suitable for the following GMM-HMM based acoustic model. We focus on multi-lingual environments, where limited resources are available for training recognizers of many languages. Using data from GlobalPhone database, we show that, under such restrictive conditions, the feature transformations can be advantageously shared across languages and robustly trained using data from several languages.

Keywords

HLDA, Region Dependent Transforms, Minimum Phone Error, fMPE, multilingual speech recognition

URL

https://www.fit.vut.cz/research/group/speech/public/publi/2012/karafiat…

Published

2012

Pages

4885–4888

Proceedings

Proc. International Conference on Acoustics, Speech, and Signal Processing 2012

Conference

The 37th International Conference on Acoustics, Speech, and Signal Processing

ISBN

978-1-4673-0044-5

Publisher

IEEE Signal Processing Society

Place

Kyoto

DOI

10.1109/ICASSP.2012.6289014

UT WoS

000312381404239

BibTeX

@inproceedings{BUT91480,
  author="Martin {Karafiát} and Miloš {Janda} and Jan {Černocký} and Lukáš {Burget}",
  title="Region Dependent Linear Transforms in Multilingual Speech Recognition",
  booktitle="Proc. International Conference on Acoustics, Speech, and Signal Processing 2012",
  year="2012",
  pages="4885--4888",
  publisher="IEEE Signal Processing Society",
  address="Kyoto",
  doi="10.1109/ICASSP.2012.6289014",
  isbn="978-1-4673-0044-5",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2012/karafiat_icassp2012_0004885.pdf"
}

Projects

Multilingual recognition and search in speech for electronic dictionaries, MPO, TIP, FR-TI1/034, start: 2009-09-01, end: 2013-08-31, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional funds from the state budget of the Czech Republic (e.g. research plans, research centres), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Speech recognition for low-resource languages, GACR, Postdoctoral grants, GPP202/12/P604, start: 2012-01-01, end: 2014-12-31, completed
Technologies of speech processing for efficient human-machine communication, TAČR, ALFA Programme of applied research and experimental development, TA01011328, start: 2011-01-01, end: 2014-12-31, completed

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)