Result Details

A factorized representation of FMLLR transform based on QR-decomposition

RATH, S.; KARAFIÁT, M.; GLEMBEK, O.; ČERNOCKÝ, J. A factorized representation of FMLLR transform based on QR-decomposition. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. no. 9, p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772.
Type
conference paper
Language
English
Authors
Rath Shakti Prasad, DCGM (FIT)
Karafiát Martin, Ing., Ph.D., DCGM (FIT)
Glembek Ondřej, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Abstract

This paper describes a new factorized representation of the FMLLR transform, which is based on QR-decomposition.

Keywords

FMLLR, QR Decomposition, Orthogonal Matrix, Givens Rotation, Upper Triangular Matrix

URL
http://www.isca-speech.org/archive/interspeech_2012/i12_0551.html
Annotation

In this paper, we propose a novel representation of the FMLLR transform. It differs from the standard FMLLR in that the linear transform (LT) is expressed in a factorized form in which each factor involves only one parameter. The representation is mainly motivated by the QR-decomposition of a square matrix and is hence referred to as QR-FMLLR. The mathematical expressions and steps for maximum likelihood (ML) estimation of the parameters are presented. The ML estimation of QR-FMLLR does not require a numerical technique such as gradient ascent, nor does it involve matrix inversion or computation of a matrix determinant. On an LVCSR task, we show the performance of QR-FMLLR to be comparable to that of the standard FMLLR. We conjecture that QR-FMLLR is amenable to speaker adaptation with amounts of data ranging from very small to large, and we briefly discuss how this can be achieved.
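
To illustrate the idea behind the annotation (a square matrix written as a product of factors that each carry a single parameter), the following is a minimal sketch of the textbook QR-decomposition by Givens rotations. It is not the parameterization or the ML estimation procedure of the paper, which this record does not give. The function names `givens_qr` and `reconstruct` and the example matrix are hypothetical and chosen only for illustration.

```python
import math

def givens_qr(A):
    """Reduce a square matrix A (list of lists) to upper triangular form R.

    Returns (rotations, R), where rotations is a list of (p, q, theta)
    triples. Each triple describes one Givens rotation acting on rows
    p and q, so each orthogonal factor has exactly one parameter: theta.
    Illustrative sketch only; not the estimation method of the paper.
    """
    n = len(A)
    R = [row[:] for row in A]          # work on a copy
    rotations = []
    for j in range(n):                  # column being cleared
        for i in range(n - 1, j, -1):   # zero R[i][j] from the bottom up
            a, b = R[i - 1][j], R[i][j]
            if b == 0.0:
                continue                # already zero, no rotation needed
            theta = math.atan2(b, a)
            c, s = math.cos(theta), math.sin(theta)
            # Apply the transposed rotation to rows i-1 and i.
            for k in range(n):
                u, v = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * u + s * v
                R[i][k] = -s * u + c * v
            rotations.append((i - 1, i, theta))
    return rotations, R

def reconstruct(rotations, R):
    """Rebuild A = G_1 G_2 ... G_m R from the rotation angles and R."""
    n = len(R)
    M = [row[:] for row in R]
    for p, q, theta in reversed(rotations):
        c, s = math.cos(theta), math.sin(theta)
        for k in range(n):
            u, v = M[p][k], M[q][k]
            M[p][k] = c * u - s * v
            M[q][k] = s * u + c * v
    return M

# Hypothetical 3x3 example matrix.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
rotations, R = givens_qr(A)
A_rebuilt = reconstruct(rotations, R)
```

An n-by-n matrix needs at most n(n-1)/2 such rotations, so the orthogonal part is described entirely by a list of angles, with the remaining parameters held in the upper triangular factor.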

Published
2012
Pages
1–4
Journal
Proceedings of Interspeech, vol. 2012, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2012
Conference
Interspeech Conference
ISBN
978-1-62276-759-5
Publisher
International Speech Communication Association
Place
Portland, Oregon
BibTeX
@inproceedings{BUT97011,
  author="Shakti Prasad {Rath} and Martin {Karafiát} and Ondřej {Glembek} and Jan {Černocký}",
  title="A factorized representation of FMLLR transform based on QR-decomposition",
  booktitle="Proceedings of Interspeech 2012",
  year="2012",
  journal="Proceedings of Interspeech",
  volume="2012",
  number="9",
  pages="1--4",
  publisher="International Speech Communication Association",
  address="Portland, Oregon",
  isbn="978-1-62276-759-5",
  issn="1990-9772",
  url="http://www.isca-speech.org/archive/interspeech_2012/i12_0551.html"
}
Projects
IT4Innovations Centre of Excellence, MŠMT (Ministry of Education, Youth and Sports), Operational Programme Research and Development for Innovations, ED1.1.00/02.0070, start: 2011-01-01, end: 2015-12-31, completed
Discriminative training of speaker-normalized models for automatic speech recognition, EU, Seventh Research Framework Programme, SIGA890, start: 2011-01-07, end: 2013-01-07, completed
Security-Oriented Research in Information Technology, MŠMT (Ministry of Education, Youth and Sports), Institutional funds of the Czech state budget (e.g. research plans, research centres), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Technologies of speech processing for efficient human-machine communication, TAČR (Technology Agency of the Czech Republic), ALFA Programme of Applied Research and Experimental Development, TA01011328, start: 2011-01-01, end: 2014-12-31, completed