Result Details

Integrating recent MLP feature extraction techniques into TRAP architecture

GRÉZL, F.; KARAFIÁT, M. Integrating recent MLP feature extraction techniques into TRAP architecture. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. no. 8, p. 1229-1232. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
Type
conference paper
Language
English
Authors
František Grézl
Martin Karafiát
Abstract

The article shows the performance improvements and limitations of TRAP and HATS neural network systems for feature extraction in LVCSR when the Bottle-Neck approach and phoneme-state targets are introduced into them.

Keywords

TRAP processing, Bottle-Neck technique, sub-phoneme classes, LVCSR features

URL
http://www.fit.vutbr.cz/research/groups/speech/publi/2011/grezl_interspeech2011_204.pdf
Annotation

This paper focuses on incorporating recent techniques for multi-layer perceptron (MLP) based feature extraction into the Temporal Pattern (TRAP) and Hidden Activation TRAP (HATS) feature extraction schemes. The TRAP scheme has been the origin of various MLP-based features, some of which are now an integral part of state-of-the-art LVCSR systems. The modifications that brought the most improvement, sub-phoneme targets and the Bottle-Neck technique, are introduced into the original TRAP scheme. The introduction of sub-phoneme targets uncovered the hidden danger of having too many classes in the TRAP/HATS scheme. On the other hand, the Bottle-Neck technique improved the TRAP/HATS scheme so that it is competitive with other approaches.

Published
2011
Pages
1229–1232
Journal
Proceedings of Interspeech, vol. 2011, no. 8, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2011
Conference
Interspeech Conference
ISBN
978-1-61839-270-1
Publisher
International Speech Communication Association
Place
Florence
BibTeX
@inproceedings{BUT76438,
  author="František {Grézl} and Martin {Karafiát}",
  title="Integrating recent MLP feature extraction techniques into TRAP architecture",
  booktitle="Proceedings of Interspeech 2011",
  year="2011",
  journal="Proceedings of Interspeech",
  volume="2011",
  number="8",
  pages="1229--1232",
  publisher="International Speech Communication Association",
  address="Florence",
  isbn="978-1-61839-270-1",
  issn="1990-9772",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2011/grezl_interspeech2011_204.pdf"
}
Projects
Advanced recognition and presentation of multimedia data, BUT, BUT internal projects, FIT-S-11-2, start: 2011-01-01, end: 2013-12-31, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional funding from the Czech state budget (e.g. VZ, VC), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Speech Recognition under Real-World Conditions, GACR, Standard projects, GA102/08/0707, start: 2008-01-01, end: 2011-12-31, completed
Technologies of speech processing for efficient human-machine communication, TAČR, ALFA Programme of Applied Research and Experimental Development, TA01011328, start: 2011-01-01, end: 2014-12-31, completed
Theory and applications of phoneme posterior estimation in speech processing, GACR, Doctoral grants, GP102/09/P635, start: 2009-01-01, end: 2011-12-31, completed