Result Details

Acoustic keyword spotter - optimization from end-user perspective

SZŐKE, I.; GRÉZL, F.; ČERNOCKÝ, J.; FAPŠO, M. Acoustic keyword spotter - optimization from end-user perspective. Proceedings of the 2010 IEEE Spoken Language Technology Workshop. IEEE Catalog Number: CFP 10SLT-USB. Berkeley, California: IEEE Signal Processing Society, 2010. p. 177-181. ISBN: 978-1-4244-7902-3.
Type
conference paper
Language
English
Authors
Szőke Igor, Ing., Ph.D., DCGM (FIT)
Grézl František, Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Fapšo Michal, Ing., Ph.D., DCGM (FIT)
Abstract

This paper addresses acoustic keyword spotting. It presents the steps needed to build a usable acoustic keyword spotting system. The novelty of the system lies in its calibration.

Keywords

keyword spotting, spoken term detection, neural networks, calibration

Annotation

The paper deals with the development of an acoustic keyword spotter (KWS) that meets the requirements of a real user from the security community. While the basic scheme of the KWS is relatively standard, it uses novel features derived by a hierarchy of neural networks, and a score normalization trained to maximize a user-like evaluation metric. Results are reported on a selection of Czech conversational telephone speech (CTS), radio, and read data.

Published
2010
Pages
177–181
Proceedings
Proceedings of the 2010 IEEE Spoken Language Technology Workshop
Series
IEEE Catalog Number: CFP 10SLT-USB
Conference
IEEE Workshop on Spoken Language Technology
ISBN
978-1-4244-7902-3
Publisher
IEEE Signal Processing Society
Place
Berkeley, California
BibTeX
@inproceedings{BUT35213,
  author="Igor {Szőke} and František {Grézl} and Jan {Černocký} and Michal {Fapšo}",
  title="Acoustic keyword spotter - optimization from end-user perspective",
  booktitle="Proceedings of the 2010 IEEE Spoken Language Technology Workshop",
  year="2010",
  series="IEEE Catalog Number: CFP 10SLT-USB",
  pages="177--181",
  publisher="IEEE Signal Processing Society",
  address="Berkeley, California",
  isbn="978-1-4244-7902-3",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2010/sz%f6ke_SLT2010_p.177.pdf"
}
Projects
Overcoming the language barrier complicating investigation into financing terrorism and serious financial crimes, MV, Security Research Programme, VD20072010B16, start: 2007-08-01, end: 2010-12-31, completed
Recognition and presentation of multimedia data, BUT, BUT internal projects, FIT-S-10-2, 2010, start: 2010-04-01, end: 2010-12-31, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional funding from the Czech state budget (e.g. research plans, research centres), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Speech Recognition under Real-World Conditions, GACR, Standard projects, GA102/08/0707, start: 2008-01-01, end: 2011-12-31, completed
Theory and applications of phoneme posterior estimation in speech processing, GACR, Doctoral grants, GP102/09/P635, start: 2009-01-01, end: 2011-12-31, completed