Result Details
Hybrid word-subword decoding for spoken term detection
Fapšo Michal, Ing., Ph.D., DCGM (FIT)
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
The paper presents hybrid word-subword decoding for spoken term detection.
spoken term detection
This paper deals with a hybrid word-subword recognition system for spoken term detection. The decoding is driven by a hybrid recognition network, and the decoder directly produces hybrid word-subword lattices. One phone model and two multigram models were tested to represent subword units. The systems were evaluated in terms of spoken term detection accuracy and index size. We concluded that the best subword model for hybrid word-subword recognition is the multigram model trained on the word recognizer vocabulary. We achieved an improvement in word recognition accuracy and in spoken term detection accuracy when in-vocabulary and out-of-vocabulary terms are searched separately. Spoken term detection accuracy with the full (in-vocabulary and out-of-vocabulary) term set was slightly worse, but the required index size was significantly reduced.
@inproceedings{BUT32318,
author="Igor {Szőke} and Michal {Fapšo} and Lukáš {Burget} and Jan {Černocký}",
title="Hybrid word-subword decoding for spoken term detection",
booktitle="Proc. SSCS 2008: Speech search workshop at SIGIR",
year="2008",
pages="1--4",
publisher="Association for Computing Machinery",
address="Singapore",
isbn="978-90-365-2697-5",
url="http://www.fit.vutbr.cz/research/groups/speech/publi/2008/szoke_sigir2008.pdf"
}
Overcoming the language barrier complicating investigation into financing terrorism and serious financial crimes, MV, Security Research Programme, VD20072010B16, start: 2007-08-01, end: 2010-12-31, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional funds from the state budget of the Czech Republic (e.g. VZ, VC), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Speech Recognition under Real-World Conditions, GACR, Standard projects, GA102/08/0707, start: 2008-01-01, end: 2011-12-31, completed