Faculty of Information Technology, BUT

Publication Details

Similarity Scoring for Recognizing Repeated Out-of-Vocabulary Words

HANNEMANN Mirko, KOMBRINK Stefan, KARAFIÁT Martin and BURGET Lukáš. Similarity Scoring for Recognizing Repeated Out-of-Vocabulary Words. In: Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Makuhari, Chiba: International Speech Communication Association, 2010, pp. 897-900. ISBN 978-1-61782-123-3. ISSN 1990-9772.
Czech title
Skórování podobnosti pro rozpoznávání opakovaných výskytů slov mimo slovník
Type
conference paper
Language
English
Authors
URL
Keywords
out-of-vocabulary, OOV, hybrid word/sub-word recognizer, similarity measure, alignment error model
Abstract
This paper describes the development of a similarity measure to detect repeatedly occurring Out-of-Vocabulary (OOV) words, which carry important information.
Annotation
We develop a similarity measure to detect repeatedly occurring Out-of-Vocabulary words (OOV), since these carry important information. Sub-word sequences in the recognition output from a hybrid word/sub-word recognizer are taken as detected OOVs and are aligned to each other with the help of an alignment error model. This model is able to deal with partial OOV detections and tries to reveal more complex word relations such as compound words. We apply the model to a selection of conversational phone calls to retrieve other examples of the same OOV, and to obtain a higher-level description of it such as being a derivation of a known word.
Published
2010
Pages
897-900
Journal
Proceedings of Interspeech, vol. 2010, no. 9, ISSN 1990-9772
Proceedings
Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)
Conference
Interspeech 2010, Tokyo, JP
ISBN
978-1-61782-123-3
Publisher
International Speech Communication Association
Place
Makuhari, Chiba, JP
BibTeX
@INPROCEEDINGS{FITPUB9358,
   author = "Mirko Hannemann and Stefan Kombrink and Martin Karafi\'{a}t and Luk\'{a}\v{s} Burget",
   title = "Similarity Scoring for Recognizing Repeated Out-of-Vocabulary Words",
   pages = "897--900",
   booktitle = "Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)",
   journal = "Proceedings of Interspeech",
   volume = 2010,
   number = 9,
   year = 2010,
   location = "Makuhari, Chiba, JP",
   publisher = "International Speech Communication Association",
   ISBN = "978-1-61782-123-3",
   ISSN = "1990-9772",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/9358"
}