Publication Details

Automatic Language Identification using Phoneme and Automatically Derived Unit Strings

MATĚJKA Pavel, SZŐKE Igor, SCHWARZ Petr and ČERNOCKÝ Jan. Automatic Language Identification using Phoneme and Automatically Derived Unit Strings. Lecture Notes in Computer Science, vol. 2004, no. 3206, p. 8. ISSN 0302-9743.
Czech title
Automatická Identifikace Jazyka užitím Fonémů a Automaticky Odvozených Jednotek
Type
journal article
Language
English
Authors
Matějka Pavel, Ing. (UREL FEEC BUT)
Szőke Igor, Ing., Ph.D. (DCGM FIT BUT)
Schwarz Petr, Ing., Ph.D. (DCGM FIT BUT)
Černocký Jan, doc. Dr. Ing. (DCGM FIT BUT)
URL
Abstract

Phonemes and Automatically Derived Units in Automatic Language Identification

Annotation

This paper presents language identification (LID) based on phonotactic modeling. It compares approaches that use phoneme strings with approaches that use strings of units derived automatically by an Ergodic HMM (EHMM). The phoneme recognizers were trained on 6 languages from the OGI multi-language corpus and on Czech SpeechDat-E. The LID results are obtained on 4 languages. They show the superiority of the Czech phoneme recognizer when used for LID, and promising trends for the EHMM-derived units.
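
To illustrate what phonotactic modeling over unit strings can look like, here is a minimal sketch, not the authors' implementation. It assumes one bigram model per language with add-one smoothing; the annotation does not state the model order or the smoothing used in the paper. The function names (`train_bigram`, `log_likelihood`, `identify`) and the toy unit strings are hypothetical. In a real system the strings would come from a phoneme recognizer or from EHMM-derived units.

```python
import math
from collections import Counter


def train_bigram(strings):
    """Count histories and bigrams over training strings (lists of unit labels)."""
    histories, bigrams, vocab = Counter(), Counter(), set()
    for units in strings:
        padded = ["<s>"] + list(units)  # "<s>" marks the start of a string
        vocab.update(units)
        for prev, cur in zip(padded, padded[1:]):
            histories[prev] += 1
            bigrams[(prev, cur)] += 1
    return histories, bigrams, vocab


def log_likelihood(units, model, vocab_size):
    """Average add-one-smoothed bigram log-probability of a unit string."""
    histories, bigrams, _ = model
    padded = ["<s>"] + list(units)
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Add-one smoothing (an assumption of this sketch): unseen bigrams
        # get a small non-zero probability instead of zero.
        prob = (bigrams[(prev, cur)] + 1) / (histories[prev] + vocab_size)
        total += math.log(prob)
    return total / max(len(units), 1)


def identify(units, models):
    """Return the language whose bigram model scores the string highest."""
    vocab_size = len(set().union(*(m[2] for m in models.values())))
    return max(models, key=lambda lang: log_likelihood(units, models[lang], vocab_size))


# Toy data: made-up unit strings for two made-up languages.
training = {
    "lang_a": [["a", "b", "a", "b"], ["b", "a", "b", "a"]],
    "lang_b": [["a", "a", "b", "b"], ["b", "b", "a", "a"]],
}
models = {lang: train_bigram(strings) for lang, strings in training.items()}
guess = identify(["a", "b", "a", "b", "a"], models)
```

In this toy setup the alternating test string is assigned to `lang_a`, whose training strings also alternate. The same scoring applies unchanged whether the labels are phonemes or automatically derived units, since the models only see symbol sequences.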

Published
2004
Pages
8
Journal
Lecture Notes in Computer Science, vol. 2004, no. 3206, ISSN 0302-9743
Book
Lecture Notes in Computer Science
Publisher
Springer Verlag
BibTeX
@ARTICLE{FITPUB7642,
   author = "Pavel Mat\v{e}jka and Igor Sz\H{o}ke and Petr Schwarz and Jan \v{C}ernock\'{y}",
   title = "Automatic Language Identification using Phoneme and Automatically Derived Unit Strings",
   pages = 8,
   booktitle = "Lecture Notes in Computer Science",
   journal = "Lecture Notes in Computer Science",
   volume = 2004,
   number = 3206,
   year = 2004,
   ISSN = "0302-9743",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/7642"
}