Publication Details

Automatic Language Identification System

ČERNOCKÝ Jan, MATĚJKA Pavel, BURGET Lukáš and SCHWARZ Petr. Automatic Language Identification System. In: Sborník příspěvků z odborného semináře "Nové technologie v radiokomunikacích". Brno: University of Defence in Brno, 2006, pp. 1-6.
Type
conference paper
Language
English
Authors
Černocký Jan, doc. Dr. Ing. (DCGM FIT BUT)
Matějka Pavel, Ing. (UREL FEEC BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Schwarz Petr, Ing., Ph.D. (DCGM FIT BUT)
Keywords

speech processing, automatic language identification

Abstract

This paper presents the language identification (LID) system developed at Speech@FIT. The system consists of two parts. Acoustic LID determines the language directly on the basis of features derived from the speech signal; we have improved existing approaches by adding discriminative training of acoustic models. In phonotactic LID, speech is first transcribed by a phoneme recognizer into strings or graphs (lattices) of phonemes. On these, language models are trained to capture statistics of sequences of phonemes. We have pioneered the use of so-called "anti-models" for this task. All experimental results are reported on standard NIST 2003 data; comparison with other published results is favorable to our system.
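
The phonotactic idea described above, scoring a phoneme string against one language model per language, can be illustrated with a minimal sketch. It is not the system from the paper: it assumes the phoneme recognizer has already produced plain phoneme strings (no lattices), uses add-one-smoothed bigram models, and omits anti-models and discriminative training. All names (`train_bigram_model`, `score`, `identify`) and the toy phoneme data are hypothetical.

```python
import math
from collections import Counter


def train_bigram_model(phoneme_strings, vocab):
    """Return a function giving add-one-smoothed bigram log-probabilities.

    phoneme_strings: list of phoneme sequences for one language (assumed
    to come from a phoneme recognizer, which is not modeled here).
    """
    bigrams = Counter()
    contexts = Counter()
    for seq in phoneme_strings:
        padded = ["<s>"] + list(seq)  # "<s>" marks the start of an utterance
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    v = len(vocab)

    def logprob(prev, cur):
        # Add-one smoothing keeps unseen bigrams from getting zero probability.
        return math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + v))

    return logprob


def score(logprob, seq):
    """Log-likelihood of a phoneme sequence under one language's model."""
    padded = ["<s>"] + list(seq)
    return sum(logprob(p, c) for p, c in zip(padded, padded[1:]))


def identify(models, seq):
    """Pick the language whose model gives the sequence the highest score."""
    return max(models, key=lambda lang: score(models[lang], seq))


# Toy, made-up training data: two "languages" over a two-phoneme inventory.
vocab = {"a", "b"}
training = {
    "lang_a": [["a", "b", "a", "b"], ["a", "b", "a"]],
    "lang_b": [["b", "b", "a", "a"], ["b", "a", "a"]],
}
models = {lang: train_bigram_model(seqs, vocab) for lang, seqs in training.items()}

print(identify(models, ["a", "b", "a", "b"]))
print(identify(models, ["b", "b", "a", "a"]))
```

In this toy setup the decision is a plain maximum over per-language log-likelihoods; a real system would work with a much larger phoneme inventory and longer n-gram contexts, and would typically normalize or calibrate scores across languages.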

Published
2006
Pages
1-6
Proceedings
Sborník příspěvků z odborného semináře "Nové technologie v radiokomunikacích"
Conference
Odborný seminář "Nové technologie v radiokomunikacích", Brno, CZ
Publisher
University of Defence in Brno
Place
Brno, CZ
BibTeX
@INPROCEEDINGS{FITPUB8223,
   author = "Jan \v{C}ernock\'{y} and Pavel Mat\v{e}jka and Luk\'{a}\v{s} Burget and Petr Schwarz",
   title = "Automatic Language Identification System",
   pages = "1--6",
   booktitle = "Sborn\'{i}k p\v{r}\'{i}sp\v{e}vk\r{u} z odborn\'{e}ho semin\'{a}\v{r}e {"}Nov\'{e} technologie v radiokomunikac\'{i}ch{"}",
   year = 2006,
   location = "Brno, CZ",
   publisher = "University of Defence in Brno",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/8223"
}