Faculty of Information Technology, BUT

Publication Details

Language modeling of Czech using neural networks

MIKOLOV Tomáš. Language modeling of Czech using neural networks. In: Proc. 13th Conference STUDENT EEICT 2007. Brno: Faculty of Electrical Engineering and Communication BUT, 2007, pp. 1-3. ISBN 9788021434103.
Czech title
Jazykové modelování češtiny s využitím neuronových sítí
Type
conference paper
Language
English
Authors
Mikolov Tomáš, Ing. (DCGM FIT BUT)
Keywords
language modeling
Abstract
The work concentrates on language modeling of Czech using neural networks.
Annotation
Language models are used in many natural language processing systems, such as speech and handwriting recognition. The most widely used techniques are based on back-off n-grams. However, it is commonly believed that this approach is insufficient. One of the best improvements over back-off language models has been achieved with neural networks that project words onto a continuous space. This work compares a standard 4-gram language model with modified Kneser-Ney smoothing against a neural network language model, both trained on spoken corpora of 1M words. Significant improvements in perplexity are reported.
Published
2007
Pages
1-3
Proceedings
Proc. 13th Conference STUDENT EEICT 2007
Conference
Student EEICT 2007, Brno, CZ
ISBN
9788021434103
Publisher
Faculty of Electrical Engineering and Communication BUT
Place
Brno, CZ
BibTeX
@INPROCEEDINGS{FITPUB8476,
   author = "Tom\'{a}\v{s} Mikolov",
   title = "Language modeling of Czech using neural networks",
   pages = "1--3",
   booktitle = "Proc. 13th Conference STUDENT EEICT 2007",
   year = 2007,
   location = "Brno, CZ",
   publisher = "Faculty of Electrical Engineering and Communication BUT",
   ISBN = "9788021434103",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/8476"
}