Thesis Details
STATISTICAL LANGUAGE MODELS BASED ON NEURAL NETWORKS
Statistical language models are a crucial part of many successful applications, such as automatic speech recognition and statistical machine translation (for example, the well-known Google Translate). Traditional techniques for estimating these models are based on N-gram counts. Despite the known weaknesses of N-grams and the huge efforts of research communities across many fields (speech recognition, machine translation, neuroscience, artificial intelligence, natural language processing, data compression, psychology, etc.), N-grams have remained basically the state of the art. The goal of this thesis is to present various architectures of language models that are based on artificial neural networks. Although these models are computationally more expensive than N-gram models, the presented techniques make it possible to apply them efficiently in state-of-the-art systems. The achieved reductions in word error rate of speech recognition systems are up to 20% against a state-of-the-art N-gram model. The presented recurrent neural network based model achieves the best published performance on the well-known Penn Treebank setup.
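As an illustration of the recurrent architecture the abstract refers to, below is a minimal sketch of an Elman-style recurrent neural network language model in Python/NumPy: a one-hot input word and the previous hidden state feed a sigmoid context layer, and a softmax over the vocabulary gives the next-word distribution. The toy corpus, hyperparameters, and single-step truncated backpropagation are illustrative assumptions, not the thesis's actual RNNLM toolkit or training setup.

import numpy as np

rng = np.random.default_rng(0)

# Toy corpus and vocabulary (illustrative, not from the thesis)
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
V = len(vocab)                      # vocabulary size
H = 16                              # hidden (context) layer size, arbitrary
idx = {w: i for i, w in enumerate(vocab)}

U = rng.normal(0, 0.1, (H, V))      # input -> hidden weights
W = rng.normal(0, 0.1, (H, H))      # recurrent (context) weights
Vo = rng.normal(0, 0.1, (V, H))     # hidden -> output weights
lr = 0.1                            # learning rate (assumed)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for epoch in range(50):
    s = np.zeros(H)                 # context layer, reset each epoch
    for t in range(len(corpus) - 1):
        w_t, w_next = idx[corpus[t]], idx[corpus[t + 1]]
        s_new = sigmoid(U[:, w_t] + W @ s)   # new hidden state from word + context
        y = softmax(Vo @ s_new)              # P(next word | history)
        # Cross-entropy gradient, backpropagated one step (truncated BPTT)
        dy = y.copy(); dy[w_next] -= 1.0
        ds = (Vo.T @ dy) * s_new * (1.0 - s_new)
        Vo -= lr * np.outer(dy, s_new)
        U[:, w_t] -= lr * ds
        W -= lr * np.outer(ds, s)            # recurrent gradient uses s(t-1)
        s = s_new

# Score the training corpus: report perplexity of the learned model
log_prob = 0.0
s = np.zeros(H)
for t in range(len(corpus) - 1):
    s = sigmoid(U[:, idx[corpus[t]]] + W @ s)
    log_prob += np.log(softmax(Vo @ s)[idx[corpus[t + 1]]])
print("perplexity:", np.exp(-log_prob / (len(corpus) - 1)))

In practice the thesis's models are trained on large corpora with backpropagation through time unrolled over more steps; this sketch only shows the shape of the computation.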
Keywords: language model, neural network, recurrent, maximum entropy, speech recognition, data compression, artificial intelligence
@phdthesis{FITPT283,
  author   = "Tom\'{a}\v{s} Mikolov",
  type     = "Ph.D. thesis",
  title    = "STATISTICAL LANGUAGE MODELS BASED ON NEURAL NETWORKS",
  school   = "Brno University of Technology, Faculty of Information Technology",
  year     = 2012,
  location = "Brno, CZ",
  language = "english",
  url      = "https://www.fit.vut.cz/study/phd-thesis/283/"
}