Thesis Details

Odezírání ze rtů pomocí hlubokých neuronových sítí

Bachelor's Thesis
Student
Kadleček Josef
Academic Year
2018/2019
Supervisor
Hradiš Michal, Ing., Ph.D.
English title
Convolutional Networks for Lip Reading
Language
Czech
Abstract

This thesis surveys current methods for automatic speech recognition and lip reading with neural networks. It also examines the similarities between neural network architectures for audio and visual data, and the datasets available for audiovisual automatic speech recognition. The main contribution is a set of experiments comparing changes to the neural network architecture and their impact on the results. The thesis includes an implementation of a system for automatic speech recognition from audio (character error rate, CER: 12.6 %) and visual (CER: 57.7 %) data. Both systems are based on feature extraction by convolutional networks, followed by recurrent LSTM layers, a further convolutional layer, and the CTC loss function.
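
The CER values in the abstract are character error rates. This page does not say how the thesis computed them, so the following is only a minimal sketch of the usual definition: the character-level edit distance between the recognized text and the reference text, divided by the number of reference characters. The function names edit_distance and cer are illustrative assumptions and are not taken from the thesis.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn hyp into ref."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # character of ref missing from hyp
                curr[j - 1] + 1,     # extra character in hyp
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level character error rate (assumed definition):
    total edit distance divided by total reference length."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    return total_edits / total_chars
```

Under this assumed definition, a CER of 12.6 % corresponds to roughly one character error for every eight reference characters.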

Keywords

Lip reading, speech recognition, neural networks, recurrent neural network, convolution, computer vision, sequence to sequence, Encoder-Decoder, CTC, PyTorch, Python.

Department
Degree Programme
Information Technology
Status
defended, grade B
Date
11 June 2019
Reviewer
Committee
Herout Adam, prof. Ing., Ph.D. (DCGM FIT BUT), chair
Bidlo Michal, doc. Ing., Ph.D. (DCSY FIT BUT), member
Čadík Martin, doc. Ing., Ph.D. (DCGM FIT BUT), member
Křivka Zbyněk, Ing., Ph.D. (DIFS FIT BUT), member
Rogalewicz Adam, doc. Mgr., Ph.D. (DITS FIT BUT), member
Citation
KADLEČEK, Josef. Odezírání ze rtů pomocí hlubokých neuronových sítí. Brno, 2019. Bachelor's Thesis. Brno University of Technology, Faculty of Information Technology. 2019-06-11. Supervised by Hradiš Michal. Available from: https://www.fit.vut.cz/study/thesis/21772/
BibTeX
@bachelorsthesis{FITBT21772,
    author = "Josef Kadle\v{c}ek",
    type = "Bachelor's thesis",
    title = "Odez\'{i}r\'{a}n\'{i} ze rt\r{u} pomoc\'{i} hlubok\'{y}ch neuronov\'{y}ch s\'{i}t\'{i}",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2019,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/21772/"
}