Thesis Details

Odezírání ze rtů pomocí hlubokých neuronových sítí

Bachelor's Thesis
Student: Kadleček Josef
Academic Year: 2018/2019
Supervisor: Hradiš Michal, Ing., Ph.D.

English title

Convolutional Networks for Lip Reading

Language

Czech

Abstract

This thesis surveys current methods for automatic speech recognition and lip reading with neural networks. It also examines the similarities between neural network architectures for audio and visual data, and the datasets available for audiovisual automatic speech recognition. The main contribution is a set of experiments comparing how different changes to the network architecture affect the results. The thesis includes an implementation of a system for automatic speech recognition from audio data (CER: 12.6 %) and from visual data (CER: 57.7 %). Both architectures are based on feature extraction by convolutional networks, followed by recurrent LSTM layers, a further convolutional layer, and the CTC loss function.
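
The abstract reports results as CER, which in speech recognition conventionally stands for character error rate. The abstract does not say how the metric was computed, so the following is only a minimal sketch under the usual assumption: the Levenshtein (edit) distance between the reference and the recognized text, divided by the reference length. The function names `edit_distance` and `cer` are illustrative and are not taken from the thesis.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # skip a character of ref
                curr[j - 1] + 1,     # skip a character of hyp
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate as a fraction of the reference length
    (assumed definition; multiply by 100 for a percentage)."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Under this definition, a CER of 12.6 % would correspond to roughly one character edit for every eight reference characters, and the value can exceed 100 % when the hypothesis is much longer than the reference.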

Keywords

Lip reading, speech recognition, neural networks, recurrent neural network, convolution, computer vision, sequence to sequence, Encoder-Decoder, CTC, PyTorch, Python.

Department

Department of Computer Graphics and Multimedia FIT BUT

Degree Programme

Information Technology

Status

defended, grade B

Date

11 June 2019

Reviewer

Kišš Martin, Ing.

Committee

Herout Adam, prof. Ing., Ph.D. (DCGM FIT BUT), chair
Bidlo Michal, doc. Ing., Ph.D. (DCSY FIT BUT), member
Čadík Martin, doc. Ing., Ph.D. (DCGM FIT BUT), member
Křivka Zbyněk, Ing., Ph.D. (DIFS FIT BUT), member
Rogalewicz Adam, doc. Mgr., Ph.D. (DITS FIT BUT), member

Citation

KADLEČEK, Josef. Odezírání ze rtů pomocí hlubokých neuronových sítí. Brno, 2019. Bachelor's Thesis. Brno University of Technology, Faculty of Information Technology. 2019-06-11. Supervised by Hradiš Michal. Available from: https://www.fit.vut.cz/study/thesis/21772/

BibTeX

@bachelorsthesis{FITBT21772,
    author = "Josef Kadle\v{c}ek",
    type = "Bachelor's thesis",
    title = "Odez\'{i}r\'{a}n\'{i} ze rt\r{u} pomoc\'{i} hlubok\'{y}ch neuronov\'{y}ch s\'{i}t\'{i}",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2019,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/21772/"
}
