Thesis Details

Nízko-dimenzionální faktorizace pro "End-To-End" řečové systémy

Master's Thesis
Student: Gajdár Matúš
Academic Year: 2019/2020
Supervisor: Karafiát Martin, Ing., Ph.D.
Language
Slovak
Abstract

This thesis addresses automatic speech recognition, focusing on training neural networks with low-dimensional matrix factorization. We describe time delay neural networks with factorization (TDNN-F) and without it (TDNN), implemented in the PyTorch framework. We compare the PyTorch implementation with the one in the Kaldi toolkit and obtain similar results in experiments with various network architectures. The last chapter describes the impact of low-dimensional matrix factorization on End-to-End speech recognition systems, as well as a modification of the system that uses TDNN(-F) networks. With specific network settings, the systems using factorization achieved better results. In addition, using TDNN(-F) networks reduced the number of network parameters and thus the complexity of training.
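
The parameter reduction mentioned in the abstract can be illustrated with a small arithmetic sketch. It assumes a single linear layer whose weight matrix of shape d_out x d_in is replaced by the product of two smaller matrices joined through a bottleneck of size k. The function names and layer sizes below are hypothetical and are not taken from the thesis; the sketch also ignores time-context splicing and the semi-orthogonal constraint that TDNN-F layers place on one of the factors.

```python
def dense_params(d_in, d_out):
    """Number of weights in an unfactorized linear layer (bias ignored)."""
    return d_in * d_out


def factorized_params(d_in, d_out, bottleneck):
    """Number of weights when W (d_out x d_in) is replaced by A @ B,
    with A of shape (d_out x bottleneck) and B of shape (bottleneck x d_in)."""
    return d_out * bottleneck + bottleneck * d_in


# Hypothetical sizes, chosen only for illustration (not from the thesis).
d_in, d_out, k = 1536, 1536, 160

full = dense_params(d_in, d_out)
low_rank = factorized_params(d_in, d_out, k)

# The factorized layer is smaller whenever k < d_in * d_out / (d_in + d_out).
print(f"dense: {full}, factorized: {low_rank}, ratio: {low_rank / full:.3f}")
```

With these illustrative sizes the factorized layer keeps roughly a fifth of the weights of the dense layer; the actual savings depend on the bottleneck size chosen for each layer.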

Keywords

Automatic speech recognition, convolutional neural networks, TDNN, low-dimensional matrix factorization, E2E, TDNN-F, PyTorch, Kaldi, ESPnet

Department
Degree Programme
Information Technology, Field of Study Computer Graphics and Multimedia
Status
defended, grade B
Date
15 July 2020
Reviewer
Committee
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT), chair
Bařina David, Ing., Ph.D. (DCGM FIT BUT), member
Beran Vítězslav, doc. Ing., Ph.D. (DCGM FIT BUT), member
Grézl František, Ing., Ph.D. (DCGM FIT BUT), member
Herout Adam, prof. Ing., Ph.D. (DCGM FIT BUT), member
Křivka Zbyněk, Ing., Ph.D. (DIFS FIT BUT), member
Citation
GAJDÁR, Matúš. Nízko-dimenzionální faktorizace pro "End-To-End" řečové systémy. Brno, 2020. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2020-07-15. Supervised by Karafiát Martin. Available from: https://www.fit.vut.cz/study/thesis/23195/
BibTeX
@mastersthesis{FITMT23195,
    author = "Mat\'{u}\v{s} Gajd\'{a}r",
    type = "Master's thesis",
    title = "N\'{i}zko-dimenzion\'{a}ln\'{i} faktorizace pro {"}End-To-End{"} \v{r}e\v{c}ov\'{e} syst\'{e}my",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2020,
    location = "Brno, CZ",
    language = "slovak",
    url = "https://www.fit.vut.cz/study/thesis/23195/"
}