Thesis Details

Automatic Speech Recognition System Continually Improving Based on Subtitled Speech Data

Master's Thesis
Student: Kocour Martin
Academic Year: 2018/2019
Supervisor: Černocký Jan, prof. Dr. Ing.

Czech title

Language

English

Abstract

Today's large vocabulary speech recognition systems are very accurate. However, tens or hundreds of hours of manually transcribed speech are needed to train such a system. Data of this kind are often unavailable, or do not exist at all for the desired language. A possible solution is to use commonly available but lower-quality audiovisual data. This thesis addresses methods of processing such data for semi-supervised training of acoustic models. It then demonstrates how already trained acoustic models can be continually improved using these practically unlimited data. The work also proposes a novel approach to selecting data based on similarity to the target domain.

Keywords

Large vocabulary continuous speech recognition, semi-supervised training, time delay neural network, subtitled speech data, acoustic modelling

Department

Department of Computer Graphics and Multimedia FIT BUT

Degree Programme

Information Technology, Field of Study Intelligent Systems

Status

defended, grade A

Date

19 June 2019

Reviewer

Veselý Karel, Ing., Ph.D.

Committee

Zbořil František V., doc. Ing., CSc. (DITS FIT BUT), chair
Beran Vítězslav, doc. Ing., Ph.D. (DCGM FIT BUT), member
Horák Aleš, doc. RNDr., Ph.D. (FI MUNI), member
Hrubý Martin, Ing., Ph.D. (DITS FIT BUT), member
Janoušek Vladimír, doc. Ing., Ph.D. (DITS FIT BUT), member
Rozman Jaroslav, Ing., Ph.D. (DITS FIT BUT), member

Citation

KOCOUR, Martin. Automatic Speech Recognition System Continually Improving Based on Subtitled Speech Data. Brno, 2019. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2019-06-19. Supervised by Černocký Jan. Available from: https://www.fit.vut.cz/study/thesis/22041/

BibTeX

@mastersthesis{FITMT22041,
    author = "Martin Kocour",
    type = "Master's thesis",
    title = "Automatic Speech Recognition System Continually Improving Based on Subtitled Speech Data",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2019,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/22041/"
}
