Thesis Details

Speech Enhancement with Cycle-Consistent Neural Networks

Master's Thesis
Student: Karlík Pavol
Academic Year: 2019/2020
Supervisor: Žmolíková Kateřina, Ing., Ph.D.
Czech title
Odstraňování šumu pomocí neuronových sítí s cyklickou konzistencí
Language
English
Abstract

Deep neural networks (DNNs) have become a standard approach to speech enhancement (SE). The training process of a neural network can be extended by using a second neural network, which learns to insert noise into a clean speech signal. The two networks can be used in combination to reconstruct clean and noisy speech samples. This thesis focuses on utilizing this technique, called cycle-consistency. Cycle-consistency improves the robustness of the speech-enhancing network without modifying its architecture, as it exposes the SE network to a much larger variety of noisy data. However, this method requires input-target training data pairs, which are not always available. We use generative adversarial networks (GANs) with a cycle-consistency constraint to train the network using unpaired data. We perform a large number of experiments using both paired and unpaired training data. Our results show that adding cycle-consistency improves the models' performance significantly.
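
The cycle-consistency idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the L1 distance, the function names (`cycle_consistency_loss`, `toy_enhance`, `toy_add_noise`), and the toy stand-ins for the two networks are all assumptions made for illustration only.

```python
from typing import Callable, List

Signal = List[float]


def l1(a: Signal, b: Signal) -> float:
    """Mean absolute difference between two equal-length signals."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    enhance: Callable[[Signal], Signal],
    add_noise: Callable[[Signal], Signal],
    clean: Signal,
    noisy: Signal,
) -> float:
    """Sum of two reconstruction errors (L1 is an assumed choice):
    clean -> noisy -> clean and noisy -> clean -> noisy."""
    clean_cycle = enhance(add_noise(clean))  # should recover `clean`
    noisy_cycle = add_noise(enhance(noisy))  # should recover `noisy`
    return l1(clean_cycle, clean) + l1(noisy_cycle, noisy)


# Hypothetical stand-ins for the two neural networks: the "noise" is a
# constant offset, and the enhancer removes a constant offset.
def toy_add_noise(x: Signal) -> Signal:
    return [v + 0.5 for v in x]


def toy_enhance(x: Signal) -> Signal:
    return [v - 0.5 for v in x]


def weak_enhance(x: Signal) -> Signal:
    # Removes only half of the offset, so both cycles are imperfect.
    return [v - 0.25 for v in x]


clean_example = [0.0, 1.0]
noisy_example = [0.5, 1.5]
perfect = cycle_consistency_loss(toy_enhance, toy_add_noise, clean_example, noisy_example)
imperfect = cycle_consistency_loss(weak_enhance, toy_add_noise, clean_example, noisy_example)
```

Each cycle compares a signal only with its own reconstruction, so the clean and noisy examples in this sketch do not have to form a matched pair.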

Keywords

speech enhancement, GAN, generative adversarial networks, deep learning, cycle-consistency

Department
Degree Programme
Information Technology, Field of Study Information Systems
Status
defended, grade A
Date
17 July 2020
Reviewer
Committee
Rogalewicz Adam, doc. Mgr., Ph.D. (DITS FIT BUT), chair
Bartík Vladimír, Ing., Ph.D. (DIFS FIT BUT), member
Chudý Peter, doc. Ing., Ph.D. MBA (DCGM FIT BUT), member
Peringer Petr, Dr. Ing. (DITS FIT BUT), member
Rychlý Marek, RNDr., Ph.D. (DIFS FIT BUT), member
Veselý Vladimír, Ing., Ph.D. (DIFS FIT BUT), member
Citation
KARLÍK, Pavol. Speech Enhancement with Cycle-Consistent Neural Networks. Brno, 2020. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2020-07-17. Supervised by Žmolíková Kateřina. Available from: https://www.fit.vut.cz/study/thesis/23134/
BibTeX
@mastersthesis{FITMT23134,
    author = "Pavol Karl\'{i}k",
    type = "Master's thesis",
    title = "Speech Enhancement with Cycle-Consistent Neural Networks",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2020,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/23134/"
}