Thesis Details

Speech Enhancement with Cycle-Consistent Neural Networks

Master's Thesis
Student: Karlík Pavol
Academic Year: 2019/2020
Supervisor: Žmolíková Kateřina, Ing., Ph.D.

Czech title

Odstraňování šumu pomocí neuronových sítí s cyklickou konzistencí

Language

English

Abstract

Deep neural networks (DNNs) have become a standard approach to speech enhancement (SE). The training process of an SE network can be extended with a second neural network, which learns to insert noise into a clean speech signal. The two networks can then be chained to reconstruct both clean and noisy speech samples. This thesis focuses on this technique, called cycle-consistency. Cycle-consistency improves robustness without modifying the speech-enhancing network itself, because it exposes the SE network to a much larger variety of noisy data. However, this method requires input-target training data pairs, which are not always available. We therefore use generative adversarial networks (GANs) with a cycle-consistency constraint to train the network on unpaired data. We perform a large number of experiments using both paired and unpaired training data. Our results show that adding cycle-consistency improves the models' performance significantly.
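The cycle-consistency idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the L1 distance, and the toy stand-ins for the two networks are all assumptions chosen for illustration. The sketch only shows the two reconstruction cycles (clean to noisy to clean, and noisy to clean to noisy) and how their errors might be summed into one loss.

```python
# Toy illustration of a cycle-consistency loss.
# All names are hypothetical; the thesis's actual networks and loss are not shown here.
from typing import Callable, List

Signal = List[float]
Mapping = Callable[[Signal], Signal]


def l1_distance(a: Signal, b: Signal) -> float:
    """Mean absolute difference between two equal-length signals."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    enhance: Mapping, add_noise: Mapping, clean: Signal, noisy: Signal
) -> float:
    """Sum of the reconstruction errors of both cycles (L1 is an assumption)."""
    # Cycle 1: clean -> noisy -> clean should reproduce the clean input.
    clean_cycle = enhance(add_noise(clean))
    # Cycle 2: noisy -> clean -> noisy should reproduce the noisy input.
    noisy_cycle = add_noise(enhance(noisy))
    return l1_distance(clean_cycle, clean) + l1_distance(noisy_cycle, noisy)


# Stand-ins for the two neural networks: a constant offset plays the role of noise.
def toy_add_noise(signal: Signal) -> Signal:
    return [x + 0.5 for x in signal]


def toy_enhance(signal: Signal) -> Signal:
    return [x - 0.5 for x in signal]


clean = [0.0, 1.0, -1.0, 0.5]
noisy = [0.25, 1.5, -0.75, 1.0]  # deliberately not a noisy version of `clean`

# The two toy mappings are exact inverses, so both cycles reconstruct their input.
loss = cycle_consistency_loss(toy_enhance, toy_add_noise, clean, noisy)
```

Note that the loss never compares `clean` with `noisy` directly, which is why a constraint of this form can in principle be combined with unpaired data. In the unpaired setting described in the abstract, adversarial (GAN) terms would be added on top of it; they are omitted from this sketch.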

Keywords

speech enhancement, GAN, generative adversarial networks, deep learning, cycle-consistency

Department

Department of Computer Graphics and Multimedia FIT BUT

Degree Programme

Information Technology, Field of Study Information Systems

Status

defended, grade A

Date

17 July 2020

Reviewer

Černocký Jan, prof. Dr. Ing.

Committee

Rogalewicz Adam, doc. Mgr., Ph.D. (DITS FIT BUT), chair
Bartík Vladimír, Ing., Ph.D. (DIFS FIT BUT), member
Chudý Peter, doc. Ing., Ph.D. MBA (DCGM FIT BUT), member
Peringer Petr, Dr. Ing. (DITS FIT BUT), member
Rychlý Marek, RNDr., Ph.D. (DIFS FIT BUT), member
Veselý Vladimír, Ing., Ph.D. (DIFS FIT BUT), member

Citation

KARLÍK, Pavol. Speech Enhancement with Cycle-Consistent Neural Networks. Brno, 2020. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2020-07-17. Supervised by Žmolíková Kateřina. Available from: https://www.fit.vut.cz/study/thesis/23134/

BibTeX

@mastersthesis{FITMT23134,
    author = "Pavol Karl\'{i}k",
    type = "Master's thesis",
    title = "Speech Enhancement with Cycle-Consistent Neural Networks",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2020,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/23134/"
}
