Thesis Details

Interpretability of Neural Networks in Speech Processing

Bachelor's Thesis
Student
Sarvaš Marek
Academic Year
2020/2021
Supervisor
Žmolíková Kateřina, Ing., Ph.D.
Czech title
Interpretace neuronových sítí ve zpracování řeči
Language
English
Abstract

With the growing popularity of deep neural networks, the lack of transparency caused by their black-box nature is increasing the demand for their interpretability. The goal of this thesis is to gain new insights into deep neural networks in speech processing tasks, specifically a gender classification task on the AudioMNIST dataset and a speaker classification task on filterbank features from the VoxCeleb dataset, using convolutional and residual neural networks. Layer-wise relevance propagation was used to interpret these networks. This method produced heatmaps highlighting the features that contributed positively and negatively to the correct classification. The interpretation results show that classifications were based mainly on lower frequencies in time. In the case of gender classification, I found that the model depended heavily on a small number of features. Using this information, I created an augmented training set that increased the model's robustness.

Keywords

deep neural networks, convolutional neural networks, speech processing, interpretation of neural networks, Layer-Wise Relevance Propagation

Department
Degree Programme
Information Technology
Status
defended, grade A
Date
16 June 2021
Reviewer
Committee
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT), chair
Češka Milan, doc. RNDr., Ph.D. (DITS FIT BUT), member
Jaroš Jiří, doc. Ing., Ph.D. (DCSY FIT BUT), member
Orság Filip, Ing., Ph.D. (DITS FIT BUT), member
Rychlý Marek, RNDr., Ph.D. (DIFS FIT BUT), member
Citation
SARVAŠ, Marek. Interpretability of Neural Networks in Speech Processing. Brno, 2021. Bachelor's Thesis. Brno University of Technology, Faculty of Information Technology. 2021-06-16. Supervised by Žmolíková Kateřina. Available from: https://www.fit.vut.cz/study/thesis/24073/
BibTeX
@bachelorsthesis{FITBT24073,
    author = "Marek Sarva\v{s}",
    type = "Bachelor's thesis",
    title = "Interpretability of Neural Networks in Speech Processing",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2021,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/24073/"
}