Thesis Details

Odhad obličeje z řečového signálu

Master's Thesis
Student: Zubalík Petr
Academic Year: 2021/2022
Supervisor: Plchot Oldřich, Ing., Ph.D.
English title
Learning the Face Behind a Voice
Language
Czech
Abstract

The main goal of this thesis is to design and implement a system that generates a face image from a given person's speech. The system is composed of three convolutional neural network models. The first is based on the ResNet architecture and extracts features from speech recordings. The second is a fully convolutional neural network that converts the extracted features into the styles that form the basis of the final facial image. These styles are then passed as input to the StyleGAN generator, which creates the resulting face. The proposed system is implemented in Python using the PyTorch framework. The last chapter of the thesis discusses some of the most significant experiments performed to fine-tune and test the developed system.

Keywords

convolutional neural networks, ResNet, GAN, speech processing, artificial intelligence, generative adversarial networks, image processing, Python, PyTorch, face estimation, StyleGAN

Status
defended, grade B
Date
21 June 2022
Committee
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT), chair
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT), member
Češka Milan, doc. RNDr., Ph.D. (DITS FIT BUT), member
Hradiš Michal, Ing., Ph.D. (DCGM FIT BUT), member
Rozman Jaroslav, Ing., Ph.D. (DITS FIT BUT), member
Zbořil František V., doc. Ing., CSc. (DITS FIT BUT), member
Citation
ZUBALÍK, Petr. Odhad obličeje z řečového signálu. Brno, 2022. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2022-06-21. Supervised by Plchot Oldřich. Available from: https://www.fit.vut.cz/study/thesis/24862/
BibTeX
@mastersthesis{FITMT24862,
    author = "Petr Zubal\'{i}k",
    type = "Master's thesis",
    title = "Odhad obli\v{c}eje z \v{r}e\v{c}ov\'{e}ho sign\'{a}lu",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2022,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/24862/"
}