Publication Details
Source Separation for Sound Event Detection in domestic environments using jointly trained models
de Benito-Gorrón Diego (UAM)
Žmolíková Kateřina, Ing., Ph.D. (DCGM FIT BUT)
Torre Toledano Doroteo (UAM)
Sound Event Detection, Source Separation, DCASE, DESED
Sound Event Detection and Source Separation are closely related tasks: whereas the former aims to find the time boundaries of acoustic events within a recording, the latter aims to isolate each acoustic source into a separate signal. This paper presents a Sound Event Detection system composed of two independently pre-trained blocks, one for Source Separation and one for Sound Event Detection. We propose a joint-training scheme, in which both blocks are trained simultaneously, and a two-stage training scheme, in which each block is trained while the other is frozen. In addition, we compare supervised and unsupervised pre-training for the Separation block, as well as two model selection strategies for Sound Event Detection. Our experiments show that the proposed methods outperform the baseline systems of the DCASE 2021 Challenge Task 4.
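As an illustration of the training schemes described in the abstract, below is a minimal PyTorch-style sketch, not taken from the paper: SeparationBlock, SEDBlock, the loss term, and the tensor shapes are hypothetical placeholders, and only the difference between a joint update (both blocks receive gradients) and a two-stage update (one block frozen) is meant to be indicative.

    # Illustrative sketch only (not from the paper). SeparationBlock, SEDBlock,
    # the loss, and the shapes are hypothetical placeholders.
    import torch
    import torch.nn as nn

    class SeparationBlock(nn.Module):          # placeholder separation model
        def __init__(self, n_sources=4):
            super().__init__()
            self.net = nn.Conv1d(1, n_sources, kernel_size=1)
        def forward(self, mixture):            # (batch, 1, time) -> (batch, sources, time)
            return self.net(mixture)

    class SEDBlock(nn.Module):                 # placeholder sound event detector
        def __init__(self, n_sources=4, n_classes=10):
            super().__init__()
            self.net = nn.Conv1d(n_sources, n_classes, kernel_size=1)
        def forward(self, sources):            # (batch, sources, time) -> (batch, classes, time)
            return torch.sigmoid(self.net(sources))

    separator, detector = SeparationBlock(), SEDBlock()
    sed_loss = nn.BCELoss()
    optimizer = torch.optim.Adam(list(separator.parameters()) + list(detector.parameters()))

    def train_step(mixture, targets, freeze_separator=False):
        """One update step. freeze_separator=True mimics a two-stage scheme where
        only the SED block is trained; otherwise both blocks are trained jointly."""
        separator.requires_grad_(not freeze_separator)
        optimizer.zero_grad()
        est_sources = separator(mixture)       # separated sources feed the detector
        posteriors = detector(est_sources)     # frame-level event posteriors
        loss = sed_loss(posteriors, targets)
        loss.backward()
        optimizer.step()                       # frozen parameters have no gradient and are skipped
        return loss.item()

In this sketch, freezing a block via requires_grad_ is one common way to realize "each block is trained while the other is frozen"; in the joint scheme both blocks are updated from the same Sound Event Detection loss. The paper's actual architectures, losses, and training procedure may differ.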
@INPROCEEDINGS{FITPUB12857,
   author = "Diego de Benito-Gorr\'{o}n and Kate\v{r}ina \v{Z}mol\'{i}kov\'{a} and Doroteo Torre Toledano",
   title = "Source Separation for Sound Event Detection in domestic environments using jointly trained models",
   pages = "1--5",
   booktitle = "Proceedings of The 17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)",
   year = 2022,
   location = "Bamberg, DE",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-6654-6867-1",
   doi = "10.1109/IWAENC53105.2022.9914755",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/12857"
}