Result Details

Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer

MOŠNER, L.; PLCHOT, O.; BURGET, L.; ČERNOCKÝ, J. Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022. p. 7982-7986. ISBN: 978-1-6654-0540-9.
Type
conference paper
Language
English
Authors
Abstract

We focus on the problem of speaker recognition in far-field multichannel data. The main contribution is introducing an alternative way of predicting spatial covariance matrices (SCMs) for a beamformer from the time domain signal. We propose to use ConvTasNet, a well-known source separation model, and we adapt it to perform speech enhancement by forcing it to separate speech and additive noise. We experiment with using the STFT of Conv-TasNet outputs to obtain SCMs of speech and noise, and finally, we fine-tune this multi-channel frontend w.r.t. speaker verification objective. We successfully tackle the problem of the lack of a realistic multichannel training set by using simulated data of MultiSV corpus. The analysis is performed on its retransmitted and simulated test parts. We achieve consistent improvements with a 2.7 times smaller model than the baseline based on a scheme with mask estimating NN.

Keywords

Conv-TasNet, beamforming, embedding extractor, speaker verification, MultiSV

URL
Published
2022
Pages
7982–7986
Proceedings
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Conference
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
ISBN
978-1-6654-0540-9
Publisher
IEEE Signal Processing Society
Place
Singapore
DOI
UT WoS
000864187908058
EID Scopus
BibTeX
@inproceedings{BUT178381,
  author="Ladislav {Mošner} and Oldřich {Plchot} and Lukáš {Burget} and Jan {Černocký}",
  title="Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer",
  booktitle="ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
  year="2022",
  pages="7982--7986",
  publisher="IEEE Signal Processing Society",
  address="Singapore",
  doi="10.1109/ICASSP43922.2022.9747771",
  isbn="978-1-6654-0540-9",
  url="https://ieeexplore.ieee.org/document/9747771"
}
Files
Projects
Exchanges for SPEech ReseArch aNd TechnOlogies, EU, Horizon 2020, start: 2021-01-01, end: 2025-12-31, running
Multi-linguality in speech technologies, MŠMT, INTER-EXCELLENCE - Podprogram INTER-ACTION, LTAIN19087, start: 2020-01-01, end: 2023-08-31, completed
Neural Representations in multi-modal and multi-lingual modeling, GACR, Grantové projekty exelence v základním výzkumu EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Robust processing of recordings for operations and security, MV, PROGRAM STRATEGICKÁ PODPORA ROZVOJE BEZPEČNOSTNÍHO VÝZKUMU ČR 2019-2025 (IMPAKT 1) PODPROGRAMU 1 SPOLEČNÉ VÝZKUMNÉ PROJEKTY (BV IMP1/1VS), VJ01010108, start: 2020-10-01, end: 2025-09-30, completed
Research groups
Departments
Back to top