Result detail

Multi-Channel Speech Separation with Cross-Attention and Beamforming

MOŠNER, L.; PLCHOT, O.; PENG, J.; BURGET, L.; ČERNOCKÝ, J. Multi-Channel Speech Separation with Cross-Attention and Beamforming. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Proceedings of Interspeech. Dublin: International Speech Communication Association, 2023. no. 08, p. 1693-1697. ISSN: 1990-9772.
Type
conference proceedings paper
Language
English
Authors
Ladislav Mošner, Oldřich Plchot, Junyi Peng, Lukáš Burget, Jan Černocký
Abstract

Single-channel source separation originally attracted more research interest, which resulted in immense progress. Multi-channel (MC) separation comes with new challenges posed by adverse indoor conditions, making it an important field of study. We seek to combine promising ideas from the two worlds. First, we build MC models by extending current single-channel time-domain separators, relying on their strength. Our approach allows reusing pre-trained models by inserting a purpose-designed lightweight reference channel attention (RCA) combiner, the only trained module. It comprises two blocks: the former allows attending to different parts of other channels w.r.t. the reference one, and the latter provides an attention-based combination of channels. Second, like many successful MC models, our system incorporates beamforming and allows for the fusion of the network and beamformer outputs. We compare our approach with the SOTA models on the SMS-WSJ dataset and show better or similar performance.
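
The two blocks of the RCA combiner described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the layer design, so the function names (`cross_channel_attention`, `attention_combine`), the (channels, frames, features) layout, the use of plain scaled dot-product attention, and the absence of learned projections are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_channel_attention(feats, ref=0):
    """Hypothetical first block: frames of the reference channel act as
    queries that attend over the frames of every channel.

    feats: array of shape (C, T, D) = (channels, frames, features).
    Returns an array of shape (C, T, D), one aligned view per channel.
    """
    _, _, d = feats.shape
    queries = feats[ref]                                        # (T, D)
    scores = np.einsum("td,csd->cts", queries, feats) / np.sqrt(d)
    weights = softmax(scores, axis=-1)                          # over frames s
    return np.einsum("cts,csd->ctd", weights, feats)


def attention_combine(aligned, ref=0):
    """Hypothetical second block: per-frame attention over channels,
    used to merge the aligned channels into a single stream.

    Returns (combined, weights) with shapes (T, D) and (T, C).
    """
    _, _, d = aligned.shape
    queries = aligned[ref]                                      # (T, D)
    scores = np.einsum("td,ctd->tc", queries, aligned) / np.sqrt(d)
    weights = softmax(scores, axis=-1)                          # over channels c
    combined = np.einsum("tc,ctd->td", weights, aligned)
    return combined, weights


# Toy input: 6 channels, 50 frames, 16-dimensional features (arbitrary sizes).
rng = np.random.default_rng(0)
feats = rng.standard_normal((6, 50, 16))
aligned = cross_channel_attention(feats, ref=0)
combined, weights = attention_combine(aligned, ref=0)
```

In a trained model the queries, keys, and values would normally pass through learned linear projections; they are omitted here to keep the sketch short. The beamforming stage and the fusion of network and beamformer outputs mentioned in the abstract are not covered by this sketch.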

Keywords

multi-channel source separation, cross-channel attention, beamforming

URL
https://www.isca-speech.org/archive/interspeech_2023/mosner23_interspeech.html
Year
2023
Pages
1693–1697
Journal
Proceedings of Interspeech, vol. 2023, no. 08, ISSN 1990-9772
Proceedings
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Conference
Interspeech Conference
Publisher
International Speech Communication Association
Place
Dublin
DOI
10.21437/Interspeech.2023-2537
BibTeX
@inproceedings{BUT185571,
  author="Ladislav {Mošner} and Oldřich {Plchot} and Junyi {Peng} and Lukáš {Burget} and Jan {Černocký}",
  title="Multi-Channel Speech Separation with Cross-Attention and Beamforming",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year="2023",
  journal="Proceedings of Interspeech",
  volume="2023",
  number="08",
  pages="1693--1697",
  publisher="International Speech Communication Association",
  address="Dublin",
  doi="10.21437/Interspeech.2023-2537",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/interspeech_2023/mosner23_interspeech.html"
}
Projects
Multi-linguality in speech technologies, MŠMT, INTER-EXCELLENCE - Subprogramme INTER-ACTION, LTAIN19087, start: 2020-01-01, end: 2023-08-31, completed
Neural representations in multi-modal and multi-lingual modeling, GAČR, Grant projects of excellence in basic research EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Robust processing of recordings for operations and security, MV, Programme Strategic Support for the Development of Security Research in the Czech Republic 2019-2025 (IMPAKT 1), Subprogramme 1 Joint Research Projects (BV IMP1/1VS), VJ01010108, start: 2020-10-01, end: 2025-09-30, completed
Exchanges for speech research and technologies, EU, Horizon 2020, start: 2021-01-01, end: 2025-12-31, in progress