
Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters

PENG, J.; STAFYLAKIS, T.; GU, R.; PLCHOT, O.; MOŠNER, L.; BURGET, L.; ČERNOCKÝ, J. Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Rhodes Island: IEEE Signal Processing Society, 2023. p. 1-5. ISBN: 978-1-7281-6327-7.
Type
conference paper
Language
English
Authors
Peng Junyi, DCGM (FIT)
Stafylakis Themos
Gu R.
Plchot Oldřich, Ing., Ph.D., DCGM (FIT)
Mošner Ladislav, Ing., DCGM (FIT)
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Abstract

Recently, pre-trained Transformer models have received rising interest in the field of speech processing thanks to their great success in various downstream tasks. However, most fine-tuning approaches update all the parameters of the pre-trained model, which becomes prohibitive as the model size grows and sometimes results in overfitting on small datasets. In this paper, we conduct a comprehensive analysis of applying parameter-efficient transfer learning (PETL) methods to reduce the number of learnable parameters required for adaptation to speaker verification tasks. Specifically, during fine-tuning the pre-trained models are frozen, and only lightweight modules inserted into each Transformer block are trainable (a method known as adapters). Moreover, to boost performance in a cross-language low-resource scenario, the Transformer model is further tuned on a large intermediate dataset before being fine-tuned on the small target dataset. While updating fewer than 4% of the parameters, our proposed PETL-based methods achieve performance comparable to full fine-tuning (Vox1-O: 0.55%, Vox1-E: 0.82%, Vox1-H: 1.73%).
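
The abstract outlines the adapter idea: the pre-trained Transformer is frozen, and only small bottleneck modules inserted into each block are trained. The following is a minimal PyTorch sketch of that setup; the names (Adapter, BlockWithAdapter), the bottleneck width of 64, and the toy stand-in block are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn


class Adapter(nn.Module):
    # Bottleneck module: down-project, nonlinearity, up-project, residual add.
    # Hypothetical sketch; dimensions are illustrative, not from the paper.
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual path preserves the frozen model's output at initialization.
        return x + self.up(self.act(self.down(x)))


class BlockWithAdapter(nn.Module):
    # Wraps one pre-trained Transformer block; only the adapter is trained.
    def __init__(self, block: nn.Module, hidden_dim: int):
        super().__init__()
        self.block = block
        self.adapter = Adapter(hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.block(x))


# Toy stand-in for one block of a pre-trained speech Transformer.
hidden_dim = 768
pretrained_block = nn.TransformerEncoderLayer(
    d_model=hidden_dim, nhead=12, batch_first=True)
wrapped = BlockWithAdapter(pretrained_block, hidden_dim)

# Freeze the pre-trained weights; the adapter parameters stay trainable.
for p in wrapped.block.parameters():
    p.requires_grad = False

trainable = sum(p.numel() for p in wrapped.parameters() if p.requires_grad)
total = sum(p.numel() for p in wrapped.parameters())
print(f"trainable share: {trainable / total:.1%}")  # well under 4%

x = torch.randn(2, 100, hidden_dim)  # (batch, frames, features)
print(wrapped(x).shape)              # torch.Size([2, 100, 768])

On a standard 768-dimensional encoder layer, such an adapter adds roughly 0.1M trainable parameters against about 7M frozen ones, consistent in spirit with the under-4% budget reported in the abstract.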

Keywords

Speaker verification, pre-trained model, adapter, fine-tuning, transfer learning

URL
https://ieeexplore.ieee.org/document/10094795
Published
2023
Pages
1–5
Proceedings
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Conference
2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ISBN
978-1-7281-6327-7
Publisher
IEEE Signal Processing Society
Place
Rhodes Island
DOI
10.1109/ICASSP49357.2023.10094795
BibTeX
@inproceedings{BUT185200,
  author="PENG, J. and STAFYLAKIS, T. and GU, R. and PLCHOT, O. and MOŠNER, L. and BURGET, L. and ČERNOCKÝ, J.",
  title="Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters",
  booktitle="ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
  year="2023",
  pages="1--5",
  publisher="IEEE Signal Processing Society",
  address="Rhodes Island",
  doi="10.1109/ICASSP49357.2023.10094795",
  isbn="978-1-7281-6327-7",
  url="https://ieeexplore.ieee.org/document/10094795"
}
Projects
Exchanges for SPEech ReseArch aNd TechnOlogies, EU, Horizon 2020, start: 2021-01-01, end: 2025-12-31, running
Multi-linguality in speech technologies, MŠMT, INTER-EXCELLENCE - INTER-ACTION subprogramme, LTAIN19087, start: 2020-01-01, end: 2023-08-31, completed
Neural Representations in multi-modal and multi-lingual modeling, GACR, EXPRO Grant Projects of Excellence in Basic Research - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Robust processing of recordings for operations and security, MV, STRATEGIC SUPPORT FOR THE DEVELOPMENT OF SECURITY RESEARCH OF THE CZECH REPUBLIC 2019-2025 (IMPAKT 1) programme, subprogramme 1 JOINT RESEARCH PROJECTS (BV IMP1/1VS), VJ01010108, start: 2020-10-01, end: 2025-09-30, completed