Result detail

Speech and Language Recognition with Low-rank Adaptation of Pretrained Models

PRASAD, A.; MADIKERI, S.; KHALIL, D.; MOTLÍČEK, P.; SCHUEPBACH, C. Speech and Language Recognition with Low-rank Adaptation of Pretrained Models. In Proceedings of Interspeech. Kos Island: International Speech Communication Association, 2024. no. 9, p. 2825-2829. ISSN: 1990-9772.
Type
conference proceedings paper
Language
English
Authors
Prasad Amrutha
Madikeri Srikanth, FIT (FIT)
KHALIL, D.
Motlíček Petr, doc. Ing., Ph.D., UPGM (FIT)
SCHUEPBACH, C.
Abstract

Finetuning large pretrained models demands considerable computational resources, which poses practical constraints. The majority of the parameters in these models are used by fully connected layers. In this work, we show that applying a semi-orthogonal constraint to the fully connected layers, followed by full finetuning, reduces model parameters significantly without sacrificing efficacy in downstream tasks. Specifically, we consider wav2vec2.0 XLS-R and Whisper models for Automatic Speech Recognition (ASR) and Language Recognition. Our results show that we can reduce the model size by approximately 24% during both training and inference time, with a 0.7% absolute drop in performance for XLS-R and no drop in performance for Whisper for ASR. In combination with performance-efficient training with low-rank adapters, the resource requirements for training can be further reduced by up to 90%.
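
The parameter saving described in the abstract can be illustrated with a minimal sketch. It assumes (this is not stated in the record) that the semi-orthogonal constraint is applied to a low-rank factorization of each fully connected layer, where a d_out x d_in weight matrix is replaced by a (d_out x rank) factor times a (rank x d_in) factor whose rows are orthonormal. The layer sizes, the rank, and all function names below are hypothetical and chosen only for illustration; they are not the paper's configuration.

```python
def dense_params(d_in, d_out):
    """Weight count of a full d_out x d_in layer (bias ignored)."""
    return d_in * d_out


def factored_params(d_in, d_out, rank):
    """Weight count of two stacked factors: (rank x d_in) and (d_out x rank)."""
    return rank * (d_in + d_out)


def is_semi_orthogonal(m, tol=1e-9):
    """True if the rows of m are orthonormal, i.e. m times its transpose is the identity."""
    for i, row_i in enumerate(m):
        for j, row_j in enumerate(m):
            dot = sum(a * b for a, b in zip(row_i, row_j))
            target = 1.0 if i == j else 0.0
            if abs(dot - target) > tol:
                return False
    return True


# Hypothetical layer sizes and rank (assumptions, not taken from the paper).
d_in, d_out, rank = 1024, 4096, 512

full = dense_params(d_in, d_out)
low = factored_params(d_in, d_out, rank)
saving = 1 - low / full  # fraction of this layer's weights removed

# A 2 x 3 matrix with orthonormal rows satisfies the constraint;
# one with a repeated row does not.
ok = is_semi_orthogonal([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
bad = is_semi_orthogonal([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
```

The factorization saves weights only when rank is smaller than d_in * d_out / (d_in + d_out); the overall model reduction also depends on how much of the model consists of fully connected layers.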

Keywords

parameter reduction, language identification, speech recognition, wav2vec2.0

URL
https://www.isca-archive.org/interspeech_2024/prasad24_interspeech.html
Year
2024
Pages
2825–2829
Journal
Proceedings of Interspeech, vol. 2024, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech
Conference
Interspeech Conference
Publisher
International Speech Communication Association
Place
Kos Island
DOI
10.21437/Interspeech.2024-2187
BibTeX
@inproceedings{BUT193370,
  author="PRASAD, A. and MADIKERI, S. and KHALIL, D. and MOTLÍČEK, P. and SCHUEPBACH, C.",
  title="Speech and Language Recognition with Low-rank Adaptation of Pretrained Models",
  booktitle="Proceedings of Interspeech",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="2825--2829",
  publisher="International Speech Communication Association",
  address="Kos Island",
  doi="10.21437/Interspeech.2024-2187",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/prasad24_interspeech.html"
}
Projects
Contemporary methods for processing, analysis, and visualization of multimedia and 3D data, VUT, VUT internal projects, FIT-S-23-8278, start: 2023-03-01, end: 2026-02-28, in progress