SW3 ASR pro akusticky náročná prostředí

English title

SW3 ASR for demanding acoustic conditions

Type

software

License

The result is used by its owner

License Fee

The licensor does not require a license fee for the result

Authors

Šmíd Luboš, Ing., Ph.D.
Karafiát Martin, Ing., Ph.D. (DCGM)
Švec Jan, Ing., Ph.D.
Lehečka Jan
Mošner Ladislav, Ing. (DCGM)
Brukner Jan, Ing. (DCGM)

Keywords

ASR; speech recognition; docker

Description

A speech recognition (ASR) system for an Asian language, built with modern training approaches. The WAV2VEC model was trained on general recordings and then retrained on Vietnamese recordings, which were further extended by data augmentation simulating demanding acoustic conditions; this gave the system the required robustness. The result also includes a model for removing noise from recordings (deNoiser). The application runs in a Docker container and can be started from the command line on a standard Linux distribution or on Windows.

Location

To download, contact: https://www.fit.vut.cz/person/karafiat/ or http://www.kky.zcu.cz/en/people/smidl-lubos

Projects

Robust processing of recordings for operations and security, MV, PROGRAM STRATEGICKÁ PODPORA ROZVOJE BEZPEČNOSTNÍHO VÝZKUMU ČR 2019-2025 (IMPAKT 1) PODPROGRAMU 1 SPOLEČNÉ VÝZKUMNÉ PROJEKTY (BV IMP1/1VS), VJ01010108, start: 2020-10-01, end: 2025-09-30, running

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)