Multi-linguality in Speech Technologies
Project Period: 1. 1. 2020 - 31. 8. 2023
Project Type: grant
Program: INTER-EXCELLENCE - Subprogram INTER-ACTION
Keywords: multi-linguality, speech recognition, machine learning, data, transfer learning
Speech data mining technologies and speech-based human-machine interfaces have advanced significantly in the past decade, and numerous applications have been successfully commercialized. However, they usually work correctly only in favorable scenarios: in languages with an abundance of training data and in relatively clean environments, such as an office or apartment. In fast-developing large markets such as India, severe problems make the exploitation of speech difficult: a multitude of languages (some of them with limited or missing resources), highly noisy conditions (much business is simply done on the streets in Indian cities), and highly variable numbers of speakers in a conversation (from the usual two to whole families). These factors complicate the development of automatic speech recognition (ASR), speaker recognition (SR) and speaker diarization (SD, determining who spoke when). In the proposed project, two established research institutes with a significant track record in multi-lingual ASR, robust SR and SD, Brno University of Technology (BUT) and IIT Madras (IIT-M), have teamed up with an important player on the Indian and global personal electronics markets, Samsung R&D Institute India-Bangalore (SRI-B), and propose significant advances in several speech technologies, notably in multi-lingual low-resource ASR. BUT and IIT-M will provide top speech research (based, among others, on the U.S. IARPA Babel and Material programs, victory in the IARPA ASpIRE evaluation and in the Interspeech 2018 Low Resource Speech Recognition Challenge for Indian Languages, and on the Indian MANDI project), while SRI-B will provide data and industrial guidelines and will produce demonstrators of the technologies.