VISUAL FEATURES FOR MULTIMODAL SPEECH RECOGNITION

MOTLÍČEK, P.; BURGET, L.; ČERNOCKÝ, J. VISUAL FEATURES FOR MULTIMODAL SPEECH RECOGNITION. Radioelektronika 2005. Brno: Faculty of Electrical Engineering and Communication BUT, 2005. p. 187-190. ISBN: 80-214-2904-6.
Type
conference paper
Language
English
Abstract

This paper proposes a bimodal speech recognition scheme using visual parameters extracted from meeting recordings.

Keywords

speech recognition, feature extraction, parameterization, visual features, linear transforms, meeting data

Annotation

This paper demonstrates the use of visual parameters extracted from video for automatic recognition of phoneme strings. Encouraged by previous work that used "visually clean" data, we investigate how effective these parameters are under the non-ideal conditions of the meeting audio-visual data used in our experiments.

Published
2005
Pages
187–190
Proceedings
Radioelektronika 2005
Conference
15th International Czech-Slovak Scientific conference Radioelektronika 2005
ISBN
80-214-2904-6
Publisher
Faculty of Electrical Engineering and Communication BUT
Place
Brno
BibTeX
@inproceedings{BUT21499,
  author="Petr {Motlíček} and Lukáš {Burget} and Jan {Černocký}",
  title="VISUAL FEATURES FOR MULTIMODAL SPEECH RECOGNITION",
  booktitle="Radioelektronika 2005",
  year="2005",
  pages="187--190",
  publisher="Faculty of Electrical Engineering and Communication BUT",
  address="Brno",
  isbn="80-214-2904-6",
  url="https://www.fit.vut.cz/research/publication/7784/"
}
Projects
Augmented Multi-party Interaction, EU, Sixth Framework programme, 506811-AMI, start: 2004-01-01, end: 2006-12-31, completed
Data driven and anthropic coding and recognition of speech, GACR, Postdoctoral grants, GP102/02/D108, start: 2002-09-01, end: 2005-08-30, completed