Publication Details

Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers

AL-HAMES Marc, HAIN Thomas, ČERNOCKÝ Jan, SCHREIBER Sascha, POEL Mannes, MÜLLER Ronald, MARCEL Sebastien, VAN Leeuwen David, ODOBEZ Jean-Marc, BA Sileye, BOURLARD Herve, CARDINAUX Fabien, GATICA-PEREZ Daniel, JANIN Adam, MOTLÍČEK Petr, REITER Stephan, RENALS Steve, VAN Rest Jeroen, RIENKS Rutger, RIGOLL Gerhard, SMITH Kevin, THEAN Andrew and ZEMČÍK Pavel. Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers. In: Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006). Washington D.C., 2006, p. 12.
Type
conference paper
Language
English
Authors
Al-Hames Marc (TUM)
Hain Thomas (USF)
Černocký Jan, doc. Dr. Ing. (DCGM FIT BUT)
Schreiber Sascha (TUM)
Poel Mannes (UTWENTE)
Müller Ronald (TUM)
Marcel Sebastien (IDIAP)
van Leeuwen David (TNO TPD)
Odobez Jean-Marc (IDIAP)
Ba Sileye (IDIAP)
Bourlard Herve (IDIAP)
Cardinaux Fabien (IDIAP)
Gatica-Perez Daniel (IDIAP)
Janin Adam (ICSI Berkeley)
Motlíček Petr, Ing., Ph.D. (DCGM FIT BUT)
Reiter Stephan (TUM)
Renals Steve (UEDIN)
van Rest Jeroen (TNO TPD)
Rienks Rutger (UTWENTE)
Rigoll Gerhard, Prof. Dr.-Ing. (TUM)
Smith Kevin (IDIAP)
Thean Andrew (TNO TPD)
Zemčík Pavel, prof. Dr. Ing. (DCGM FIT BUT)
URL
Keywords

speech processing, video processing, multi-modal interaction

Abstract

The paper addresses audio-visual processing in meetings: it poses seven questions and presents the AMI project's current answers to them.

Annotation

The project Augmented Multi-party Interaction (AMI) is concerned with the development of meeting browsers and remote meeting assistants for instrumented meeting rooms, and with the required component technologies. Its R&D themes are group dynamics; audio, visual, and multimodal processing; content abstraction; and human-computer interaction. The audio-visual processing workpackage within AMI addresses automatic recognition from audio, video, and combined audio-video streams recorded during meetings. In this article we describe the progress made in the first two years of the project. We show how the large problem of audio-visual processing in meetings can be split into seven questions, such as "Who is acting during the meeting?". We then show which algorithms and methods have been developed and evaluated to answer these questions automatically.

Published
2006
Pages
12
Proceedings
Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006)
Conference
3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, Washington, US
Place
Washington D.C., US
BibTeX
@INPROCEEDINGS{FITPUB8237,
   author = "Marc Al-Hames and Thomas Hain and Jan \v{C}ernock\'{y} and Sascha Schreiber and Mannes Poel and Ronald M{\"{u}}ller and Sebastien Marcel and David van Leeuwen and Jean-Marc Odobez and Sileye Ba and Herve Bourlard and Fabien Cardinaux and Daniel Gatica-Perez and Adam Janin and Petr Motl\'{i}\v{c}ek and Stephan Reiter and Steve Renals and Jeroen van Rest and Rutger Rienks and Gerhard Rigoll and Kevin Smith and Andrew Thean and Pavel Zem\v{c}\'{i}k",
   title = "Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers",
   pages = 12,
   booktitle = "Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006)",
   year = 2006,
   location = "Washington D.C., US",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/8237"
}