Publication Details

Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers

AL-HAMES Marc, HAIN Thomas, ČERNOCKÝ Jan, SCHREIBER Sascha, POEL Mannes, MÜLLER Ronald, MARCEL Sebastien, VAN Leeuwen David, ODOBEZ Jean-Marc, BA Sileye, BOURLARD Herve, CARDINAUX Fabien, GATICA-PEREZ Daniel, JANIN Adam, MOTLÍČEK Petr, REITER Stephan, RENALS Steve, VAN Rest Jeroen, RIENKS Rutger, RIGOLL Gerhard, SMITH Kevin, THEAN Andrew and ZEMČÍK Pavel. Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers. In: Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006). Washington D.C., 2006, p. 12.
Type
conference paper
Language
English
Authors
Al-Hames Marc (TUM)
Hain Thomas (USF)
Černocký Jan, doc. Dr. Ing. (DCGM FIT BUT)
Schreiber Sascha (TUM)
Poel Mannes (UTWENTE)
Müller Ronald (TUM)
Marcel Sebastien (IDIAP)
van Leeuwen David (TNO TPD)
Odobez Jean-Marc (IDIAP)
Ba Sileye (IDIAP)
Bourlard Herve (IDIAP)
Cardinaux Fabien (IDIAP)
Gatica-Perez Daniel (IDIAP)
Janin Adam (ICSI Berkeley)
Motlíček Petr, Ing., Ph.D. (DCGM FIT BUT)
Reiter Stephan (TUM)
Renals Steve (UEDIN)
van Rest Jeroen (TNO TPD)
Rienks Rutger (UTWENTE)
Rigoll Gerhard, Prof. Dr.-Ing. (TUM)
Smith Kevin (IDIAP)
Thean Andrew (TNO TPD)
Zemčík Pavel, prof. Dr. Ing. (DCGM FIT BUT)
URL
Keywords

speech processing, video processing, multi-modal interaction

Abstract

The paper addresses audio-visual processing in meetings: it poses seven questions and presents the AMI project's current answers to them.

Annotation

The project Augmented Multi-party Interaction (AMI) is concerned with the development of meeting browsers and remote meeting assistants for instrumented meeting rooms, and with the required component technologies. Its R&D themes are group dynamics; audio, visual, and multimodal processing; content abstraction; and human-computer interaction. The audio-visual processing workpackage within AMI addresses automatic recognition from audio, video, and combined audio-video streams recorded during meetings. In this article we describe the progress made in the first two years of the project. We show how the large problem of audio-visual processing in meetings can be split into seven questions, such as "Who is acting during the meeting?". We then show which algorithms and methods have been developed and evaluated to answer these questions automatically.

Published
2006
Pages
12
Proceedings
Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006)
Conference
3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, Washington, US
Place
Washington D.C., US
BibTeX
@INPROCEEDINGS{FITPUB8237,
   author = "Marc Al-Hames and Thomas Hain and Jan \v{C}ernock\'{y} and Sascha Schreiber and Mannes Poel and Ronald M{\"{u}}ller and Sebastien Marcel and David van Leeuwen and Jean-Marc Odobez and Sileye Ba and Herve Bourlard and Fabien Cardinaux and Daniel Gatica-Perez and Adam Janin and Petr Motl\'{i}\v{c}ek and Stephan Reiter and Steve Renals and Jeroen van Rest and Rutger Rienks and Gerhard Rigoll and Kevin Smith and Andrew Thean and Pavel Zem\v{c}\'{i}k",
   title = "Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers",
   pages = 12,
   booktitle = "Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006)",
   year = 2006,
   location = "Washington D.C., US",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/8237"
}