Result Details

Ego4D: Around the World in 3,600 Hours of Egocentric Video

GRAUMAN, K.; WESTBURY, A.; BYRNE, E.; CARTILLIER, V.; CHAVIS, Z.; FURNARI, A.; GIRDHAR, R.; HAMBURGER, J.; JIANG, H.; KUKREJA, D.; LIU, M.; LIU, X.; MARTIN, M.; NAGARAJAN, T.; RADOSAVOVIC, I.; RAMAKRISHNAN, S.; RYAN, F.; SHARMA, J.; WRAY, M.; XU, M.; XU, E.; ZHAO, C.; BANSAL, S.; BATRA, D.; CRANE, S.; DO, T.; DOULATY, M.; ERAPALLI, A.; FEICHTENHOFER, C.; FRAGOMENI, A.; FU, Q.; GEBRESELASIE, A.; GONZALEZ, C.; HILLIS, J.; HUANG, X.; HUANG, Y.; JIA, W.; KHOO, W.; KOLAR, J.; KOTTUR, S.; KUMAR, A.; LANDINI, F.; LI, C.; LI, Y.; LI, Z.; MANGALAM, K.; MODHUGU, R.; MUNRO, J.; MURRELL, T.; NISHIYASU, T.; PRICE, W.; RUIZ PUENTES, P.; RAMAZANOVA, M.; SARI, L.; SOMASUNDARAM, K.; SOUTHERLAND, A.; SUGANO, Y.; TAO, R.; VO, M.; WANG, Y.; WU, X.; YAGI, T.; ZHAO, Z.; ZHU, Y.; ARBELAEZ, P.; CRANDALL, D.; DAMEN, D.; FARINELLA, G.; FUEGEN, C.; GHANEM, B.; KRISHNA, V.; JAWAHAR, C.; JOO, H.; KITANI, K.; LI, H.; NEWCOMBE, R.; OLIVA, A.; PARK, H.; REHG, J.; SATO, Y.; SHI, J.; ZHENG SHOU, M.; TORRALBA, A.; TORRESANI, L.; YAN, M.; MALIK, J. Ego4D: Around the World in 3,600 Hours of Egocentric Video. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, vol. 47, no. 11, p. 9468-9509.
Type
journal article
Language
English
Authors
Grauman Kristen
Westbury Andrew
Byrne Eugene
Cartillier Vincent
Chavis Zachary
Furnari Antonino
Girdhar Rohit
Hamburger Jackson
Jiang Hao
Kukreja Devansh
Liu Miao
Liu Xingyu
Martin Miguel
Nagarajan Tushar
Radosavovic Ilija
Ramakrishnan Santhosh Kumar
Ryan Fiona
Sharma Jayant
Wray Michael
Xu Mengmeng
Xu Eric Zhongcong
Zhao Chen
Bansal Siddhant
Batra Dhruv
Crane Sean
Do Tien
Doulaty Morrie
Erapalli Akshay
Feichtenhofer Christoph
Fragomeni Adriano
Fu Qichen
Gebreselasie Abrham
Gonzalez Cristina
Hillis James
Huang Xuhua
Huang Yifei
Jia Wenqi
Khoo Weslie
Kolar Jachym
Kottur Satwik
Kumar Anurag
Landini Federico Nicolás, Ph.D.
Li Chao
Li Yanghao
Li Zhenqiang
Mangalam Karttikeya
Modhugu Raghava
Munro Jonathan
Murrell Tullie
Nishiyasu Takumi
Price Will
Ruiz Puentes Paola
Ramazanova Merey
Sari Leda
Somasundaram Kiran
Southerland Audrey
Sugano Yusuke
Tao Ruijie
Vo Minh
Wang Yuchen
Wu Xindi
Yagi Takuma
Zhao Ziwei
Zhu Yunyi
Arbelaez Pablo
Crandall David
Damen Dima
Farinella Giovanni Maria
Fuegen Christian
Ghanem Bernard
Krishna Vamsi
Jawahar C. V.
Joo Hanbyul
Kitani Kris
Li Haizhou
Newcombe Richard
Oliva Aude
Park Hyun Soo
Rehg James M.
Sato Yoichi
Shi Jianbo
Zheng Shou Mike
Torralba Antonio
Torresani Lorenzo
Yan Mingfei
Malik Jitendra
Abstract

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers from 74 worldwide locations and 9 different countries. The approach to collection is designed to uphold rigorous privacy and ethics standards, with consenting participants and robust de-identification procedures where relevant. Ego4D dramatically expands the volume of diverse egocentric video footage publicly available to the research community. Portions of the video are accompanied by audio, 3D meshes of the environment, eye gaze, stereo, and/or synchronized videos from multiple egocentric cameras at the same event. Furthermore, we present a host of new benchmark challenges centered around understanding the first-person visual experience in the past (querying an episodic memory), present (analyzing hand-object manipulation, audio-visual conversation, and social interactions), and future (forecasting activities). By publicly sharing this massive annotated dataset and benchmark suite, we aim to push the frontier of first-person perception.

Keywords

Cameras, Benchmark testing, Three-dimensional displays, Task analysis, Annotations, Cultural differences, Computer vision, Video understanding, egocentric video, first-person vision, datasets and benchmarks

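URL
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10611736&utm_source=scopus&getft_integrator=scopus&tag=1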
Published
2025
Pages
9468–9509
Journal
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 11, ISSN 0162-8828
Publisher
IEEE
DOI
10.1109/TPAMI.2024.3381075
UT WoS
001587283400016
BibTeX
@article{BUT201375,
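  author="Kristen {Grauman} and Andrew {Westbury} and Eugene {Byrne} and Vincent {Cartillier} and Zachary {Chavis} and Antonino {Furnari} and Rohit {Girdhar} and Jackson {Hamburger} and Hao {Jiang} and Devansh {Kukreja} and Miao {Liu} and Xingyu {Liu} and Miguel {Martin} and Tushar {Nagarajan} and Ilija {Radosavovic} and Santhosh Kumar {Ramakrishnan} and Fiona {Ryan} and Jayant {Sharma} and Michael {Wray} and Mengmeng {Xu} and Eric Zhongcong {Xu} and Chen {Zhao} and Siddhant {Bansal} and Dhruv {Batra} and Sean {Crane} and Tien {Do} and Morrie {Doulaty} and Akshay {Erapalli} and Christoph {Feichtenhofer} and Adriano {Fragomeni} and Qichen {Fu} and Abrham {Gebreselasie} and Cristina {Gonzalez} and James {Hillis} and Xuhua {Huang} and Yifei {Huang} and Wenqi {Jia} and Weslie {Khoo} and Jachym {Kolar} and Satwik {Kottur} and Anurag {Kumar} and Federico Nicolás {Landini} and Chao {Li} and Yanghao {Li} and Zhenqiang {Li} and Karttikeya {Mangalam} and Raghava {Modhugu} and Jonathan {Munro} and Tullie {Murrell} and Takumi {Nishiyasu} and Will {Price} and Paola {Ruiz Puentes} and Merey {Ramazanova} and Leda {Sari} and Kiran {Somasundaram} and Audrey {Southerland} and Yusuke {Sugano} and Ruijie {Tao} and Minh {Vo} and Yuchen {Wang} and Xindi {Wu} and Takuma {Yagi} and Ziwei {Zhao} and Yunyi {Zhu} and Pablo {Arbelaez} and David {Crandall} and Dima {Damen} and Giovanni Maria {Farinella} and Christian {Fuegen} and Bernard {Ghanem} and Vamsi {Krishna} and C. V. {Jawahar} and Hanbyul {Joo} and Kris {Kitani} and Haizhou {Li} and Richard {Newcombe} and Aude {Oliva} and Hyun Soo {Park} and James M. {Rehg} and Yoichi {Sato} and Jianbo {Shi} and Mike {Zheng Shou} and Antonio {Torralba} and Lorenzo {Torresani} and Mingfei {Yan} and Jitendra {Malik}",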
  title="Ego4D: Around the World in 3,600 Hours of Egocentric Video",
  journal="IEEE Transactions on Pattern Analysis and Machine Intelligence",
  year="2025",
  volume="47",
  number="11",
  pages="9468--9509",
  doi="10.1109/TPAMI.2024.3381075",
  issn="0162-8828",
  url="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10611736&utm_source=scopus&getft_integrator=scopus&tag=1"
}
Projects
Contemporary methods for processing, analysis, and visualization of multimedia and 3D data, BUT, BUT internal projects, FIT-S-23-8278, start: 2023-03-01, end: 2026-02-28, completed