Project Details
Automatic collection and processing of voice data from air-traffic communications
Project Period: 1. 11. 2019 – 28. 2. 2022
Project Type: grant
Agency: European Union
Program: Horizon 2020

air-traffic management, automatic speech recognition, signal processing, legal
and ethical framework
Developing machine learning solutions for air-traffic control applications is
a challenging task. Besides expert knowledge, a large amount of data is typically
required for robust performance as well as for validation and verification. If
funded, ATCO2 will deliver a unique platform for collecting, storing,
pre-processing and sharing voice communications data recorded from real-world
air-traffic control. The project aims at accessing data from two sources:
(a) from certified ADS-B datalinks aligned with a surveillance technology, and
(b) directly from air-traffic controllers offered to the project by several air
navigation service providers. The technical development will be centred around
the ATCO2 platform, built on an existing and extensively used solution of the
OpenSky Network partner, which ensures sustainability of the platform after the end of
the project. The current platform collects periodically broadcast aircraft
information through a network of ADS-B receivers operated around the globe and
stores it on a server. In ATCO2, the existing platform will be extended to allow
collection, storage and pre-processing of voice communications, aligned in time and
position with other aircraft information. Unlike previous work, we will target
both channels, i.e. spoken commands issued by air-traffic controllers, and
confirmations provided by pilots. In addition to broadcast data, ATCO2 will have
access to voice recordings from air navigation service providers, namely
Austrocontrol. These data will simulate another source of speech recordings
(specifically archives), complementing real-time voice communication. The ATCO2
platform will be enhanced by the latest speech pre-processing and machine
learning technologies, mostly based on deep learning. Besides automatic
segmentation (e.g. per speaker, accent, specific command), a robust automatic speech
recognition system will be implemented and integrated through a RESTful API,
allowing voice communications to be transcribed automatically.
Kocour Martin, Ing. (DCGM)
Pulugundla Bhargav, M.Sc.
Veselý Karel, Ing., Ph.D. (DCGM)
Žižka Josef, Ing. (DCGM)
2023
- ZULUAGA-GOMEZ, J.; NIGMATULINA, I.; PRASAD, A.; MOTLÍČEK, P.; KHALIL, D.; MADIKERI, S.; TART, A.; SZŐKE, I.; LENDERS, V.; RIGAULT, M.; CHOUKRI, K. Lessons Learned in Transcribing 5000 h of Air Traffic Control Communications for Robust Automatic Speech Understanding. Aerospace, 2023, vol. 2023, no. 10, p. 1-33. ISSN: 2226-4310.
- ZULUAGA-GOMEZ, J.; PRASAD, A.; NIGMATULINA, I.; SARFJOO, S.; MOTLÍČEK, P.; KLEINERT, M.; HELMKE, H.; OHNEISER, O.; ZHAN, Q. How Does Pre-Trained Wav2Vec 2.0 Perform on Domain-Shifted ASR? an Extensive Benchmark on Air Traffic Control Communications. In IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings. Doha: IEEE Signal Processing Society, 2023. p. 205-212. ISBN: 978-1-6654-7189-3.
2022
- PRASAD, A.; ZULUAGA-GOMEZ, J.; MOTLÍČEK, P.; SARFJOO, S.; NIGMATULINA, I.; OHNEISER, O.; HELMKE, H. Grammar Based Speaker Role Identification for Air Traffic Control Speech Recognition. Proceedings of the 12th SESAR Innovation Days. Budapest: 2022. p. 1-9.
- PRASAD, A.; ZULUAGA-GOMEZ, J.; MOTLÍČEK, P.; SARFJOO, S.; NIGMATULINA, I.; VESELÝ, K. Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator. Proceedings of the 12th SESAR Innovation Days. Budapest: 2022. p. 1-9.
2021
- KOCOUR, M.; CÁMBARA, G.; LUQUE, J.; BONET, D.; FARRÚS, M.; KARAFIÁT, M.; VESELÝ, K.; ČERNOCKÝ, J. BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge. Proceedings of IberSPEECH 2021. Valladolid: International Speech Communication Association, 2021. p. 113-117.
- SZŐKE, I.; KESIRAJU, S.; NOVOTNÝ, O.; KOCOUR, M.; VESELÝ, K.; ČERNOCKÝ, J. Detecting English Speech in the Air Traffic Control Voice Communication. Proceedings of Interspeech 2021. Brno: 2021. p. 246-250.
- ZULUAGA-GOMEZ, J.; VESELÝ, K.; SZŐKE, I.; BLATT, A.; MOTLÍČEK, P.; KOCOUR, M.; RIGAULT, M.; CHOUKRI, K.; PRASAD, A.; SARFJOO, S.; NIGMATULINA, I.; CEVENINI, C.; KOLČÁREK, P.; TART, A.; ČERNOCKÝ, J.; KLAKOW, D. ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications. Journal of Machine Learning Research, vol. 2, no. 1, p. 1-45. ISSN: 1533-7928.