Result Details

Measuring Speech Recognition And Understanding Performance in Air Traffic Control Domain Beyond Word Error Rates

HELMKE, H.; SHETTY, S.; KLEINERT, M.; OHNEISER, O.; EHR, H.; MOTLÍČEK, P.; PRASAD, A.; WINDISCH, C. Measuring Speech Recognition And Understanding Performance in Air Traffic Control Domain Beyond Word Error Rates. In Proceedings of 11th SESAR Innovation Days 2021. Belgium: 2021. p. 1-8.

Type

conference paper

Language

English

Authors

HELMKE, H.
SHETTY, S.
KLEINERT, M.
OHNEISER, O.
EHR, H.
Motlíček Petr, doc. Ing., Ph.D., DCGM (FIT)
Prasad Amrutha, DCGM (FIT)
WINDISCH, C.
and others

Abstract

Applying Automatic Speech Recognition (ASR) in the domain of analogue voice communication between air traffic controllers (ATCo) and pilots has more end user requirements than just transforming spoken words into text. Perfect word recognition is useless for, e.g., readback error detection support, as long as the semantic interpretation is wrong. For an ATCo it is of almost no importance whether the words of a greeting are correctly recognized. A wrong recognition of a greeting should, however, not disturb the correct recognition of, e.g., a descend command. More important is the correct semantic interpretation. What, however, is the correct semantic interpretation, especially when ATCos or pilots deviate more or less from published standard phraseology? For comparing the performance of different speech recognition applications, 14 European partners from the Air Traffic Management (ATM) domain have recently agreed on a common set of rules, i.e., an ontology on how to annotate the speech utterances of an ATCo on the semantic level. This paper first presents the new metric of unclassified word rate, extends the ontology to pilot utterances, and introduces the metrics of command recognition rate, command recognition error rate, and command recognition rejection rate. This enables the comparison of different speech recognition and understanding instances on the semantic level. The implementation used in this paper achieves a command recognition rate better than 96% for Prague Approach, even if the word error rate is above 2.5%, based on more than 12,000 ATCo commands recorded in both operational and lab environments. This outperforms previously published rates by 2% absolute.
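
The contrast the abstract draws between word-level and command-level scoring can be illustrated with a minimal sketch. The word error rate below is the standard word-level edit distance divided by the reference length. The command-level rates are an assumption made only for illustration: each gold command is paired with one extracted command (or `None` when the system outputs nothing), and the three rates are the fractions of correct, wrong, and rejected pairs. The abstract does not give the paper's exact definitions, so `command_rates`, its pairing scheme, and the annotation strings in the usage example are hypothetical.

```python
from typing import Optional


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def command_rates(gold: list[str],
                  extracted: list[Optional[str]]) -> dict[str, float]:
    """Hypothetical command-level rates; NOT the paper's definitions.

    Assumes gold[i] and extracted[i] refer to the same command slot and
    that None means the system rejected (extracted nothing).
    """
    if not gold or len(gold) != len(extracted):
        raise ValueError("gold and extracted must be non-empty and equally long")
    total = len(gold)
    correct = sum(g == e for g, e in zip(gold, extracted))
    rejected = sum(e is None for e in extracted)
    wrong = total - correct - rejected
    return {
        "recognition_rate": correct / total,
        "error_rate": wrong / total,
        "rejection_rate": rejected / total,
    }


# Usage: a misrecognized greeting raises the word error rate,
# while the (hypothetical) command annotation is still extracted correctly.
reference = "good morning lufthansa three two alpha descend flight level eight zero"
hypothesis = "good day lufthansa three two alpha descend flight level eight zero"
wer = word_error_rate(reference, hypothesis)
rates = command_rates(["DLH32A DESCEND 80 FL"], ["DLH32A DESCEND 80 FL"])
```

In this example one of eleven reference words is substituted, so the word error rate is non-zero, yet the single command counts as correctly recognized under the assumed pairing.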

Keywords

word error rate, command recognition rate, language
understanding, air traffic control, ATC, unclassified word rate

URL

https://www.fit.vut.cz/research/group/speech/public/publi/2021/helmke_sesar2021… PDF

Published

2021

Pages

1–8

Proceedings

Proceedings of 11th SESAR Innovation Days 2021

Conference

11th SESAR Innovation Days

Place

Belgium

EID Scopus

2-s2.0-85160724873

BibTeX

@inproceedings{BUT176486,
  author="HELMKE, H. and SHETTY, S. and KLEINERT, M. and OHNEISER, O. and EHR, H. and MOTLÍČEK, P. and PRASAD, A. and WINDISCH, C.",
  title="Measuring Speech Recognition And Understanding Performance in Air Traffic Control Domain Beyond Word Error Rates",
  booktitle="Proceedings of 11th SESAR Innovation Days 2021",
  year="2021",
  pages="1--8",
  address="Belgium",
  url="https://www.fit.vut.cz/research/publication/12684/"
}

Files

pdf helmke_sesar2021_Innovation Days_SIDs_paper_2.pdf 2 MB

Projects

HAAWAII - Highly Automated Air Traffic Controller Workstations with Artificial Intelligence Integration, EU, Horizon 2020, H2020-SESAR-2019-2, start: 2020-06-01, end: 2022-11-30, completed
Modern Methods for Processing, Analysis and Visualization of Multimedia and 3D Data, BUT, BUT Internal Projects, FIT-S-20-6460, start: 2020-03-01, end: 2023-02-28, completed

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)