Can YouTube Stream Recordings Improve Automatic Speech Recognition for Air Traffic Control?

Wüstenbecker, Niclas and Ohneiser, Oliver and Kleinert, Matthias (2025) Can YouTube Stream Recordings Improve Automatic Speech Recognition for Air Traffic Control? In: 13th OpenSky Symposium. 13th OpenSky Symposium, 2025-11-06 - 2025-11-07, Norrköping, Sweden.

This archive cannot provide the full text.

Abstract

Automatic speech recognition (ASR) for air traffic control (ATC) faces severe training data scarcity due to operational recording restrictions and expensive domain-expert transcription requirements. We address this limitation by developing an automated pipeline that extracts large-scale, high-quality training data from publicly available YouTube streams of virtual ATC simulator sessions from networks such as VATSIM and IVAO. Our approach systematically processes over 2,000 hours of content spanning 709 videos from virtual airports in 17 countries across multiple continents, operational domains (ground, tower, approach, en-route), and diverse speaker accents. The pipeline employs speaker diarization for utterance segmentation, parallel transcription using three complementary ASR architectures with distinct error characteristics, and Large Language Model-based transcript fusion that synthesizes improved pseudo-labels while filtering non-ATC content. Manual verification on a stratified 120-minute evaluation set shows a word error rate of 10.2% for controller speech and 18.3% for pilot speech, a 37% relative improvement over the best individual model, which establishes pseudo-label quality sufficient for downstream model training. We show the feasibility of this approach by training a compact 115M-parameter ASR model exclusively on automatically generated transcripts without any manually annotated operational data. Evaluation on the operational ATCO2 benchmark yields a 21.1% word error rate compared with 35.6% for published baselines trained on smaller manually transcribed datasets, despite the domain gap between virtual and operational ATC, while achieving approximately 5x faster inference. These results demonstrate that large-scale, geographically and acoustically diverse pseudo-labeled data can effectively compensate for moderate label noise when training specialized-domain speech recognition systems.
We openly release the complete processing pipeline, curated video collection, and our trained model to enable reproducible research.
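
The word error rates quoted in the abstract follow the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The following is a minimal sketch of that metric and of a relative error reduction, not the authors' released pipeline. The function names and the example utterance are illustrative assumptions; only the 35.6% and 21.1% figures come from the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` lowers the error rate relative to `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical utterance (not taken from the dataset): one substitution, one deletion.
example_wer = word_error_rate(
    "lufthansa one two three descend flight level eight zero",
    "lufthansa one two tree descend level eight zero",
)

# ATCO2 figures from the abstract: 35.6% (published baselines) vs. 21.1% (this model).
atco2_gain = relative_reduction(35.6, 21.1)
```

Under this definition, the reported ATCO2 result corresponds to a relative word error rate reduction of roughly 41% with respect to the published baselines.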

elib URL of this record: https://elib.dlr.de/219501/
Document type: Conference contribution (talk)
Title: Can YouTube Stream Recordings Improve Automatic Speech Recognition for Air Traffic Control?
Authors:
Author | Institution or email address | Author ORCID iD | ORCID Put Code
Wüstenbecker, Niclas | niclas.wuestenbecker (at) dlr.de | https://orcid.org/0009-0000-3440-8635 | 197748513
Ohneiser, Oliver | Oliver.Ohneiser (at) dlr.de | https://orcid.org/0000-0002-5411-691X | 197748514
Kleinert, Matthias | Matthias.Kleinert (at) dlr.de | https://orcid.org/0000-0002-0782-4147 | not specified
Date: November 2025
Published in: 13th OpenSky Symposium
Refereed publication: Yes
Open Access: No
Gold Open Access: No
In SCOPUS: No
In ISI Web of Science: No
Status: published
Keywords: Air Traffic Control; Automatic Speech Recognition; Public Dataset; Large Language Model
Event title: 13th OpenSky Symposium
Event location: Norrköping, Sweden
Event type: international conference
Event start date: 6 November 2025
Event end date: 7 November 2025
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Aeronautics
HGF - Program theme: Air Transportation and Impact
DLR - Research area: Aeronautics
DLR - Program: L AI - Air Transportation and Impact
DLR - Research theme (project): L - Integrated Flight Guidance
Location: Braunschweig
Institutes and institutions: Institute of Flight Guidance > Controller Assistance
Deposited by: Wüstenbecker, Niclas
Deposited on: 24 Nov 2025 10:02
Last modified: 24 Nov 2025 10:02
