Creating Artificial Voice Audio - Text-To-Speech for Air Traffic Control

Ohneiser, Oliver and Shetty, Shruthi and Ehr, Heiko and Hobein, Stephanie (2025) Creating Artificial Voice Audio - Text-To-Speech for Air Traffic Control. DLR-Interner Bericht. DLR-IB-FL-BS-2025-37. (unpublished)

This archive cannot provide the full text.

Abstract

Automatic speech recognition and understanding (ASRU) has proven to effectively support human operators in air traffic control (ATC), especially by reducing workload when automated functions take over formerly human tasks. Still, verbal communication remains a time-dominant task in ATC. After the COVID-19 pandemic, the aviation industry faced a shortage of air traffic controllers (ATCos) and pilots, highlighting a significant problem: managing resources for training new ATCos and pilots. Therefore, we explore technologies beyond ASRU that can support the operational work and training of ATCos and pilots, focussing on spoken communication. Given the dramatic improvements of speech-to-text (STT) and text-to-speech (TTS) models in recent years, we herein analyse how TTS can support human aviation operators. To this end, we (1) create new TTS models by fine-tuning existing out-of-domain models, (2) generate new synthetic audio data to be used ad hoc for verbalizing ATC utterances or for training/fine-tuning STT models, and (3) describe future applications of 'Speech Understanding, Generation, and Recognition' (SUGAR) in ATC, such as supporting simulation pilots with readback generation and supporting ATCos with voice command generation. This report is a guide on how to perform these three steps in-house to customize TTS models, artificial audio utterances, and downstream applications of SUGAR.

elib URL of this record:

https://elib.dlr.de/214878/

Document type:

Report series (DLR-Interner Bericht)

Title:

Creating Artificial Voice Audio - Text-To-Speech for Air Traffic Control

Authors:

Authors	Institution or email address	Author ORCID iD	ORCID Put Code
Ohneiser, Oliver	Oliver.Ohneiser (at) dlr.de	https://orcid.org/0000-0002-5411-691X	not specified
Shetty, Shruthi	shruthi.shetty (at) dlr.d	not specified	not specified
Ehr, Heiko	Heiko.Ehr (at) dlr.de	not specified	not specified
Hobein, Stephanie	stephanie.hobein (at) dlr.de	not specified	not specified

Date:

2025

Refereed publication:

No

Open Access:

No

Status:

unpublished

Keywords:

Text-To-Speech; Speech-To-Text

HGF - Research field:

Aeronautics, Space and Transport

HGF - Program:

Aeronautics

HGF - Program theme:

Air Transportation and Impact

DLR - Research area:

Aeronautics

DLR - Program:

L AI - Air Transportation and Impact

DLR - Research theme (Project):

L - Integrated Flight Guidance

Location:

Braunschweig

Institutes and Institutions:

Institute of Flight Guidance > Controller Assistance
Institute of Flight Guidance > ATM Simulation

Deposited by:

Ohneiser, Oliver

Deposited on:

06 Aug 2025 11:31

Last modified:

06 Aug 2025 11:31
