May, Moritz and Kleinert, Matthias and Helmke, Hartmut (2024) Automatic Transcription of Air Traffic Controller to Pilot Communication - Training Speech Recognition Models with the Open Source Toolkit CoquiSTT. In: DLRK, Deutscher Luft- und Raumfahrtkongress 2024, pages 1-8. DLRK, Deutscher Luft- und Raumfahrtkongress 2024, 2024-09-30 - 2024-10-02, Hamburg, Germany. doi: 10.25967/630171.
Official URL: https://www.dglr.de/publikationen/2024/630171.pdf
Abstract
Despite all the advances in automation and digitalization, the majority of communication between air traffic controllers and pilots still takes place via analogue radio voice transmissions. If support systems are to benefit from this verbal controller-pilot communication, time-consuming manual inputs via mouse and keyboard are required. Automatic speech recognition (ASR) is a way to minimize these manual inputs. Recently, DLR, Idiap and Austro Control demonstrated that ASR-supported pre-filling of radar label entries already reaches a technology readiness level of six. The ASR engine used is based on Kaldi, which requires considerable ASR expertise for implementation and adaptation. Besides Kaldi, many open-source end-to-end ASR models such as Whisper or wav2vec are available, already pre-trained on large amounts of everyday speech. These open-source end-to-end models are often easier to adapt, even for non-experts in speech recognition. This paper presents the results DLR achieved with the open-source CoquiSTT toolkit, which provides an English end-to-end model pre-trained on 47,000 hours of regular English speech that achieves a word error rate of 4.5% on the LibriSpeech clean test corpus. Applying this model to air traffic control voice communication, however, results in word error rates above 50%, even in lab environments. Training new models from scratch on just 10 hours of voice recordings from the target environment already makes word error rates below 10% possible. The best performance, however, is achieved when the pre-trained CoquiSTT model is fine-tuned with air traffic control data from different areas. Word error rates below 5% were achieved, which enable, e.g., callsign recognition rates above 95%.
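The word error rates quoted in the abstract follow the usual definition: the minimum number of word substitutions, deletions and insertions needed to turn the recognized text into the reference transcript, divided by the number of reference words. The following is a minimal Python sketch of that metric. The function name `word_error_rate` and the example utterance are illustrative assumptions and are not taken from the paper or from the CoquiSTT toolkit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up ATC-style utterance (hypothetical, not from the paper's data):
# one substitution ("three" -> "tree") and one deletion ("flight")
# against 9 reference words.
reference = "lufthansa one two three descend flight level eight zero"
hypothesis = "lufthansa one two tree descend level eight zero"
wer = word_error_rate(reference, hypothesis)
```

Because the denominator is the reference length, many insertions can push the value above 100%. The sketch also compares raw whitespace-separated tokens, so any text normalization (case, number formats) has to happen beforehand.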
elib URL of this record: | https://elib.dlr.de/211008/
---|---
Document type: | Conference contribution (talk)
Title: | Automatic Transcription of Air Traffic Controller to Pilot Communication - Training Speech Recognition Models with the Open Source Toolkit CoquiSTT
Authors: | May, Moritz; Kleinert, Matthias; Helmke, Hartmut
Date: | 13 December 2024
Published in: | DLRK, Deutscher Luft- und Raumfahrtkongress 2024
Peer-reviewed publication: | Yes
Open Access: | Yes
Gold Open Access: | No
In SCOPUS: | No
In ISI Web of Science: | No
DOI: | 10.25967/630171
Page range: | pages 1-8
Status: | published
Keywords: | CoquiSTT, ASR, ASRU, ATC
Event title: | DLRK, Deutscher Luft- und Raumfahrtkongress 2024
Event location: | Hamburg, Germany
Event type: | national conference
Event start: | 30 September 2024
Event end: | 2 October 2024
Organizer: | DGLR, Deutsche Gesellschaft für Luft- und Raumfahrt
HGF - Research field: | Aeronautics, Space and Transport
HGF - Program: | Aeronautics
HGF - Program theme: | Air Transportation and Impact
DLR - Research area: | Aeronautics
DLR - Program: | L AI - Air Transportation and Impact
DLR - Research theme (project): | L - Integrated Flight Guidance
Location: | Braunschweig
Institutes and institutions: | Institute of Flight Guidance > Controller Assistance
Deposited by: | May, Moritz
Deposited on: | 18 Dec 2024 09:13
Last modified: | 18 Dec 2024 09:13