Evaluating ASR improvements for digital documentation in laboratories via fine-tuning, semantics, and post-Processing

Gokhale, Manasi and Hassan, Teena Chakkalayil and Houben, Sebastian (2026) Evaluating ASR improvements for digital documentation in laboratories via fine-tuning, semantics, and post-Processing. Master's thesis, Hochschule Bonn-Rhein-Sieg.

This archive cannot provide the full text.

Abstract

Automatic Speech Recognition (ASR) models often struggle with domain-specific tasks because they have limited knowledge of the terminologies and contexts unique to those domains. Therefore, improving performance in such domains requires adapting ASR models using relevant supervised data. Such data is not readily available and can be time-consuming to produce.

This thesis examines the effective adaptation of ASR models for the chemistry and scientific laboratory domain. To address data scarcity, a domain-specific dataset was created, consisting of both real and synthetic audio samples for training, as well as a separate evaluation set. Several ASR models, including Vosk, Wav2Vec 2, SpeechT5, and Whisper, were evaluated to identify the most suitable baseline model for further adaptation. The Whisper-large-v2 model demonstrated the strongest performance and was selected for subsequent improvement.
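The abstract does not name the metric used to compare the models. Word error rate (WER) is the usual measure of transcription accuracy, so the sketch below assumes it; the function name and the example sentences are hypothetical and are not taken from the thesis or its dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical laboratory utterance with one misrecognized term.
reference = "add five milliliters of ethanol"
hypothesis = "add five milliliters of methanol"
print(word_error_rate(reference, hypothesis))
```

A single substituted chemical name in a five-word instruction already produces a WER of 0.2, which illustrates why domain terminology weighs heavily in this kind of comparison.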

Two complementary adaptation strategies were explored. The first was fine-tuning the Whisper model on domain-specific data; the second was post-processing ASR outputs using Large Language Models (LLMs). Fine-tuning provided modest performance gains, while a dedicated LLM-based correction pipeline, enhanced with terminology derived from domain ontologies, yielded substantial improvements in transcription accuracy and contextual consistency.
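The abstract does not describe how ontology terminology is passed to the LLM. One plausible arrangement is to list the terms in the correction prompt ahead of the raw transcript. The sketch below only assembles such a prompt and makes no model call; the function name, the prompt wording, and the example terms are all assumptions for illustration, not the pipeline used in the thesis.

```python
def build_correction_prompt(transcript: str, domain_terms: list[str]) -> str:
    """Assemble a correction prompt that lists domain terms before the transcript."""
    # Deduplicate case-insensitively, keeping the first spelling encountered.
    seen = set()
    unique_terms = []
    for term in domain_terms:
        cleaned = term.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            unique_terms.append(cleaned)

    term_block = "\n".join(f"- {term}" for term in unique_terms)
    return (
        "Correct transcription errors in the laboratory note below.\n"
        "Prefer these domain terms where they plausibly match the audio:\n"
        f"{term_block}\n\n"
        f"Transcript: {transcript}\n"
        "Corrected transcript:"
    )


# Hypothetical ASR output and hypothetical ontology-derived terms.
prompt = build_correction_prompt(
    "add five milliliters of asset own",
    ["acetone", "Acetone", " pipette "],
)
print(prompt)
```

Keeping prompt construction separate from the model call makes it possible to inspect exactly which terms the model receives for a given transcript.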

Overall, the thesis contributes (i) a domain-specific dataset, (ii) a comprehensive analysis of ASR models, and (iii) effective strategies for adapting ASR systems to specialized scientific domains. These findings highlight practical pathways for improving ASR performance in specialized domains.

elib URL of the record: https://elib.dlr.de/223736/
Document type: University thesis (Master's thesis)
Title: Evaluating ASR improvements for digital documentation in laboratories via fine-tuning, semantics, and post-Processing
Authors:
Author | Institution or e-mail address | Author ORCID iD | ORCID Put Code
Gokhale, Manasi | manasi.gokhale (at) dlr.de | https://orcid.org/0009-0002-6729-8107 | NOT SPECIFIED
Hassan, Teena Chakkalayil | teena.hassan (at) h-brs.de | NOT SPECIFIED | NOT SPECIFIED
Houben, Sebastian | sebastian.houben (at) h-brs.de | NOT SPECIFIED | NOT SPECIFIED
DLR supervisor:
Contribution type | DLR supervisor | Institution or e-mail address | DLR supervisor ORCID iD
Thesis advisor | Dembska, Marta | Marta.Dembska (at) dlr.de | https://orcid.org/0000-0002-8180-1525
Date: 2026
Open access: No
Number of pages: 77
Status: submitted contribution
Keywords: Automatic Speech Recognition (ASR), Domain ontologies, Post-processing correction, Speech transcription
Institution: Hochschule Bonn-Rhein-Sieg
Department: Computer Science
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Aeronautics
HGF - Program topic: no assignment
DLR - Research focus: Aeronautics
DLR - Research area: L - no assignment
DLR - Subarea (project, activity): L - no assignment
Location: Jena
Institutes & facilities: Institut für Datenwissenschaften > Datenmanagement und -aufbereitung
Deposited by: Gokhale, Manasi
Deposited on: 13 Apr 2026 16:21
Last modified: 13 Apr 2026 16:21
