
Integrating Domain Knowledge into Transformer-based Approaches to Vulnerability Detection

Jeong, Seunghee (2023) Integrating Domain Knowledge into Transformer-based Approaches to Vulnerability Detection. Master's thesis, Ludwig-Maximilians-Universität München.


Abstract

Vulnerability detection is critical for ensuring the security and integrity of software systems. Traditional methods such as Static Application Security Testing (SAST) and Dynamic Application Security Testing (DAST) have limitations. SAST, while effective at identifying vulnerabilities early in the development cycle, often produces a high rate of false positives and struggles to capture runtime context. DAST, on the other hand, can detect vulnerabilities in a running application but is limited by its lack of access to the source code and by how late in the software lifecycle it can be applied. Meanwhile, the landscape of vulnerability detection has evolved significantly, embracing advanced machine learning models. Initially, the focus was on Recurrent Neural Network (RNN)-based models such as LSTM, BiLSTM, and BiGRU, along with their variants in Convolutional Neural Network (CNN)-based methodologies. However, the field has recently shifted towards transformer-based models, noted for their exceptional performance in natural language processing tasks and their proficiency in interpreting programming languages. This study leverages the strengths of transformer-based models, particularly those tailored for programming languages, to enhance vulnerability detection by integrating domain knowledge, specifically the Common Weakness Enumeration (CWE) hierarchy, into programming-language-specific transformer-based models. We investigate the efficacy of these models through two distinct classification approaches: standard classification and hierarchical classification using a deep classifier. Our primary objective is to assess the impact of integrating domain knowledge, particularly in the context of hierarchical methods, on model performance.
This exploration aims to delineate how such integration influences outcomes compared to traditional classification methods, thereby providing insights into the potential advantages of domain-specific enhancements in transformer-based models, which add a novel dimension to the semantic and syntactic analysis of source code. Our hierarchical approach, using various loss weights, outperformed standard classification with Focal Loss in multiclass classification. These approaches also showed high performance in binary classification, even though the models were fine-tuned for the multiclass classification task and not for the binary one. This indicates that our approaches enable broader learning of semantic and syntactic knowledge in vulnerability detection tasks using transformer-based models, and it suggests a promising direction for future research and application in the field.
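
As an illustration only, the two loss setups named in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the two-level (parent/child) hierarchy, the default `gamma`, and the idea of combining per-level cross-entropy terms with fixed weights are all hypothetical choices made here for clarity. The record does not specify how the deep classifier or its loss weights are actually defined.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0):
    """Multiclass focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def hierarchical_loss(level_logits, level_targets, level_weights):
    """Weighted sum of per-level cross-entropy terms.

    Assumption: one classification head per hierarchy level (for example a
    coarse parent class and a fine child class), each scored with plain
    cross-entropy and combined with a hypothetical per-level weight.
    """
    return sum(
        w * focal_loss(logits, target, gamma=0.0)
        for logits, target, w in zip(level_logits, level_targets, level_weights)
    )

# Hypothetical sample: 2 parent classes, 3 child classes.
parent_logits = [1.5, -0.5]
child_logits = [0.2, 2.0, -1.0]

flat = focal_loss(child_logits, target=1)
hier = hierarchical_loss(
    [parent_logits, child_logits], level_targets=[0, 1], level_weights=[0.5, 1.0]
)
print(f"focal loss (flat): {flat:.4f}")
print(f"weighted hierarchical loss: {hier:.4f}")
```

In this sketch the focal term down-weights samples the model already classifies confidently, while the hierarchical variant adds a training signal from the coarser level; changing `level_weights` shifts the emphasis between levels.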

elib URL of the entry: https://elib.dlr.de/201141/
Document type: Thesis (Master's thesis)
Title: Integrating Domain Knowledge into Transformer-based Approaches to Vulnerability Detection
Authors:
Author | Institution or email address | Author's ORCID iD | ORCID Put Code
Jeong, Seunghee | seunghee.jeong (at) dlr.de | NOT SPECIFIED | NOT SPECIFIED
Date: 30 November 2023
Refereed publication: No
Open Access: Yes
Gold Open Access: No
In SCOPUS: No
In ISI Web of Science: No
Number of pages: 81
Status: published
Keywords: Vulnerability Detection; Machine Learning
Institution: Ludwig-Maximilians-Universität München
Department: Fakultät für Mathematik, Informatik und Statistik
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Space
HGF - Program topic: Space System Technology
DLR - Research area: Space
DLR - Program: R SY - Space System Technology
DLR - Research theme (project): R - Intelligente Analysen und Methoden zur sicheren Softwareentwicklung
Location: Jena
Institutes and institutions: Institute of Data Science > Data Acquisition and Mobilisation
Deposited by: Brust, Dr. Clemens-Alexander
Deposited on: 22 Dec 2023 08:26
Last modified: 03 Jan 2024 13:25
