Moebius, Max (2024) Active Learning for Cyber Attack Detection on Unlabeled URLs. Master's thesis, Friedrich Schiller University Jena.
This is the latest version of this item.
Abstract
Machine learning techniques require large amounts of labeled data for training. However, existing labeled datasets are outdated, while cyber-attacks are becoming more complex and sophisticated. This increases the need for active learning techniques that continually label new, unknown data patterns, which might contain signatures of zero-day attacks (new attacks with no previously known signatures). Various techniques can accomplish this task, including blacklisting, rule-based, and machine-learning approaches. This thesis explores the essential steps of a machine-learning approach, particularly in the domain of Active Learning. The Data Preprocessing step combines various approaches from related work. Seven Feature Extraction methods are compared, ranging from heuristic approaches to Natural Language Processing. K-fold cross-validation is used to validate the extracted features and the selected classifier. The machine learning classifiers Random Forest and Stochastic Gradient Descent are used. Incremental Active Learning and Lifelong Learning are applied, and four different approaches are employed, including pool-based and stream-based sampling, Query-by-Committee, and Cluster Sampling. The thesis mainly employs probabilistic query strategies. Experimental results for the different approaches are presented and discussed with regard to their usability with Active Learning. Finally, avenues for future research are outlined.
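The pool-based loop named in the abstract can be illustrated with a minimal sketch. It does not reproduce the thesis's implementation: the lexical features, the `OnlineLogisticRegression` class, the least-confidence query rule, the example URLs, and the labeling function `toy_oracle` are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
from urllib.parse import urlparse


def lexical_features(url):
    """Map a URL to a few hand-picked lexical features (illustrative assumption)."""
    host = urlparse(url).netloc
    return [
        1.0,                                   # bias term
        len(url) / 100.0,                      # scaled URL length
        host.count(".") / 5.0,                 # scaled number of dots in the host
        sum(c.isdigit() for c in url) / 20.0,  # scaled digit count
        1.0 if "@" in url else 0.0,            # '@' present anywhere in the URL
    ]


class OnlineLogisticRegression:
    """Logistic regression updated one sample at a time with plain SGD."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = sum(w * v for w, v in zip(self.weights, x))
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, x, y):
        # Gradient of the log loss with respect to the linear score z.
        error = self.predict_proba(x) - y
        self.weights = [w - self.learning_rate * error * v
                        for w, v in zip(self.weights, x)]


def most_uncertain(model, pool):
    """Index of the pool URL whose predicted probability is closest to 0.5."""
    return min(
        range(len(pool)),
        key=lambda i: abs(model.predict_proba(lexical_features(pool[i])) - 0.5),
    )


def active_learning_loop(model, pool, oracle, budget):
    """Query up to `budget` labels, always for the currently most uncertain URL."""
    pool = list(pool)  # work on a copy so the caller's pool is left unchanged
    queried = []
    for _ in range(min(budget, len(pool))):
        url = pool.pop(most_uncertain(model, pool))
        label = oracle(url)  # stands in for a human analyst
        model.partial_fit(lexical_features(url), label)
        queried.append((url, label))
    return queried


if __name__ == "__main__":
    # Hypothetical, invented URLs; not taken from the thesis data.
    toy_pool = [
        "http://example.com/index.html",
        "http://login.example.com.account.verify.example.net/u@123456",
        "http://example.org/docs/page2",
    ]

    def toy_oracle(url):
        # Hypothetical labeling rule used only to make the sketch runnable.
        return 1.0 if "@" in url else 0.0

    model = OnlineLogisticRegression(n_features=5)
    for url, label in active_learning_loop(model, toy_pool, toy_oracle, budget=2):
        print(label, url)
```

The model is updated incrementally after every query, so no retraining from scratch is needed when a new label arrives. A stream-based variant would instead decide for each incoming URL whether its uncertainty is high enough to request a label.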
| elib URL of this item: | https://elib.dlr.de/202848/ |
|---|---|
| Document type: | University thesis (Master's thesis) |
| Title: | Active Learning for Cyber Attack Detection on Unlabeled URLs |
| Authors: | Moebius, Max |
| Date: | 31 January 2024 |
| Published in: | Active Learning for Cyber Attack Detection on Unlabeled URLs |
| Open Access: | Yes |
| Number of pages: | 102 |
| Status: | Published |
| Keywords: | Machine learning, active learning, Malicious URL, Incremental learning |
| Institution: | Friedrich Schiller University Jena |
| Department: | Computer Vision Group, Faculty of Mathematics and Computer Science |
| HGF - Research field: | not assigned |
| HGF - Program: | not assigned |
| HGF - Program theme: | not assigned |
| DLR - Research area: | Digitalisation |
| DLR - Program: | D KIZ - Artificial Intelligence |
| DLR - Research theme (project): | D - CausalAnomalies |
| Location: | Jena |
| Institutes and institutions: | Institut für Datenwissenschaften (Institute of Data Science) |
| Deposited by: | Bouhlal, Badr-Eddine |
| Deposited on: | 21 Feb 2024 10:52 |
| Last modified: | 26 Feb 2024 12:14 |
Available versions of this item
- Active Learning for Cyber Attack Detection on Unlabeled URLs. (deposited UNSPECIFIED)
  - Active Learning for Cyber Attack Detection on Unlabeled URLs. (deposited 21 Feb 2024 10:52) [Currently displayed]