
Adversarial Examples and Robust Training of Deep Neural Networks for Image Classification

von der Lehr, Fabrice (2021) Adversarial Examples and Robust Training of Deep Neural Networks for Image Classification. Bachelor's thesis, DHBW Mannheim.

PDF (8 MB)

Abstract

Deep neural networks (DNNs) have become a powerful tool for image classification tasks in recent years and are nowadays also relevant for safety-critical applications such as autonomous driving. Despite being highly accurate even on previously unseen images, the existence of so-called "adversarial examples" nevertheless calls the robustness of DNNs into question: these are slightly but purposefully perturbed versions of natural images that are barely distinguishable from their unperturbed originals, yet cause the DNN to misclassify them. In the scope of this work, two white-box attacks (Fast Gradient Sign Method, Projected Gradient Descent) and one black-box attack (Boundary Attack) were implemented to create adversarial examples from images of the CIFAR-10 and GTSRB datasets. The trained DNNs, based on the PreAct-ResNet-50 architecture, were subsequently evaluated with regard to their robustness against both adversarial and random perturbations. Furthermore, two variants of adversarial training (using the Fast Gradient Sign Method and the Stable Single Step algorithm, respectively) were implemented to analyze to what extent such an adaptation of the training process influences the robustness and general accuracy of DNNs. Finally, the loss landscapes of the differently trained DNNs were investigated qualitatively. The results show that the susceptibility to adversarial examples is highly data-dependent, with images from CIFAR-10 generally exhibiting a higher risk than those from the GTSRB dataset. By contrast, random perturbations comparatively rarely led to misclassifications, regardless of the dataset considered. Moreover, Stable Single Step-based adversarial training proved to increase the robustness against adversarial examples to a limited extent, but also to slightly lower the accuracy on natural images. In general, however, adversarial training led to insufficient robustness gains, for which substantial overfitting of the trained DNNs was identified as the main reason.
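The thesis's attack implementations are not reproduced here, but a minimal sketch of the Fast Gradient Sign Method, the simplest of the white-box attacks mentioned above, may help illustrate how such perturbations are generated. The sketch below assumes PyTorch, input images scaled to [0, 1], and an illustrative epsilon; none of these details are taken from the thesis.

import torch
import torch.nn.functional as F

def fgsm_attack(model, images, labels, epsilon=8 / 255):
    """Perturb a batch: x_adv = x + epsilon * sign(grad_x loss(model(x), y))."""
    model.eval()
    images = images.clone().detach().requires_grad_(True)

    # Forward pass and cross-entropy loss with respect to the true labels.
    loss = F.cross_entropy(model(images), labels)

    # Gradient of the loss with respect to the input pixels.
    grad = torch.autograd.grad(loss, images)[0]

    # Single signed-gradient step, clipped back to the valid pixel range [0, 1].
    adv_images = images + epsilon * grad.sign()
    return adv_images.clamp(0.0, 1.0).detach()

Projected Gradient Descent can be understood as iterating this signed-gradient step with a projection back into an epsilon-ball around the original image, and FGSM-based adversarial training reuses such perturbed batches during training.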

Item URL in elib: https://elib.dlr.de/144468/
Document Type: Thesis (Bachelor's)
Title: Adversarial Examples and Robust Training of Deep Neural Networks for Image Classification
Authors: von der Lehr, Fabrice (ORCID iD: https://orcid.org/0009-0000-2134-6754)
Date: 7 September 2021
Refereed publication: No
Open Access: Yes
Number of Pages: 160
Status: Published
Keywords: Machine Learning, Deep Learning, Image Classification, Robustness, Adversarial Examples
Institution: DHBW Mannheim
Department: Fakultät Informatik
HGF - Research field: other
HGF - Program: other
HGF - Program Themes: other
DLR - Research area: no assignment
DLR - Program: no assignment
DLR - Research theme (Project): no assignment
Location: Köln-Porz
Institutes and Institutions: Institute of Simulation and Software Technology > High Performance Computing
Institute of Software Technology
Deposited By: von der Lehr, Fabrice
Deposited On: 07 Dec 2021 10:11
Last Modified: 07 Dec 2021 10:11
