
A Performance Comparison between GPU Frameworks on MultiSAR

Ehrlich, Alexander (2023) A Performance Comparison between GPU Frameworks on MultiSAR. Master's thesis, Universität Würzburg.

PDF - accessible only within DLR (9MB)

Abstract

GPUs play an important role in high performance computing. To use them, however, one must choose an API that provides general-purpose processing functionality, and neither that choice nor the optimal use of such an API is trivial. This thesis compares the programming models CUDA, OpenCL, OpenACC and SYCL. CUDA and OpenCL are examples of low-level APIs, while OpenACC and SYCL are considered higher level. They are compared in terms of runtime, memory usage, accuracy of the results and portability. The comparison covers multiple memory allocation types where the respective API supports them. These configurations are also tested with microbenchmarks, but the main application is MultiSAR, a C++ program used by the German Aerospace Center (DLR) for processing radar data. Since MultiSAR currently runs on only a single CPU thread, additional changes are made to the original code to enable more efficient use of a GPU's resources. These changes affect not only the runtime code but also the compilation via CMake. Because support for certain C++ features and libraries varies between the APIs, the specific implementations vary as well, so the comparison is not fully direct and fair. Evaluations show a runtime improvement of more than 40x over the original runtime in certain configurations. Kernel execution times for CUDA, OpenACC and SYCL are similar, with SYCL achieving the fastest configuration by a slight margin. In total execution time, the OpenACC version is the best by a wide margin, likely because of further optimizations in functions not touched with the other programming models. These gains come at the cost of accuracy: deviations can reach the second digit after the decimal point in the worst case. The OpenACC implementation performs particularly poorly here, as it introduced additional errors.
Difficulties surrounding OpenCL lead to the conclusion that it is an unsuitable API for the purposes of MultiSAR, at least on the test system used. Regarding memory allocation types, the fastest is the traditional approach of allocating memory and explicitly copying it to the GPU. However, managed memory is the recommended starting choice because of its better portability and only slightly worse runtime.
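
The abstract describes accuracy in terms of the decimal digit at which results start to deviate. It does not say how this was measured, so the following C++ sketch shows only one plausible way to quantify such a deviation, assuming a CPU reference result and a GPU result are both available as arrays of doubles. The function names `max_abs_error` and `first_affected_decimal` are hypothetical and are not taken from the thesis or from MultiSAR.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest absolute difference between a reference
// result (e.g. the original CPU output) and a candidate result
// (e.g. the output of a GPU port). Both vectors must have equal length.
double max_abs_error(const std::vector<double>& reference,
                     const std::vector<double>& candidate) {
    assert(reference.size() == candidate.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        worst = std::max(worst, std::fabs(reference[i] - candidate[i]));
    }
    return worst;
}

// Hypothetical helper: position of the first decimal digit an error of this
// size can affect (1 = tenths, 2 = hundredths, ...).
// Returns 0 if the error is at least 1, and -1 if there is no error.
int first_affected_decimal(double error) {
    if (error <= 0.0) return -1;
    if (error >= 1.0) return 0;
    return static_cast<int>(std::ceil(-std::log10(error)));
}
```

With this definition, an error of about 0.03 maps to position 2, which corresponds to a deviation in the second digit after the decimal point.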

elib URL of this entry: https://elib.dlr.de/201198/
Document type: Thesis (Master's thesis)
Title: A Performance Comparison between GPU Frameworks on MultiSAR
Authors:
Author | Institution or e-mail address | Author ORCID iD | ORCID Put Code
Ehrlich, Alexander | Universität Würzburg | NOT SPECIFIED | NOT SPECIFIED
Date: 10 November 2023
Refereed publication: No
Open Access: No
Gold Open Access: No
In SCOPUS: No
In ISI Web of Science: No
Number of pages: 110
Status: published
Keywords: GPU, GPU Frameworks, MultiSAR
Institution: Universität Würzburg
Department: Institut für Informatik
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Space
HGF - Program topic: Earth Observation
DLR - Research area: Space
DLR - Program: R EO - Earth Observation
DLR - Research theme (Project): R - Remote Sensing and Geo Research
Location: Oberpfaffenhofen
Institutes and institutions: German Remote Sensing Data Center > Land Surface Dynamics
Deposited by: Huber, Martin
Deposited on: 11 Jan 2024 09:19
Last modified: 11 Jan 2024 09:19
