Kreutzer, Moritz and Hager, Georg and Wellein, Gerhard and Fehske, Holger and Basermann, Achim and Bishop, Alan R. (2012) Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation. In: Currently available at IEEE Xplore (online; see the official URL); Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops (IPDPS 2012), 1696-1702. IEEE Conference Publications. Workshop on Large-Scale Parallel Processing to be held at the IEEE International Parallel and Distributed Processing Symposium 2012, 2012-05-21 - 2012-05-25, Shanghai, China. doi: 10.1109/IPDPSW.2012.211. ISBN 978-1-4673-0974-5.
Official URL: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6270844
Abstract
Sparse matrix-vector multiplication (spMVM) is the dominant operation in many sparse solvers. We investigate performance properties of spMVM with matrices of various sparsity patterns on the nVidia “Fermi” class of GPGPUs. A new “padded jagged diagonals storage” (pJDS) format is proposed which may substantially reduce the memory overhead intrinsic to the widespread ELLPACK-R scheme while making no assumptions about the matrix structure. In our test scenarios the pJDS format cuts the overall spMVM memory footprint on the GPGPU by up to 70%, and achieves 91% to 130% of the ELLPACK-R performance. Using a suitable performance model we identify performance bottlenecks on the node level that invalidate some types of matrix structures for efficient multi-GPGPU parallelization. For appropriate sparsity patterns we extend previous work on distributed-memory parallel spMVM to demonstrate a scalable hybrid MPI-GPGPU code, achieving efficient overlap of communication and computation.
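The memory argument in the abstract can be illustrated with a small sketch. It assumes, based on the format's name, that ELLPACK-style storage pads every row to the length of the longest row, while a padded jagged diagonals layout sorts rows by length and pads only within fixed-size blocks of rows. The function names, the block size of 32 rows, and the sample row lengths are hypothetical; the auxiliary arrays of either format (such as the row-length array of ELLPACK-R) are not counted.

```python
def ellpack_slots(row_lengths):
    """Slots stored when every row is padded to the longest row (ELLPACK-style)."""
    if not row_lengths:
        return 0
    return len(row_lengths) * max(row_lengths)


def pjds_slots(row_lengths, block_rows=32):
    """Slots stored when rows are sorted by length and padded per block.

    Assumption: block_rows=32 and a partial last block is not rounded up;
    both are illustrative choices, not taken from the paper.
    """
    ordered = sorted(row_lengths, reverse=True)
    total = 0
    for start in range(0, len(ordered), block_rows):
        block = ordered[start:start + block_rows]
        total += len(block) * block[0]  # block[0] is the longest row in the block
    return total


# Hypothetical sparsity pattern: 32 longer rows and 96 very short rows.
row_lengths = [50] * 32 + [1] * 96
ell = ellpack_slots(row_lengths)
pjds = pjds_slots(row_lengths)
saving = 1 - pjds / ell
print(ell, pjds, round(saving, 3))
```

The saving in this toy case comes entirely from the made-up row lengths and is not meant to reproduce the figures reported in the abstract. A matrix whose rows all have similar lengths would show little or no difference between the two counts.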
elib URL of the record: | https://elib.dlr.de/75140/
---|---
Document type: | Conference contribution (talk)
Title: | Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation
Authors: | Kreutzer, Moritz; Hager, Georg; Wellein, Gerhard; Fehske, Holger; Basermann, Achim; Bishop, Alan R.
Date: | 2012
Published in: | Currently available at IEEE Xplore (online; see the official URL); Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops (IPDPS 2012)
Refereed publication: | Yes
Open Access: | Yes
Gold Open Access: | No
In SCOPUS: | No
In ISI Web of Science: | No
DOI: | 10.1109/IPDPSW.2012.211
Page range: | 1696-1702
Publisher: | IEEE Conference Publications
ISBN: | 978-1-4673-0974-5
Status: | published
Keywords: | Parallel sparse matrix-vector multiplication, multi-core processors, GPGPUs, new storage formats
Event title: | Workshop on Large-Scale Parallel Processing to be held at the IEEE International Parallel and Distributed Processing Symposium 2012
Event location: | Shanghai, China
Event type: | international conference
Event start: | 21 May 2012
Event end: | 25 May 2012
Organizer: | IEEE
HGF - Research field: | Transport and Space (old)
HGF - Program: | Space (old)
HGF - Program topic: | W SY - Space System Technology
DLR - Research focus: | Space
DLR - Research area: | W SY - Space System Technology
DLR - Subarea (project, undertaking): | W - Project SISTEC (old)
Location: | Köln-Porz
Institutes & facilities: | Institut für Simulations- und Softwaretechnik; Institut für Simulations- und Softwaretechnik > Verteilte Systeme und Komponentensoftware
Deposited by: | Basermann, Dr.-Ing. Achim
Deposited on: | 01 Mar 2012 08:08
Last modified: | 24 Apr 2024 19:41