Kreutzer, Moritz and Hager, Georg and Wellein, Gerhard and Fehske, Holger and Basermann, Achim and Bishop, Alan R. (2012) Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation. In: So far on IEEE Xplore (online, see URL below); Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops (IPDPS 2012), 1696-1702. IEEE Conference Publications. Workshop on Large-Scale Parallel Processing to be held at the IEEE International Parallel and Distributed Processing Symposium 2012, 2012-05-21 - 2012-05-25, Shanghai, China. doi: 10.1109/IPDPSW.2012.211. ISBN 978-1-4673-0974-5.
Official URL: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6270844
Abstract
Sparse matrix-vector multiplication (spMVM) is the dominant operation in many sparse solvers. We investigate performance properties of spMVM with matrices of various sparsity patterns on the nVidia “Fermi” class of GPGPUs. A new “padded jagged diagonals storage” (pJDS) format is proposed which may substantially reduce the memory overhead intrinsic to the widespread ELLPACK-R scheme while making no assumptions about the matrix structure. In our test scenarios the pJDS format cuts the overall spMVM memory footprint on the GPGPU by up to 70%, and achieves 91% to 130% of the ELLPACK-R performance. Using a suitable performance model we identify performance bottlenecks on the node level that invalidate some types of matrix structures for efficient multi-GPGPU parallelization. For appropriate sparsity patterns we extend previous work on distributed-memory parallel spMVM to demonstrate a scalable hybrid MPI-GPGPU code, achieving efficient overlap of communication and computation.
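The memory saving claimed in the abstract can be illustrated with a small counting sketch. It is not the authors' implementation. It assumes, going beyond what the abstract states, that an ELLPACK-style layout pads every row to the length of the longest row in the matrix, while a pJDS-style layout first sorts rows by descending number of nonzeros and then pads only within blocks of consecutive rows. The function names, the `block_rows` parameter, and the sample row lengths are hypothetical, and the extra row-length array kept by ELLPACK-R is ignored.

```python
def ellpack_slots(row_lengths):
    """Stored entries when every row is padded to the longest row.

    Assumption: ELLPACK-style layout; the row-length array that
    ELLPACK-R keeps in addition is not counted here.
    """
    if not row_lengths:
        return 0
    return len(row_lengths) * max(row_lengths)


def pjds_slots(row_lengths, block_rows=32):
    """Stored entries with sorted rows and per-block padding.

    Assumption: rows are sorted by descending nonzero count, and each
    block of `block_rows` consecutive rows is padded to the length of
    its first (longest) row. Padding the row count itself up to a
    multiple of `block_rows` is left out for simplicity.
    """
    ordered = sorted(row_lengths, reverse=True)
    total = 0
    for start in range(0, len(ordered), block_rows):
        block = ordered[start:start + block_rows]
        total += len(block) * block[0]  # block[0] is the longest row in the block
    return total


def relative_saving(row_lengths, block_rows=32):
    """Fraction of ELLPACK-style slots avoided by the blockwise padding."""
    dense = ellpack_slots(row_lengths)
    if dense == 0:
        return 0.0
    return 1.0 - pjds_slots(row_lengths, block_rows) / dense


# Hypothetical matrix: seven short rows and one long row.
lengths = [1, 1, 1, 1, 1, 1, 1, 8]
print(ellpack_slots(lengths))               # 64
print(pjds_slots(lengths, block_rows=4))    # 36
print(relative_saving(lengths, block_rows=4))
```

In this toy case a single long row forces all eight rows to width 8 in the ELLPACK-style count, whereas the blockwise count pays that width only for the block that contains the long row. The saving depends entirely on how uneven the row lengths are; for a matrix whose rows all have the same length the two counts coincide.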
Item URL in elib: | https://elib.dlr.de/75140/ |
---|---|
Document Type: | Conference or Workshop Item (Speech) |
Title: | Sparse matrix-vector multiplication on GPGPU clusters: A new storage format and a scalable implementation |
Authors: | Kreutzer, Moritz; Hager, Georg; Wellein, Gerhard; Fehske, Holger; Basermann, Achim; Bishop, Alan R. |
Date: | 2012 |
Journal or Publication Title: | So far on IEEE Xplore (online, see URL below); Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops (IPDPS 2012) |
Refereed publication: | Yes |
Open Access: | Yes |
Gold Open Access: | No |
In SCOPUS: | No |
In ISI Web of Science: | No |
DOI: | 10.1109/IPDPSW.2012.211 |
Page Range: | 1696-1702 |
Publisher: | IEEE Conference Publications |
ISBN: | 978-1-4673-0974-5 |
Status: | Published |
Keywords: | Parallel sparse matrix-vector multiplication, multi-core processors, GPGPUs, new storage formats |
Event Title: | Workshop on Large-Scale Parallel Processing to be held at the IEEE International Parallel and Distributed Processing Symposium 2012 |
Event Location: | Shanghai, China |
Event Type: | International Conference |
Event Start Date: | 21 May 2012 |
Event End Date: | 25 May 2012 |
Organizer: | IEEE |
HGF - Research field: | Aeronautics, Space and Transport (old) |
HGF - Program: | Space (old) |
HGF - Program Themes: | W SY - Technik für Raumfahrtsysteme |
DLR - Research area: | Space |
DLR - Program: | W SY - Technik für Raumfahrtsysteme |
DLR - Research theme (Project): | W - Vorhaben SISTEC (old) |
Location: | Köln-Porz |
Institutes and Institutions: | Institute of Simulation and Software Technology; Institute of Simulation and Software Technology > Distributed Systems and Component Software |
Deposited By: | Basermann, Dr.-Ing. Achim |
Deposited On: | 01 Mar 2012 08:08 |
Last Modified: | 24 Apr 2024 19:41 |