Matrix-matrix multiplication on graphics processing unit platform using tiling technique

Rahman Ghasempour Balagafshe; Alireza Akoushideh; Asadollah Shahbahrami

doi:10.11591/ijeecs.v28.i2.pp1012-1019

Abstract

Today’s hardware platforms offer parallel processing capabilities, and many parallel programming models have been developed for them. Efficient implementations of compute-intensive applications on these platforms therefore merit study. Dense matrix-matrix multiplication is an important kernel used in many applications, and it is computationally intensive, especially for large matrices. To improve the performance of this kernel, we implement it on the graphics processing unit (GPU) platform using the tiling technique with different tile sizes. Our experimental results show that the tiling approach reduces execution time by 56.89% (2.32× faster) compared with the straightforward (STF) implementation, and that a tile size of 32 is faster than tile sizes of 8 and 16.
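The tiling technique named in the abstract can be illustrated with a minimal CUDA sketch. It is not the authors' code: the kernel name `matmulTiled`, the host wrapper `matmulTiledHost`, the constant `TILE`, and the restriction to square row-major matrices are all assumptions made for illustration, and error checking of the CUDA calls is omitted for brevity. Each thread block stages one tile of A and one tile of B in shared memory, so each global-memory element is read once per tile rather than once per multiply-add.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Assumed tile width; the abstract compares tile sizes 8, 16, and 32.
constexpr int TILE = 32;

// Hypothetical kernel: each thread computes one element of C = A * B
// for square n x n row-major matrices.
__global__ void matmulTiled(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];  // tile of A staged in shared memory
    __shared__ float Bs[TILE][TILE];  // tile of B staged in shared memory

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    int numTiles = (n + TILE - 1) / TILE;
    for (int t = 0; t < numTiles; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        // Out-of-range elements are padded with zero so n need not be a multiple of TILE.
        As[threadIdx.y][threadIdx.x] = (row < n && aCol < n) ? A[row * n + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (bRow < n && col < n) ? B[bRow * n + col] : 0.0f;
        __syncthreads();  // wait until both tiles are fully loaded

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // wait before the tiles are overwritten
    }
    if (row < n && col < n) C[row * n + col] = acc;
}

// Hypothetical host wrapper: copies inputs to the GPU, launches the kernel,
// and returns the product as a row-major vector.
std::vector<float> matmulTiledHost(const std::vector<float>& A,
                                   const std::vector<float>& B, int n) {
    size_t bytes = static_cast<size_t>(n) * n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, A.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    matmulTiled<<<grid, block>>>(dA, dB, dC, n);

    std::vector<float> C(static_cast<size_t>(n) * n);
    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(C.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return C;
}
```

Changing `TILE` to 8 or 16 gives the other configurations mentioned in the abstract; a value of 32 uses 1024 threads per block, which is the per-block limit on current CUDA devices.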

Keywords

Dense; Matrix-matrix multiplication; CUDA; Shared memory; Tiling

DOI: http://doi.org/10.11591/ijeecs.v28.i2.pp1012-1019

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
