Enhancing the Matrix Transpose Operation Using Intel AVX Instruction Set Extension

Ahmed Sherif Zekri

Affiliations
1 Department of Mathematics and Computer Science, Alexandria University, Egypt

Abstract

General-purpose microprocessors are augmented with short-vector instruction extensions so that a single operation can process more than one data element at a time. This type of parallelism is known as data-parallel processing. Many scientific, engineering, and signal processing applications can be formulated as matrix operations. Accelerating these kernel operations on microprocessors, which are the building blocks of large high-performance computing systems, can therefore boost the performance of those applications. In this paper, we consider the acceleration of the matrix transpose operation using the 256-bit Intel Advanced Vector Extensions (AVX) instructions. We present a novel vector-based matrix transpose algorithm and its optimized implementation using AVX instructions. The experimental results on an Intel Core i7 processor demonstrate a speedup of 2.83 over the standard sequential implementation, and a maximum speedup of 1.53 over the GCC library implementation. When the transpose is combined with matrix addition to compute the matrix update B + A^T, where A and B are square matrices, the speedup of our implementation over the sequential algorithm increases to 3.19.
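To make the matrix update B + A^T concrete, the following is a minimal sketch in C, not the paper's algorithm or code. It shows a scalar baseline next to a tiled variant that transposes 4x4 blocks of doubles in 256-bit AVX registers using a widely used unpack/permute pattern. The function names `update_scalar` and `update_blocked`, the row-major layout, the use of `double`, and the restriction to sizes that are multiples of 4 are all assumptions made for illustration. The AVX path is only compiled when the compiler defines `__AVX__` (for example with `-mavx` on GCC); otherwise the same tiling runs in scalar code.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Scalar baseline: out = b + a^T for n x n row-major matrices. */
void update_scalar(const double *a, const double *b, double *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            out[i * n + j] = b[i * n + j] + a[j * n + i];
}

/* Tiled variant (illustrative, hypothetical name): processes 4x4 tiles.
 * Assumes n is a multiple of 4 and that out does not alias a or b. */
void update_blocked(const double *a, const double *b, double *out, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        for (size_t j = 0; j < n; j += 4) {
#if defined(__AVX__)
            /* Rows j..j+3, columns i..i+3 of A: the tile whose transpose
             * belongs at rows i..i+3, columns j..j+3 of the result. */
            const double *src = a + j * n + i;
            __m256d r0 = _mm256_loadu_pd(src);
            __m256d r1 = _mm256_loadu_pd(src + n);
            __m256d r2 = _mm256_loadu_pd(src + 2 * n);
            __m256d r3 = _mm256_loadu_pd(src + 3 * n);

            /* Interleave pairs of rows within each 128-bit lane. */
            __m256d t0 = _mm256_unpacklo_pd(r0, r1);
            __m256d t1 = _mm256_unpackhi_pd(r0, r1);
            __m256d t2 = _mm256_unpacklo_pd(r2, r3);
            __m256d t3 = _mm256_unpackhi_pd(r2, r3);

            /* Combine 128-bit lanes so each register holds one tile column. */
            __m256d c0 = _mm256_permute2f128_pd(t0, t2, 0x20);
            __m256d c1 = _mm256_permute2f128_pd(t1, t3, 0x20);
            __m256d c2 = _mm256_permute2f128_pd(t0, t2, 0x31);
            __m256d c3 = _mm256_permute2f128_pd(t1, t3, 0x31);

            /* Add the matching rows of B and store the result tile. */
            const double *bt = b + i * n + j;
            double *dst = out + i * n + j;
            _mm256_storeu_pd(dst,         _mm256_add_pd(_mm256_loadu_pd(bt),         c0));
            _mm256_storeu_pd(dst + n,     _mm256_add_pd(_mm256_loadu_pd(bt + n),     c1));
            _mm256_storeu_pd(dst + 2 * n, _mm256_add_pd(_mm256_loadu_pd(bt + 2 * n), c2));
            _mm256_storeu_pd(dst + 3 * n, _mm256_add_pd(_mm256_loadu_pd(bt + 3 * n), c3));
#else
            /* Portable fallback: same tiling, scalar arithmetic. */
            for (size_t ii = i; ii < i + 4; ii++)
                for (size_t jj = j; jj < j + 4; jj++)
                    out[ii * n + jj] = b[ii * n + jj] + a[jj * n + ii];
#endif
        }
    }
}
```

Both functions take the same arguments, so `update_scalar(a, b, out, n)` can serve as a reference when checking `update_blocked(a, b, out, n)`. Unaligned loads and stores are used here only to keep the sketch free of alignment requirements; the speedups quoted in the abstract refer to the author's implementation, not to this example.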

Keywords

Matrix Transpose, Vector Instructions, Streaming and Advanced Vector Extensions, Data-Parallel Computations.

AIRCC's International Journal of Computer Science and Information Technology
