A peculiar observation arose after conducting transposition experiments on matrices of varying sizes: transposing a matrix whose dimension is a power of two (2^n) is computationally more expensive than transposing one whose dimension is 2^n + 1. The discrepancy becomes dramatic when the dimension reaches 512 (versus 513).
The code provided for the transposition operation is as follows:
```cpp
#define SAMPLES 1000
#define MATSIZE 512

#include <time.h>
#include <iostream>

int mat[MATSIZE][MATSIZE];

// Swaps mat[i][j] with mat[j][i] for every (i, j). Note that sweeping the
// full j range swaps each off-diagonal pair twice, so the matrix ends up
// unchanged -- but the memory access pattern is what this benchmark measures.
void transpose()
{
    for (int i = 0; i < MATSIZE; i++)
        for (int j = 0; j < MATSIZE; j++) {
            int aux = mat[i][j];
            mat[i][j] = mat[j][i];
            mat[j][i] = aux;
        }
}

int main()
{
    // initialize matrix
    for (int i = 0; i < MATSIZE; i++)
        for (int j = 0; j < MATSIZE; j++)
            mat[i][j] = i + j;

    clock_t t = clock();
    for (int i = 0; i < SAMPLES; i++)
        transpose();
    clock_t elapsed = clock() - t;

    // average clock ticks per transpose
    std::cout << "Average for a matrix of " << MATSIZE << ": "
              << elapsed / SAMPLES << '\n';
}
```
Altering the MATSIZE macro changes the size of the matrix. Benchmarking the two sizes reveals a stark difference: the 512 case averages markedly more time per transpose than the 513 case.
The reason behind this anomaly lies in cache behavior, specifically conflict misses in a set-associative cache. Here's a breakdown:

- The inner read mat[j][i] walks down a column of the matrix. Consecutive column elements are MATSIZE * sizeof(int) bytes apart in memory: 2048 bytes when MATSIZE is 512.
- Caches are set-associative: the set a cache line lands in is selected by a few middle bits of its address, so two addresses that differ by a multiple of the "critical stride" (cache size divided by associativity) map to the same set.
- When MATSIZE is a power of two, the 2048-byte row stride is itself a power of two, so the column's elements fold onto only a handful of sets. The few ways in each set fill immediately, and lines evict one another before they can be reused (cache thrashing).
- With MATSIZE = 513, the stride is 2052 bytes, which is not a power of two. Successive column elements therefore spread across essentially all of the cache's sets, and lines stay resident long enough to be reused.

Thus the transpose becomes significantly slower when the matrix dimension is a power of two: the power-of-two stride forces the column accesses to contend for the same few cache sets.