Why a Program Slows Down When Looping Over 8192 Elements
Consider the following code snippet:
    #define SIZE 8192

    float img[SIZE][SIZE]; // input image
    float res[SIZE][SIZE]; // result of mean filter

    int main() {
        // Initialization
        for (int i = 0; i < SIZE; i++) {
            for (int j = 0; j < SIZE; j++) {
                img[j][i] = (2 * j + i) % 8196;
            }
        }

        // Matrix processing - applying mean filter
        for (int i = 1; i < SIZE - 1; i++) {
            for (int j = 1; j < SIZE - 1; j++) {
                res[j][i] = 0;
                for (int k = -1; k < 2; k++) {
                    for (int l = -1; l < 2; l++) {
                        res[j][i] += img[j + l][i + k];
                    }
                }
                res[j][i] /= 9;
            }
        }
    }
The execution time of this code varies with the value of SIZE: it runs noticeably slower when SIZE is exactly 8192 than it does with neighboring values.
Understanding the Issue
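The difference in execution times can be attributed to a known issue called super-alignment. With SIZE = 8192, each row of floats occupies 8192 × 4 = 32768 bytes, a large power of two, so elements in the same column of successive rows sit at addresses that differ by an exact multiple of that power of two. On typical set-associative caches, such addresses compete for the same cache sets, which leads to extra conflict misses.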
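To make the super-alignment effect concrete, the sketch below maps the byte offset of each row's first element to a cache set and counts how many distinct sets a walk down one column touches. The cache geometry (64-byte lines, 64 sets) and the helper names `cache_set` and `distinct_sets_in_column` are assumptions chosen for illustration; they do not describe any particular CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache geometry, for illustration only:
   64-byte lines and 64 sets (for example, a 32 KiB 8-way cache). */
#define LINE_BYTES 64u
#define NUM_SETS   64u

/* Cache set that a byte offset maps to under the assumed geometry. */
unsigned cache_set(size_t byte_offset) {
    return (unsigned)((byte_offset / LINE_BYTES) % NUM_SETS);
}

/* Counts the distinct cache sets touched by column 0 of the first
   `rows` rows of a float matrix that has `size` elements per row. */
unsigned distinct_sets_in_column(size_t size, size_t rows) {
    unsigned char seen[NUM_SETS] = {0};
    unsigned count = 0;

    assert(size > 0);
    for (size_t r = 0; r < rows; r++) {
        /* Row r starts r * size floats after the start of the array. */
        unsigned s = cache_set(r * size * sizeof(float));
        if (!seen[s]) {
            seen[s] = 1;
            count++;
        }
    }
    return count;
}
```

Under these assumptions, `distinct_sets_in_column(8192, 8192)` returns 1, while the same call with 8191 or 8193 for both arguments returns 64. With a row length of 8192, every element of a column lands in the same set; with the neighboring sizes, the column is spread over all sets.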
Memory Management
The arrays in this code are statically allocated, so malloc/free is not directly responsible for the performance difference.
Outer Loop Ordering
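Another key issue in this code is the order of the outer loops. C stores two-dimensional arrays in row-major order, so the elements of one row are contiguous in memory. The original code makes j (the row index) the inner loop variable, which walks the matrix column-wise and jumps a full row ahead on every step. Row-wise iteration touches adjacent addresses and is therefore more efficient for memory access.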
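The difference between the two traversal orders can be expressed as the byte distance between consecutive accesses. The following is a minimal sketch; the helper names `offset_of`, `stride_inner_row_index`, and `stride_inner_col_index` are hypothetical and only serve this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N 8192 /* same row length as SIZE in the article */

/* Byte offset of element [row][col] from the start of a float[N][N]
   array, following C's row-major layout. */
size_t offset_of(size_t row, size_t col) {
    assert(row < N && col < N);
    return (row * N + col) * sizeof(float);
}

/* Distance between consecutive accesses when the inner loop advances
   the row index j (the original, column-wise order). */
size_t stride_inner_row_index(void) {
    return offset_of(2, 1) - offset_of(1, 1);
}

/* Distance between consecutive accesses when the inner loop advances
   the column index i (the interchanged, row-wise order). */
size_t stride_inner_col_index(void) {
    return offset_of(1, 2) - offset_of(1, 1);
}
```

The column-wise stride is `N * sizeof(float)` bytes, a full row, while the row-wise stride is `sizeof(float)` bytes, so in the row-wise order several consecutive accesses fall within one cache line.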
Solution
To alleviate the performance issue, the outer loops should be interchanged:
    for (int j = 1; j < SIZE - 1; j++) {
        for (int i = 1; i < SIZE - 1; i++) {
            res[j][i] = 0;
            for (int k = -1; k < 2; k++) {
                for (int l = -1; l < 2; l++) {
                    res[j][i] += img[j + l][i + k];
                }
            }
            res[j][i] /= 9;
        }
    }
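Interchanging the loops is safe because each output element depends only on the input image, not on other output elements. The sketch below checks this on a small matrix by running both loop orders and comparing the results byte for byte. The size `N = 64`, the array names, and the helper functions are assumptions made for this check and are not part of the original program.

```c
#include <assert.h>
#include <string.h>

#define N 64 /* small hypothetical size so the check runs instantly */

static float src[N][N];     /* input image */
static float out_col[N][N]; /* result of the column-wise order */
static float out_row[N][N]; /* result of the row-wise order */

/* Same initialization formula as in the article. */
void fill_input(void) {
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            src[j][i] = (float)((2 * j + i) % 8196);
}

/* Mean of the 3x3 neighborhood around interior element (j, i). */
float mean3x3(int j, int i) {
    float sum = 0;
    assert(j >= 1 && j < N - 1 && i >= 1 && i < N - 1);
    for (int k = -1; k < 2; k++)
        for (int l = -1; l < 2; l++)
            sum += src[j + l][i + k];
    return sum / 9;
}

/* Original order: i outer, j inner (column-wise traversal). */
void filter_column_wise(void) {
    for (int i = 1; i < N - 1; i++)
        for (int j = 1; j < N - 1; j++)
            out_col[j][i] = mean3x3(j, i);
}

/* Interchanged order: j outer, i inner (row-wise traversal). */
void filter_row_wise(void) {
    for (int j = 1; j < N - 1; j++)
        for (int i = 1; i < N - 1; i++)
            out_row[j][i] = mean3x3(j, i);
}

/* Returns 1 if both loop orders produce identical output. */
int results_match(void) {
    fill_input();
    filter_column_wise();
    filter_row_wise();
    return memcmp(out_col, out_row, sizeof out_col) == 0;
}
```

Calling `results_match()` returns 1: only the order in which the elements are visited changes, not the values computed.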
Performance Improvement
After interchanging the outer loops, performance improves significantly.