Why Does Looping Over an 8192-Element Array Suddenly Slow Down My Program?

DDD
Release: 2024-12-17 04:47:25

Why the Program Slows Down When Looping Over 8192 Elements

Consider the following code snippet:

#define SIZE 8192
float img[SIZE][SIZE]; // input image
float res[SIZE][SIZE]; // result of mean filter

int main() {
    // Initialization
    for (int i = 0; i < SIZE; i++) {
        for (int j = 0; j < SIZE; j++) {
            img[j][i] = (2 * j + i) % 8196;
        }
    }

    // Matrix processing - applying mean filter
    for (int i = 1; i < SIZE - 1; i++) {
        for (int j = 1; j < SIZE - 1; j++) {
            res[j][i] = 0;
            for (int k = -1; k < 2; k++) {
                for (int l = -1; l < 2; l++) {
                    res[j][i] += img[j + l][i + k];
                }
            }
            res[j][i] /= 9;
        }
    }
}

The running time of this code depends sharply on the value of SIZE, as the measured execution times show:

  • SIZE = 8191: 3.44 seconds
  • SIZE = 8192: 7.20 seconds
  • SIZE = 8193: 3.18 seconds

Understanding the Issue

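The difference in execution times can be attributed to an effect known as super-alignment:

  • When SIZE is a multiple of 2048 (8192 in this case), each row of the matrix spans a multiple of a large power-of-two number of bytes (8192 floats × 4 bytes = 32 KiB per row). Walking down a column then touches addresses that are exactly that far apart, so they tend to map to the same few cache sets and evict one another.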
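The mapping of addresses to cache sets can be sketched with a simplified model. The cache geometry below (64-byte lines, 64 sets, as in a 32 KiB 8-way cache) is an assumption chosen for illustration, not a measurement of any particular CPU, and the names `kLineBytes`, `kNumSets`, `cache_set`, and `distinct_sets_in_column_walk` are hypothetical. Real hardware has several cache levels and may index by physical address, so treat this as a rough picture only.

```cpp
#include <cstddef>

// Assumed cache geometry (hypothetical): 64-byte lines, 64 sets.
constexpr std::size_t kLineBytes = 64;
constexpr std::size_t kNumSets = 64;

// Set index that a byte offset maps to under the assumed geometry.
constexpr std::size_t cache_set(std::size_t byte_offset) {
    return (byte_offset / kLineBytes) % kNumSets;
}

// Number of distinct sets touched when walking down `rows` consecutive rows
// of one column in a float matrix whose rows hold `size` elements.
constexpr std::size_t distinct_sets_in_column_walk(std::size_t size,
                                                   std::size_t rows) {
    bool seen[kNumSets] = {};
    std::size_t count = 0;
    for (std::size_t r = 0; r < rows; ++r) {
        const std::size_t s = cache_set(r * size * sizeof(float));
        if (!seen[s]) {
            seen[s] = true;
            ++count;
        }
    }
    return count;
}

// Compile-time checks, assuming a 4-byte float:
// with SIZE = 8192 every row of a column lands in the same set,
// while SIZE = 8193 spreads 1024 rows over all 64 sets.
static_assert(distinct_sets_in_column_walk(8192, 1024) == 1,
              "power-of-two stride collapses onto one set");
static_assert(distinct_sets_in_column_walk(8193, 1024) == 64,
              "off-by-one stride spreads across all sets");
```

Under this model a column walk with SIZE = 8192 keeps reusing a single set, so its lines are evicted almost immediately, whereas neighbouring sizes spread the same walk over the whole cache.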

Memory Management

The arrays in this snippet are statically allocated globals, so malloc/free is not involved and is not responsible for the performance difference.

Outer Loop Ordering

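Another key issue in this code is the order of the outer loops. C++ stores two-dimensional arrays in row-major order, so res[j][i] and res[j + 1][i] lie SIZE floats apart in memory. The original code makes j the inner of the two outer loops and therefore walks the matrix column by column, jumping a full row between consecutive accesses. Iterating row by row touches adjacent memory instead and makes far better use of the cache.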
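The gap between consecutive accesses can be worked out directly from the row-major layout. The sketch below is illustrative only: the helper names are hypothetical, and the 64-byte cache line is an assumption rather than a property of any specific machine.

```cpp
#include <cstddef>

// Assumed cache line size in bytes (hypothetical, common on x86).
constexpr std::size_t kLineBytes = 64;

// Byte offset of element [row][col] in a row-major float matrix
// with `size` columns.
constexpr std::size_t byte_offset(std::size_t row, std::size_t col,
                                  std::size_t size) {
    return (row * size + col) * sizeof(float);
}

// Column-wise traversal: the inner loop advances the row index.
constexpr std::size_t column_wise_stride(std::size_t size) {
    return byte_offset(1, 0, size) - byte_offset(0, 0, size);
}

// Row-wise traversal: the inner loop advances the column index.
constexpr std::size_t row_wise_stride(std::size_t size) {
    return byte_offset(0, 1, size) - byte_offset(0, 0, size);
}

// Number of consecutive floats served by one cache line when walking a row.
constexpr std::size_t floats_per_line() {
    return kLineBytes / sizeof(float);
}

// For SIZE = 8192 a column-wise step skips a whole row,
// while a row-wise step moves to the neighbouring float.
static_assert(column_wise_stride(8192) == 8192 * sizeof(float),
              "column-wise step spans one full row");
static_assert(row_wise_stride(8192) == sizeof(float),
              "row-wise step spans one element");
```

With a 4-byte float, the column-wise step is 32 KiB and brings in a new cache line on every access, whereas a row-wise walk reuses each fetched line for 16 consecutive elements under the assumed line size.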

Solution

To alleviate the performance issue, the outer loops should be interchanged:

    for (int j = 1; j < SIZE - 1; j++) {
        for (int i = 1; i < SIZE - 1; i++) {
            res[j][i] = 0;
            for (int k = -1; k < 2; k++) {
                for (int l = -1; l < 2; l++) {
                    res[j][i] += img[j + l][i + k];
                }
            }
            res[j][i] /= 9;
        }
    }
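Interchanging the outer loops only changes the order in which cells are visited; each cell is still computed from the same nine inputs. A minimal sketch of that check on a small grid follows. The size `N`, the `Grid` type, the fill pattern, and the function names are all hypothetical and chosen only to keep the example small enough to evaluate at compile time.

```cpp
#include <cstddef>

constexpr std::size_t N = 8;  // small hypothetical size, not the article's 8192

struct Grid {
    float v[N][N];
};

// Fill the grid with a simple deterministic pattern.
constexpr Grid make_input() {
    Grid g{};
    for (std::size_t r = 0; r < N; ++r)
        for (std::size_t c = 0; c < N; ++c)
            g.v[r][c] = static_cast<float>((2 * r + c) % 7);
    return g;
}

// 3x3 mean around one interior cell.
constexpr float mean3x3(const Grid& in, std::size_t r, std::size_t c) {
    float sum = 0.0f;
    for (std::size_t dr = 0; dr < 3; ++dr)
        for (std::size_t dc = 0; dc < 3; ++dc)
            sum += in.v[r + dr - 1][c + dc - 1];
    return sum / 9.0f;
}

// Original ordering: column index outermost, row index innermost.
constexpr Grid filter_column_wise(const Grid& in) {
    Grid out{};
    for (std::size_t c = 1; c < N - 1; ++c)
        for (std::size_t r = 1; r < N - 1; ++r)
            out.v[r][c] = mean3x3(in, r, c);
    return out;
}

// Interchanged ordering: row index outermost, column index innermost.
constexpr Grid filter_row_wise(const Grid& in) {
    Grid out{};
    for (std::size_t r = 1; r < N - 1; ++r)
        for (std::size_t c = 1; c < N - 1; ++c)
            out.v[r][c] = mean3x3(in, r, c);
    return out;
}

// Element-by-element comparison of two grids.
constexpr bool same(const Grid& a, const Grid& b) {
    for (std::size_t r = 0; r < N; ++r)
        for (std::size_t c = 0; c < N; ++c)
            if (a.v[r][c] != b.v[r][c]) return false;
    return true;
}

// Both orderings produce identical output on the test grid.
static_assert(same(filter_column_wise(make_input()),
                   filter_row_wise(make_input())),
              "loop interchange must not change the result");
```

Because the per-cell arithmetic is untouched, the interchange is purely a memory-access optimization, which is why it is safe to apply to the filter.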

Performance Improvement

After interchanging the outer loops, the run time drops by roughly a factor of 9 to 20, and the slowdown specific to SIZE = 8192 disappears:

  • SIZE = 8191: 0.376 seconds
  • SIZE = 8192: 0.357 seconds
  • SIZE = 8193: 0.351 seconds
