Why does changing 0.1f to 0 slow a float array loop down by 10x?
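The performance difference comes from how processors handle denormal (also called subnormal) floating-point numbers. Denormals are the values closer to zero than the smallest normalized number of the type (about 1.18e-38 for a 32-bit float). They fill the gap between that number and zero at reduced precision.
The slow case is the one that adds and subtracts 0, not the one that uses 0.1f. Each iteration of the loop multiplies y[i] by x[i] and divides it by z[i], and since every x[i] is smaller than the matching z[i], the values shrink steadily until they fall into the denormal range. Adding and subtracting 0 leaves them there, so every later operation works on denormals. Operations on denormals are typically much slower than on normalized numbers because many processors cannot handle them in the regular floating-point hardware and fall back to microcode.
Adding 0.1f and then subtracting it again keeps the values out of the denormal range. A float near 0.1 has a spacing of roughly 7.5e-9 between neighboring values, so once y[i] is far smaller than that, y[i] + 0.1f rounds to exactly 0.1f and the subtraction returns exactly 0. Zero is not a denormal, so the loop keeps running at full speed.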
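To see what adding and then subtracting a constant does to a very small value, here is a minimal sketch. It is not part of the original benchmark: the helper names is_subnormal and add_then_subtract are made up for illustration, and it assumes a build with default floating-point settings (no -ffast-math, flush-to-zero off).

```c
#include <float.h>

/* Hypothetical helper: returns 1 if v is a nonzero value below the smallest
   normalized float, i.e. a denormal. The standard alternative is
   fpclassify(v) == FP_SUBNORMAL from <math.h>. */
int is_subnormal(float v) {
    float magnitude = v < 0.0f ? -v : v;
    return magnitude > 0.0f && magnitude < FLT_MIN;
}

/* Hypothetical helper: computes (v + c) - c in float precision.
   The volatile store forces the intermediate sum to be rounded to float. */
float add_then_subtract(float v, float c) {
    volatile float sum = v + c;
    return sum - c;
}
```

With these helpers, is_subnormal(1e-40f) reports a denormal, add_then_subtract(1e-40f, 0.0f) hands the same denormal back, and add_then_subtract(1e-40f, 0.1f) yields exactly 0.0f because the intermediate sum rounds to 0.1f.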
To demonstrate the performance impact of denormal numbers, consider the following code:
const float x[16] = { 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8,
                      1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6 };
const float z[16] = { 1.123, 1.234, 1.345, 156.467, 1.578, 1.689, 1.790, 1.812,
                      1.923, 2.034, 2.145, 2.256, 2.367, 2.478, 2.589, 2.690 };
float y[16];
for (int i = 0; i < 16; i++) {
    y[i] = x[i];
}
for (int j = 0; j < 9000000; j++) {
    for (int i = 0; i < 16; i++) {
        y[i] *= x[i];
        y[i] /= z[i];
        y[i] = y[i] + 0.1f; // <--
        y[i] = y[i] - 0.1f; // <--
    }
}
Here, the two marked lines add and then subtract 0.1f, and this version runs fast. Replacing 0.1f with 0 on both lines makes the loop roughly 10 times slower, because the values in y then decay into the denormal range and stay there.
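One way to see which variant of the loop produces denormals, without relying on a timer, is to count how many elements of y end up denormal. The sketch below is a hedged variation on the loop above, not the original benchmark: the function name count_subnormals_after and its parameters c and iterations are assumptions introduced here. It presumes round-to-nearest, flush-to-zero off, and a target that evaluates float expressions in float precision, such as x86-64 with SSE.

```c
#include <float.h>

/* Hypothetical helper: runs the loop above with the constant c in place of
   0.1f and returns how many of the 16 results are denormal afterwards. */
int count_subnormals_after(float c, int iterations) {
    const float x[16] = { 1.1f, 1.2f, 1.3f, 1.4f, 1.5f, 1.6f, 1.7f, 1.8f,
                          1.9f, 2.0f, 2.1f, 2.2f, 2.3f, 2.4f, 2.5f, 2.6f };
    const float z[16] = { 1.123f, 1.234f, 1.345f, 156.467f, 1.578f, 1.689f,
                          1.790f, 1.812f, 1.923f, 2.034f, 2.145f, 2.256f,
                          2.367f, 2.478f, 2.589f, 2.690f };
    float y[16];
    for (int i = 0; i < 16; i++) {
        y[i] = x[i];
    }
    for (int j = 0; j < iterations; j++) {
        for (int i = 0; i < 16; i++) {
            y[i] *= x[i];
            y[i] /= z[i];
            y[i] = y[i] + c;
            y[i] = y[i] - c;
        }
    }
    /* Count results that are nonzero but below the smallest normalized float. */
    int count = 0;
    for (int i = 0; i < 16; i++) {
        float magnitude = y[i] < 0.0f ? -y[i] : y[i];
        if (magnitude > 0.0f && magnitude < FLT_MIN) {
            count++;
        }
    }
    return count;
}
```

Under those assumptions, count_subnormals_after(0.0f, 20000) should return a nonzero count, while count_subnormals_after(0.1f, 20000) should return 0.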
To avoid the performance impact of denormal numbers, you can flush them to zero by calling _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON); before the loop. This macro from <xmmintrin.h> sets the flush-to-zero bit in the SSE control register, so any result that would have been denormal is replaced by zero instead. With it enabled, the version that uses 0 should no longer suffer the slowdown. The trade-off is that results very close to zero lose their gradual underflow, which deviates from strict IEEE 754 behavior.