How Can AVX2 and BMI2 Instructions Optimize Left Packing Based on a Mask?

Barbara Streisand
Release: 2024-12-30 13:45:13

Using AVX2 and BMI2 for Efficient Left Packing Based on a Mask

AVX2 provides vpermps (_mm256_permutevar8x32_ps), a lane-crossing variable shuffle that picks each of the eight output floats by a 32-bit index. BMI2 provides pext (Parallel Bits Extract) and its counterpart pdep (Parallel Bits Deposit), which together can turn an 8-bit mask into that index vector without a lookup table.
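To make the bit operations concrete, here is a minimal portable sketch of what pext and pdep compute. The names pext_ref and pdep_ref are hypothetical helpers introduced for illustration; they only model the semantics and are far slower than the hardware instructions.

```cpp
#include <cstdint>

// Reference model of pext: gather the bits of src selected by mask
// into the low bits of the result, keeping their relative order.
inline uint64_t pext_ref(uint64_t src, uint64_t mask) {
    uint64_t result = 0;
    for (uint64_t out_bit = 1; mask != 0; mask &= mask - 1) {
        uint64_t lowest = mask & (~mask + 1);  // isolate the lowest set mask bit
        if (src & lowest) result |= out_bit;
        out_bit <<= 1;
    }
    return result;
}

// Reference model of pdep: scatter the low bits of src to the
// positions of the set bits in mask, keeping their relative order.
inline uint64_t pdep_ref(uint64_t src, uint64_t mask) {
    uint64_t result = 0;
    for (uint64_t in_bit = 1; mask != 0; mask &= mask - 1) {
        uint64_t lowest = mask & (~mask + 1);  // isolate the lowest set mask bit
        if (src & in_bit) result |= lowest;
        in_bit <<= 1;
    }
    return result;
}
```

For example, pdep_ref(0x5A, 0x0101010101010101) places each of the eight mask bits in the lowest bit of its own byte, which is the first thing the left-packing routine needs.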

Algorithm:

  1. Begin with a constant holding the identity indices, one per byte (0x0706050403020100, i.e. [7 6 5 4 3 2 1 0]).
  2. Use pdep to spread the 8 mask bits so that each one lands in the lowest bit of its own byte.
  3. Multiply by 0xFF to replicate each bit across its byte, giving a byte mask of 0xFF or 0x00 per element.
  4. Use pext with this byte mask to extract the wanted index bytes from the identity constant into a contiguous sequence at the low end.
  5. Zero-extend the index bytes to 32-bit integers (vpmovzxbd, _mm256_cvtepu8_epi32).
  6. Use vpermps to shuffle the source according to the 32-bit index vector.
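As a rough illustration of the scalar part of the algorithm (everything before the final shuffle), the sketch below computes the same two intermediate values with a plain loop instead of pdep and pext. PackTrace and trace_left_pack are hypothetical names used only for this example.

```cpp
#include <cstdint>

struct PackTrace {
    uint64_t expanded_mask;   // byte i is 0xFF if bit i of the mask is set, else 0x00
    uint64_t wanted_indices;  // indices of the set bits, one per byte, packed low
};

// Loop-based model of the mask expansion and index extraction.
// Only the low 8 bits of mask are examined.
inline PackTrace trace_left_pack(unsigned mask) {
    PackTrace t{0, 0};
    int out = 0;  // number of index bytes written so far
    for (int i = 0; i < 8; ++i) {
        if (mask & (1u << i)) {
            t.expanded_mask |= uint64_t{0xFF} << (8 * i);
            t.wanted_indices |= uint64_t(i) << (8 * out);
            ++out;
        }
    }
    return t;
}
```

With mask 0x5A (bits 1, 3, 4 and 6 set) the byte mask is 0x00FF00FFFF00FF00 and the packed indices are 0x06040301, so the shuffle moves elements 1, 3, 4 and 6 to the front. The unused upper index bytes are zero, which means the remaining output slots receive element 0.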

Code Implementation:

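#include <stdint.h>
#include <immintrin.h>

// Left-pack the floats of src selected by the low 8 bits of mask.
// Requires AVX2 and BMI2 in 64-bit mode (e.g. -mavx2 -mbmi2 with GCC or Clang).
__m256 compress256(__m256 src, unsigned int mask)
{
  // Spread the mask bits to one per byte, then widen each to 0x00 or 0xFF.
  uint64_t expanded_mask = _pdep_u64(mask, 0x0101010101010101);
  expanded_mask *= 0xFF;

  // Identity shuffle: byte i holds index i.
  const uint64_t identity_indices = 0x0706050403020100;
  // Pack the selected index bytes into the low bytes; the rest become 0.
  uint64_t wanted_indices = _pext_u64(identity_indices, expanded_mask);

  // Zero-extend the 8 index bytes to 8 x 32-bit indices (vpmovzxbd).
  __m128i bytevec = _mm_cvtsi64_si128(wanted_indices);
  __m256i shufmask = _mm256_cvtepu8_epi32(bytevec);

  // Elements past popcount(mask) are copies of src[0].
  return _mm256_permutevar8x32_ps(src, shufmask);
}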
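A typical use of left packing is filtering an array. The sketch below shows the shape of such a loop in portable scalar code so that it runs anywhere; compress8_scalar and keep_greater_than are hypothetical names for this example. In an AVX2 build, the inner mask loop would presumably become _mm256_cmp_ps plus _mm256_movemask_ps, and the stand-in would become _mm256_storeu_ps of the compress256 result followed by advancing the output position by _mm_popcnt_u32(mask).

```cpp
#include <cstddef>
#include <vector>

// Scalar stand-in for compress256 plus a store: writes the selected
// elements of src[0..7] to out and returns how many were written.
inline std::size_t compress8_scalar(const float* src, unsigned mask, float* out) {
    std::size_t n = 0;
    for (int i = 0; i < 8; ++i)
        if (mask & (1u << i)) out[n++] = src[i];
    return n;
}

// Keep only the elements greater than threshold, preserving order.
inline std::vector<float> keep_greater_than(const std::vector<float>& in, float threshold) {
    // 8 elements of slack: a SIMD version stores a full vector each time.
    std::vector<float> out(in.size() + 8);
    std::size_t written = 0;
    std::size_t i = 0;
    for (; i + 8 <= in.size(); i += 8) {
        unsigned mask = 0;
        for (int k = 0; k < 8; ++k)
            if (in[i + k] > threshold) mask |= 1u << k;
        written += compress8_scalar(&in[i], mask, &out[written]);
    }
    // Handle the tail of fewer than 8 elements one at a time.
    for (; i < in.size(); ++i)
        if (in[i] > threshold) out[written++] = in[i];
    out.resize(written);
    return out;
}
```

Because each full-vector store writes eight floats regardless of how many were selected, the output buffer needs slack at the end; the next store simply overwrites the unused tail.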

Advantages:

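  • Uses only immediate constants, so it needs no lookup table and no memory loads.
  • Short and branch-free: a few scalar BMI2 operations followed by a zero-extension and one shuffle.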

Drawbacks:

  • Slower on AMD CPUs before Zen 3, where pdep and pext are implemented in microcode and are much slower than on Intel CPUs or on Zen 3 and later.


source:php.cn