AVX-512

1,133 bytes added, 10:42, 25 August 2020
no edit summary
<span id="VPOPCNT"></span>
==VPOPCNT==
The AVX-512 VPOPCNTDQ extension provides a vector [[Population Count|population count]] instruction that counts the one bits of either 16 32-bit double words (VPOPCNTD) or 8 64-bit quad words, aka bitboards (VPOPCNTQ), in parallel <ref>[https://github.com/WojciechMula/sse-popcount/blob/master/popcnt-avx512-harley-seal.cpp sse-popcount/popcnt-avx512-harley-seal.cpp at master · WojciechMula/sse-popcount · GitHub]</ref> <ref>[[Wojciech Muła]], [http://dblp.uni-trier.de/pers/hd/k/Kurz:Nathan Nathan Kurz], [https://github.com/lemire Daniel Lemire] ('''2016'''). ''Faster Population Counts Using AVX2 Instructions''. [https://arxiv.org/abs/1611.07612 arXiv:1611.07612]</ref> <ref>[https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=VPOPCNTD&expand=4368 Intel® Intrinsics Guide VPOPCNTD]</ref>. The 128-bit and 256-bit forms additionally require AVX-512VL. In the masked forms, a lane whose bit in k is clear is copied from src (mask) or set to zero (maskz).
<pre>
__m128i _mm_mask_popcnt_epi32(__m128i src, __mmask8 k, __m128i a);
__m128i _mm_maskz_popcnt_epi32(__mmask8 k, __m128i a);
__m128i _mm_popcnt_epi32(__m128i a);
__m256i _mm256_mask_popcnt_epi32(__m256i src, __mmask8 k, __m256i a);
__m256i _mm256_maskz_popcnt_epi32(__mmask8 k, __m256i a);
__m256i _mm256_popcnt_epi32(__m256i a);
__m512i _mm512_mask_popcnt_epi32(__m512i src, __mmask16 k, __m512i a);
__m512i _mm512_maskz_popcnt_epi32(__mmask16 k, __m512i a);
__m512i _mm512_popcnt_epi32(__m512i a);

__m128i _mm_mask_popcnt_epi64(__m128i src, __mmask8 k, __m128i a);
__m128i _mm_maskz_popcnt_epi64(__mmask8 k, __m128i a);
__m128i _mm_popcnt_epi64(__m128i a);
__m256i _mm256_mask_popcnt_epi64(__m256i src, __mmask8 k, __m256i a);
__m256i _mm256_maskz_popcnt_epi64(__mmask8 k, __m256i a);
__m256i _mm256_popcnt_epi64(__m256i a);
__m512i _mm512_mask_popcnt_epi64(__m512i src, __mmask8 k, __m512i a);
__m512i _mm512_maskz_popcnt_epi64(__mmask8 k, __m512i a);
__m512i _mm512_popcnt_epi64(__m512i a);
</pre>
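As a minimal sketch of how the 512-bit quad word forms might be used, the following counts the bits of eight bitboards in one step. The helper names <code>popcount8</code> and <code>popcount8_maskz</code> are hypothetical and not part of any library. The scalar branch is an assumed fallback that models the same per-lane result when the compiler is not targeting AVX-512 VPOPCNTDQ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX512F__) && defined(__AVX512VPOPCNTDQ__)
#include <immintrin.h>
#endif

/* Hypothetical helper: population count of eight bitboards, one per lane. */
static void popcount8(const uint64_t bb[8], uint64_t out[8])
{
    assert(bb != NULL && out != NULL);
#if defined(__AVX512F__) && defined(__AVX512VPOPCNTDQ__)
    /* One VPOPCNTQ counts all eight 64-bit lanes in parallel. */
    __m512i v = _mm512_loadu_si512((const void *)bb);
    _mm512_storeu_si512((void *)out, _mm512_popcnt_epi64(v));
#else
    /* Assumed scalar fallback: clear the lowest set bit until none remain. */
    for (size_t i = 0; i < 8; ++i) {
        uint64_t x = bb[i], n = 0;
        while (x) { x &= x - 1; ++n; }
        out[i] = n;
    }
#endif
}

/* Hypothetical helper: zero-masked variant. Lane i is counted only if
   bit i of k is set; all other lanes are written as 0. */
static void popcount8_maskz(uint8_t k, const uint64_t bb[8], uint64_t out[8])
{
    assert(bb != NULL && out != NULL);
#if defined(__AVX512F__) && defined(__AVX512VPOPCNTDQ__)
    __m512i v = _mm512_loadu_si512((const void *)bb);
    _mm512_storeu_si512((void *)out,
                        _mm512_maskz_popcnt_epi64((__mmask8)k, v));
#else
    popcount8(bb, out);
    for (size_t i = 0; i < 8; ++i)
        if (!((k >> i) & 1u))
            out[i] = 0;
#endif
}
```

With GCC or Clang, the intrinsic path is selected when compiling with <code>-mavx512f -mavx512vpopcntdq</code>. Without those flags the scalar branch is used, so the sketch also builds on targets without the extension.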
=See also=
