Changes

CFish

42 bytes added, 10:30, 25 August 2020
The negative rays use a [[Parallel Prefix Algorithms#Fill Stuff|parallel prefix fill]] with three vector right shifts and ORs to clear all ray bits below the MS1B blockers.
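The fill on a negative ray can be sketched for a single 64-bit lane in plain C. This is a scalar illustration under stated assumptions, not the vectorized code this page describes: the function name <code>negative_ray_attacks_fill</code>, the parameter <code>step</code> (distance in bits between neighbouring ray squares: 1, 7, 8 or 9) and the a1 = bit 0 square mapping are all hypothetical. The sketch also spends one extra shift after the three shift-and-OR steps so that the blocker itself stays in the attack set; the original may arrange this differently.

```c
#include <stdint.h>

/* Hypothetical scalar sketch of one lane. 'ray' holds the squares of a
 * negative ray (all below the slider), 'step' is the bit distance
 * between neighbouring ray squares (assumed to be 1, 7, 8 or 9). */
uint64_t negative_ray_attacks_fill(uint64_t occ, uint64_t ray, int step)
{
    uint64_t o = occ & ray;          /* blockers on this ray only */

    /* Parallel prefix fill: three right shifts and ORs smear every
     * blocker over the next seven squares below it. */
    o |= o >> step;
    o |= o >> (2 * step);
    o |= o >> (4 * step);

    /* Shift once more so the fill covers only squares strictly below
     * a blocker, then clear those squares from the ray. The blocker
     * nearest to the slider (the MS1B) stays attacked. */
    return ray & ~(o >> step);
}
```

For a rook on e4 with blockers on a4 and c4, the west ray <code>0x0F000000</code> and occupancy <code>0x05000000</code> give <code>0x0C000000</code>, that is d4 and c4.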
The eight ray attack sets are ORed together vertically and then twice horizontally to form the final result.
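The reduction of the eight ray sets might look as follows when each vector is modelled as a plain array of four 64-bit lanes. The layout (one array for the positive rays, one for the negative rays) and the name <code>reduce_ray_sets</code> are assumptions made for illustration only.

```c
#include <stdint.h>

/* Hypothetical layout: pos[] and neg[] each model one 256-bit vector
 * holding four 64-bit ray attack sets. */
uint64_t reduce_ray_sets(const uint64_t pos[4], const uint64_t neg[4])
{
    uint64_t v[4];

    /* Vertical OR: combine the two vectors lane by lane (8 -> 4 sets). */
    for (int i = 0; i < 4; i++)
        v[i] = pos[i] | neg[i];

    /* First horizontal OR: fold the upper half onto the lower half
     * (4 -> 2 sets). */
    uint64_t h0 = v[0] | v[2];
    uint64_t h1 = v[1] | v[3];

    /* Second horizontal OR: fold the remaining two lanes (2 -> 1 set). */
    return h0 | h1;
}
```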
The conditionally compiled AVX-512 version takes advantage of the [[AVX-512#VPLZCNT|_mm256_lzcnt_epi64]] <ref>[https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_lzcnt_epi64&expand=5560,5471,3497 _mm256_lzcnt_epi64]</ref> and [[AVX-512#VPTERNLOG|_mm256_ternarylogic_epi64]] <ref>[https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_ternarylogic_epi64&expand=5560,5471,3497,5873 _mm256_ternarylogic_epi64]</ref> intrinsics.
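How the two instructions can help is sketched below with scalar stand-ins for a single 64-bit lane, so the example compiles without AVX-512 support. The helpers <code>lzcnt64</code> and <code>ternlog64</code> are hypothetical emulations that follow the documented per-element behaviour of the instructions (a leading zero count of 64 for a zero input, and a result bit looked up in <code>imm8</code> by the three input bits). How the actual AVX-512 code path uses them is an assumption here, as is the trick of ORing in bit 0 so that an empty blocker set needs no special case.

```c
#include <stdint.h>

/* Scalar stand-in for one lane of a leading zero count (64 for zero). */
static int lzcnt64(uint64_t x)
{
    int n = 0;
    if (x == 0)
        return 64;
    while (!(x & 0x8000000000000000ULL)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Scalar stand-in for one lane of a three-input ternary logic op:
 * the bits of a, b, c form an index (a is the high bit) into imm8. */
static uint64_t ternlog64(uint64_t a, uint64_t b, uint64_t c, unsigned imm8)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i++) {
        unsigned k = (unsigned)((((a >> i) & 1) << 2) |
                                (((b >> i) & 1) << 1) |
                                 ((c >> i) & 1));
        r |= (uint64_t)((imm8 >> k) & 1) << i;
    }
    return r;
}

/* Negative ray attacks without the fill: locate the MS1B blocker with
 * a leading zero count and keep only the ray bits at or above it. */
uint64_t negative_ray_attacks_lzcnt(uint64_t occ, uint64_t ray)
{
    uint64_t o = (occ & ray) | 1;                 /* never zero */
    uint64_t keep = ~0ULL << (63 - lzcnt64(o));   /* bits >= MS1B */
    return ray & keep;
}

/* Three attack sets ORed in one operation (imm8 0xFE = a | b | c). */
uint64_t combine3(uint64_t a, uint64_t b, uint64_t c)
{
    return ternlog64(a, b, c, 0xFE);
}
```

The leading zero count replaces the three shift-and-OR steps of the fill, and a single ternary logic operation can merge three sets where two plain ORs would otherwise be needed.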
Rook and bishop naturally suffer from lower vector utilization and combine some other well-known techniques; for example, the bishop attack getter processes only positive rays by swapping bytes.
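The byte swap idea can be illustrated with a scalar sketch: swapping the bytes of a bitboard flips the board vertically, so the two downward diagonals become upward (positive) rays on the flipped board. Everything named here is hypothetical (<code>bswap64</code>, <code>ray_mask</code>, <code>positive_ray_attacks</code>, <code>bishop_attacks</code>), the square mapping a1 = bit 0 is assumed, and the positive rays are resolved with the simple <code>o ^ (o - 1)</code> idiom on a ray mask, which may differ from the subtraction used in the actual implementation.

```c
#include <stdint.h>

/* Portable byte swap: reverses the rank order of a bitboard. */
static uint64_t bswap64(uint64_t x)
{
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x << 32) | (x >> 32);
}

/* Squares reached from sq by repeated (df, dr) steps on an empty board. */
static uint64_t ray_mask(int sq, int df, int dr)
{
    uint64_t m = 0;
    int f = sq % 8 + df, r = sq / 8 + dr;
    while (f >= 0 && f < 8 && r >= 0 && r < 8) {
        m |= 1ULL << (r * 8 + f);
        f += df;
        r += dr;
    }
    return m;
}

/* Positive ray: the nearest blocker is the LS1B of the masked occupancy.
 * o ^ (o - 1) sets all bits up to and including that blocker. */
static uint64_t positive_ray_attacks(uint64_t occ, uint64_t ray)
{
    uint64_t o = occ & ray;
    return (o ^ (o - 1)) & ray;
}

uint64_t bishop_attacks(int sq, uint64_t occ)
{
    int      fsq  = sq ^ 56;        /* same file, mirrored rank */
    uint64_t focc = bswap64(occ);   /* vertically flipped occupancy */

    /* North-east and north-west rays on the original board. */
    uint64_t up = positive_ray_attacks(occ, ray_mask(sq, 1, 1))
                | positive_ray_attacks(occ, ray_mask(sq, -1, 1));

    /* South-east and south-west rays, processed as upward rays on the
     * flipped board and flipped back afterwards. */
    uint64_t down = positive_ray_attacks(focc, ray_mask(fsq, 1, 1))
                  | positive_ray_attacks(focc, ray_mask(fsq, -1, 1));

    return up | bswap64(down);
}
```

A real attack getter would take the ray masks from precomputed tables instead of calling <code>ray_mask</code> on every lookup; the loop is kept here only to make the sketch self-contained.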