Changes


CFish

120 bytes added, 21:40, 24 August 2020
=AVX2 Attacks=
Since May 2020, CFish contains experimental [[AVX2]]/[[AVX-512]] computational [[Sliding Piece Attacks|sliding piece attack]] code by Okuhara
as a memory-saving alternative to [[Magic Bitboards|magic bitboards]] <ref>[https://github.com/syzygy1/Cfish/blob/master/src/avx2-bitboard.h Cfish/avx2-bitboard.h at master · syzygy1/Cfish · GitHub]</ref>. It applies a kind of [[Classical Approach#Branchless|branchless classical approach]].
For instance, the four [[Classical Approach#Positive Rays|positive rays]] and the four [[Classical Approach#Negative Rays|negative rays]] of a [[Queen|queen]]
are each processed as a vector of four [[Bitboards|bitboards]] in one 256-bit ymm register. Both ray vectors are intersected with a vector of broadcast [[Occupancy|occupancies]] to obtain the blockers.
The positive rays are processed with the vector equivalent of [[BMI1#BLSMSK|BLSMSK]], i.e. <code>((x-1) ^ x)</code>, to clear the ray squares above the LS1B blockers,
while the negative rays use a [[Parallel Prefix Algorithms#Fill Stuff|parallel prefix fill]] with three vector right shifts and ors to clear all ray bits below the MS1B blockers.
The eight ray attack sets are then ored together, once vertically and twice horizontally, for the final result.
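
The two ray treatments described above can be sketched in plain scalar C for a single ray. This is only an illustration of the idea, not the Cfish code: the helper names (<code>north_ray</code>, <code>south_ray</code>, <code>positive_ray_attacks</code>, <code>negative_ray_attacks</code>) and the a1 = 0, h8 = 63 square mapping are assumptions made for this example, and the real implementation handles four rays per ymm register.

```c
#include <stdint.h>

typedef uint64_t Bitboard;

/* Hypothetical helpers (not from Cfish): squares strictly north / south
   of sq on the same file, assuming a1 = 0 and h8 = 63. */
static Bitboard north_ray(int sq) { return 0x0101010101010100ULL << sq; }
static Bitboard south_ray(int sq) { return 0x0080808080808080ULL >> (63 - sq); }

/* Positive ray: keep the ray squares up to and including the nearest
   blocker, which is the LS1B of (ray & occ). */
static Bitboard positive_ray_attacks(Bitboard ray, Bitboard occ)
{
    Bitboard blockers = ray & occ;
    /* BLSMSK: all bits up to and including the LS1B.
       With no blockers this yields all ones, so the whole ray is kept. */
    Bitboard upto = blockers ^ (blockers - 1);
    return ray & upto;
}

/* Negative ray: smear the blockers downwards along the ray direction
   with three shift/or steps (1, 2 and 4 ray steps cover up to 7 squares),
   then clear everything strictly below the nearest (MS1B) blocker.
   step is the ray stride in bits, e.g. 8 for south, 9 for south-west. */
static Bitboard negative_ray_attacks(Bitboard ray, unsigned step, Bitboard occ)
{
    Bitboard fill = ray & occ;
    fill |= fill >> step;
    fill |= fill >> (2 * step);
    fill |= fill >> (4 * step);
    /* fill >> step marks the squares behind the nearest blocker. */
    return ray & ~(fill >> step);
}
```

For a slider on d4 (square 27) with blockers on d6 and d8, <code>positive_ray_attacks(north_ray(27), occ)</code> keeps d5 and d6; with blockers on d2 and d1, <code>negative_ray_attacks(south_ray(27), 8, occ)</code> keeps d3 and d2. In both cases the blocker square itself stays in the attack set.
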
The conditionally compiled AVX-512 version takes advantage of the <code>_mm256_lzcnt_epi64</code> <ref>[https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_lzcnt_epi64&expand=5560,5471,3497 _mm256_lzcnt_epi64]</ref> and <code>_mm256_ternarylogic_epi64</code> <ref>[https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_ternarylogic_epi64&expand=5560,5471,3497,5873 _mm256_ternarylogic_epi64]</ref> intrinsics.
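
To illustrate why these two intrinsics are useful here, the following scalar sketch shows how a leading-zero count can replace the parallel prefix fill on a negative ray, and how a three-input boolean function selected by an 8-bit truth table can merge several and/or/andnot steps into one. It is an assumption-laden illustration rather than the Cfish code: the function names are hypothetical, <code>__builtin_clzll</code> is a GCC/Clang builtin standing in for the vector lzcnt, and the truth table <code>0xF4</code> is just one example of a combination that might be fused.

```c
#include <stdint.h>

typedef uint64_t Bitboard;

/* Negative ray via leading-zero count (scalar stand-in for a vector lzcnt).
   The MS1B of the blockers is the nearest blocker on a negative ray. */
static Bitboard negative_ray_attacks_lzcnt(Bitboard ray, Bitboard occ)
{
    Bitboard blockers = ray & occ;
    /* "| 1" avoids calling clz on zero, which is undefined for the builtin;
       it does not change the result because bit 0 has nothing below it. */
    int msb = 63 - __builtin_clzll(blockers | 1);
    Bitboard below = (1ULL << msb) - 1;   /* squares strictly below the MS1B */
    return ray & ~below;
}

/* Scalar emulation of a bitwise three-input function chosen by imm8.
   For every bit position, (a << 2) | (b << 1) | c indexes into imm8,
   which is the semantics documented for the ternary logic intrinsics. */
static uint64_t ternary_logic(uint64_t a, uint64_t b, uint64_t c, unsigned imm8)
{
    uint64_t r = 0;
    for (unsigned idx = 0; idx < 8; ++idx) {
        if (imm8 & (1u << idx)) {
            uint64_t ta = (idx & 4) ? a : ~a;
            uint64_t tb = (idx & 2) ? b : ~b;
            uint64_t tc = (idx & 1) ? c : ~c;
            r |= ta & tb & tc;   /* minterm selected by this table entry */
        }
    }
    return r;
}
```

With this emulation, <code>ternary_logic(a, b, c, 0xF4)</code> equals <code>a | (b & ~c)</code>, the kind of expression that arises when a positive-ray result is combined with a masked negative-ray result in a single step.
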
Rook and bishop naturally suffer from lower vector utilization and combine some other well-known techniques; for example, the bishop attack getter processes only positive rays by swapping bytes.
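
The byte-swap idea can be sketched in scalar C as follows. Swapping the bytes of a bitboard flips it vertically, so a diagonal ray pointing downwards becomes one pointing upwards and can be handled with the cheap LS1B method before flipping back. This is a minimal sketch under assumptions, not the Cfish code: the function names are hypothetical, and how the actual implementation arranges original and swapped boards inside vector registers is not shown here.

```c
#include <stdint.h>

typedef uint64_t Bitboard;

/* Portable 64-bit byte swap, i.e. a vertical flip of the board. */
static uint64_t bswap64(uint64_t x)
{
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x << 32) | (x >> 32);
}

/* Negative diagonal ray handled as a positive ray on the flipped board.
   This works for rays with at most one square per rank (diagonals, files),
   because the flip reverses the rank order but not the bits within a rank. */
static Bitboard negative_ray_attacks_bswap(Bitboard ray, Bitboard occ)
{
    Bitboard r = bswap64(ray);
    Bitboard blockers = r & bswap64(occ);
    /* On the flipped board the nearest blocker is the LS1B. */
    Bitboard attacks = r & (blockers ^ (blockers - 1));
    return bswap64(attacks);
}
```

For a bishop on e5 (assuming a1 = 0), the south-west ray consists of d4, c3, b2 and a1; with blockers on c3 and a1 the function keeps d4 and c3.
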
