Stockfish NNUE

635 bytes added, 12:40, 25 October 2020
=NNUE Structure=
The [[Neural Networks|neural network]] consists of four layers. The input layer is heavily overparametrized, feeding in the [[Board Representation|board representation]] for all king placements per side<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=1 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], July 23, 2020</ref>. The so-called '''HalfKP''' structure consists of two halves covering the input layer and the first hidden layer, each half associated with one of the two [[King|kings]]. For each placement of the white or black king, the 10 non-king pieces (five piece types of either color) on their particular squares are the boolean {0,1} inputs, along with a relic from Shogi piece drops (BONA_PIECE_ZERO), giving 64 x (64 x 10 + 1) = 41,024 inputs for each half. These are multiplied by a 16-bit integer weight vector for 256 outputs per half, in total 256 x 41,024 = 10,502,144 weights. As emphasized by [[Ronald de Man]] in a [[CCC]] forum discussion <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75506&start=7 Re: NNUE Question - King Placements] by [[Ronald de Man|syzygy]], [[CCC]], October 23, 2020</ref>, the input weights are arranged in such a way that [[Color Flipping|color flipped]] king-piece configurations in both halves share the same index. However, and this also seems to be a relic from Shogi with its [https://en.wikipedia.org/wiki/Rotational_symmetry 180 degrees rotational] 9x9 board symmetry, instead of [[Vertical Flipping|vertical flipping]] (xor 56), [[Flipping Mirroring and Rotating#Rotationby180degrees|rotation]] was applied (xor 63) <ref>[https://github.com/official-stockfish/Stockfish/issues/3021 NNUE eval rotate vs mirror · Issue #3021 · official-stockfish/Stockfish · GitHub] by [[Terje Kirstihagen]], August 17, 2020</ref>.
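The input counts and the shared-index property can be illustrated with a minimal sketch. The index layout used here (king square times 641, plus one slot for BONA_PIECE_ZERO, plus piece kind times 64, plus piece square) and the piece ordering are assumptions chosen for illustration, not the layout used in the Stockfish source; the names <code>orient</code> and <code>halfkp_index</code> are hypothetical. Squares are assumed to be numbered a1 = 0 to h8 = 63.

```python
# Sketch of HalfKP input sizing and a hypothetical feature index.
# The index layout and piece ordering are illustrative assumptions.

NUM_SQUARES = 64
NUM_NON_KING_PIECES = 10      # 5 piece types x 2 colors
HALF_DIMENSIONS = 256         # outputs per half

SLOTS_PER_KING_SQUARE = NUM_SQUARES * NUM_NON_KING_PIECES + 1   # 641, "+ 1" is BONA_PIECE_ZERO
INPUTS_PER_HALF = NUM_SQUARES * SLOTS_PER_KING_SQUARE           # 41,024
INPUT_WEIGHTS = HALF_DIMENSIONS * INPUTS_PER_HALF               # 10,502,144

WHITE, BLACK = 0, 1


def orient(perspective, sq):
    """Map a square into the given perspective; black uses the 180 degree rotation (xor 63)."""
    return sq ^ 63 if perspective == BLACK else sq


def halfkp_index(perspective, ksq, piece_color, piece_type, sq):
    """Hypothetical feature index for (king square, piece, piece square).

    piece_type is 0..4 for the five non-king piece types (ordering assumed).
    Own pieces come first, opponent pieces second, relative to the perspective.
    """
    relative_color = piece_color ^ perspective
    kind = relative_color * 5 + piece_type
    return (orient(perspective, ksq) * SLOTS_PER_KING_SQUARE
            + 1                       # skip the BONA_PIECE_ZERO slot
            + kind * NUM_SQUARES
            + orient(perspective, sq))


# White king e1 (4) with a white knight on g1 (6), seen from white,
# versus the rotated configuration black king d8 (59), black knight b8 (57), seen from black.
white_view = halfkp_index(WHITE, 4, WHITE, 1, 6)
black_view = halfkp_index(BLACK, 59, BLACK, 1, 57)
```

With this layout both configurations map to the same index, so both halves can share one weight table.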
The efficiency of [[NNUE]] is due to the [[Incremental Updates|incremental update]] of the input layer outputs in [[Make Move|make]] and [[Unmake Move|unmake move]], where only a tiny fraction of its neurons need to be considered <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=1 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], July 23, 2020</ref> in case of non-king moves. The remaining three layers with 2x256x32, 32x32 and 32x1 weights are computationally less expensive. Hidden layers 1 and 2 apply a clipped [https://en.wikipedia.org/wiki/Rectifier_(neural_networks) ReLU activation] <ref>[https://github.com/official-stockfish/Stockfish/blob/master/src/nnue/architectures/halfkp_256x2-32-32.h#L42 Stockfish/halfkp_256x2-32-32.h at master · official-stockfish/Stockfish · GitHub]</ref> <ref>[https://github.com/official-stockfish/Stockfish/blob/master/src/nnue/layers/clipped_relu.h#L82 Stockfish/clipped_relu.h at master · official-stockfish/Stockfish · GitHub]</ref>. All layers are best calculated using appropriate [[SIMD and SWAR Techniques|SIMD instructions]] performing fast [[Byte|8-bit]]/[[Word|16-bit]] integer vector arithmetic, like [[MMX]], [[SSE2]] or [[AVX2]] on [[x86]]/[[x86-64]], or, if available, [[AVX-512]].
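The incremental update can be sketched as follows, assuming the first-layer weights are stored as one vector per input feature. The function names and the toy sizes (100 features, 8 outputs instead of 41,024 and 256) are hypothetical and only serve to keep the example small.

```python
# Sketch of a full refresh versus an incremental accumulator update.
# Sizes and names are illustrative assumptions, not Stockfish's.
import random


def refresh_accumulator(weights, biases, active_features):
    """Recompute one accumulator half from scratch: biases plus all active weight vectors."""
    acc = list(biases)
    for f in active_features:
        for i, w in enumerate(weights[f]):
            acc[i] += w
    return acc


def update_accumulator(acc, weights, removed, added):
    """Update an accumulator half after a non-king move: subtract removed, add new features."""
    new_acc = list(acc)
    for f in removed:
        for i, w in enumerate(weights[f]):
            new_acc[i] -= w
    for f in added:
        for i, w in enumerate(weights[f]):
            new_acc[i] += w
    return new_acc


# Toy demonstration with random weights.
rng = random.Random(0)
NUM_FEATURES, OUTPUTS = 100, 8
weights = [[rng.randint(-50, 50) for _ in range(OUTPUTS)] for _ in range(NUM_FEATURES)]
biases = [rng.randint(-50, 50) for _ in range(OUTPUTS)]

before_move = {3, 17, 42}   # active feature indices before the move
after_move = {3, 17, 58}    # one piece moved: feature 42 turned off, 58 turned on

acc = refresh_accumulator(weights, biases, before_move)
incremental = update_accumulator(acc, weights, removed={42}, added={58})
full = refresh_accumulator(weights, biases, after_move)
```

A quiet non-king move touches two weight vectors per half instead of every active feature. A king move changes the king square of one half, so that half presumably needs a full refresh.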
[[FILE:StockfishNNUELayers.png|none|border|text-bottom|1024px]]
NNUE layers in action <ref>Image courtesy Roman Zhukov, revised version of the image posted in [http://talkchess.com/forum3/viewtopic.php?f=2&t=74059&start=139 Re: Stockfish NN release (NNUE)] by Roman Zhukov, [[CCC]], June 17, 2020, labels corrected October 23, 2020, see [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75506&start=1 Re: NNUE Question - King Placements] by [[Andrew Grant]], [[CCC]], October 23, 2020</ref>
Explanation by [[Ronald de Man]], who did the Stockfish NNUE port to [[CFish]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75506&start=9 Re: NNUE Question - King Placements] by [[Ronald de Man|syzygy]], [[CCC]], October 23, 2020</ref>:

The accumulator has a "white king" half and a "black king" half, where each half is a 256-element vector of 16-bit ints, which is equal to the sum of the weights of the "active" (pt, sq, ksq) features plus a 256-element vector of 16-bit biases.

The "transform" step of the NNUE evaluation forms a 512-element vector of 8-bit ints where the first half is formed from the 256-element vector of the side to move and the second half is formed from the 256-element vector of the other side. In this step the 16-bit elements are clipped/clamped to a value from 0 to 127. This is the output of the input layer.

This 512-element vector of 8-bit ints is then multiplied by a 32x512 matrix of 8-bit weights to get a 32-element vector of 32-bit ints, to which a vector of 32-bit biases is added. The sum vector is divided by 64 and clipped/clamped to a 32-element vector of 8-bit ints from 0 to 127. This is the output of the first hidden layer.

The resulting 32-element vector of 8-bit ints is multiplied by a 32x32 matrix of 8-bit weights to get a 32-element vector of 32-bit ints, to which another vector of 32-bit biases is added. These ints are again divided by 64 and clipped/clamped to 32 8-bit ints from 0 to 127. This is the output of the second hidden layer.

This 32-element vector of 8-bit ints is then multiplied by a 1x32 matrix of 8-bit weights (i.e. the inner product of two vectors is taken). This produces a 32-bit value to which a 32-bit bias is added. This gives the output of the output layer.

The output of the output layer is divided by FV_SCALE = 16 to produce the NNUE evaluation. SF's evaluation then takes some further steps such as adding a Tempo bonus (even though the NNUE evaluation inherently already takes into account the side to move in the "transform" step) and scaling the evaluation towards zero as rule50_count() approaches 50 moves.
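The integer arithmetic of the layers described above can be sketched in plain Python. This is a minimal illustration using tiny toy dimensions (2-element accumulator halves instead of 256, 2 hidden neurons instead of 32). All function names are hypothetical, and the rounding of the divisions (floor for the division by 64, truncation toward zero for FV_SCALE) is an assumption, since the text only says "divided".

```python
# Sketch of the quantized forward pass: transform, two hidden layers, output, FV_SCALE.
# Toy sizes and rounding behavior are assumptions for illustration.

FV_SCALE = 16


def clamp(x, lo=0, hi=127):
    return max(lo, min(hi, x))


def transform(acc_side_to_move, acc_other):
    """Concatenate both accumulator halves, side to move first, clamped to 0..127."""
    return [clamp(v) for v in acc_side_to_move] + [clamp(v) for v in acc_other]


def affine(weights, biases, x):
    """Matrix-vector product plus bias; one weight row per output neuron."""
    return [b + sum(w * v for w, v in zip(row, x)) for row, b in zip(weights, biases)]


def hidden_layer(weights, biases, x):
    """Affine transform, divide by 64, clamp to 0..127 (negative sums clamp to 0 either way)."""
    return [clamp(s // 64) for s in affine(weights, biases, x)]


def div_trunc(a, b):
    """Integer division truncating toward zero (assumed, as in C++ int division)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def evaluate(acc_side_to_move, acc_other, net):
    x = transform(acc_side_to_move, acc_other)
    h1 = hidden_layer(net["w1"], net["b1"], x)
    h2 = hidden_layer(net["w2"], net["b2"], h1)
    out = affine(net["w3"], net["b3"], h2)[0]
    return div_trunc(out, FV_SCALE)


# Toy network: 2 + 2 inputs, 2 neurons per hidden layer, 1 output.
toy_net = {
    "w1": [[1, 1, 1, 1], [2, 0, 0, 0]], "b1": [0, 10],
    "w2": [[64, 0], [0, -64]],          "b2": [0, 0],
    "w3": [[100, 5]],                   "b3": [20],
}
toy_eval = evaluate([200, -5], [64, 10], toy_net)
```

In the toy run the transform yields [127, 0, 64, 10], the hidden layers yield [3, 4] and [3, 0], and the output 320 is scaled to 20. The Tempo bonus and rule50 scaling mentioned above are left out.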
=Network=
Attracted by the new advantages and encouraged by some impressive successes, many developers joined or continued to work on the project. The [[#Source|Official Stockfish]] repository shows that the number of commits and ideas increased significantly after NNUE was merged.
 
=Rotation vs Flip=
Since the 9x9 [[Shogi]] board has a centered king file and [[Castling|castling]] is not known in Shogi, a [[Color Flipping|color flip]] and a [[Flipping Mirroring and Rotating#Rotationby180degrees|180 degree rotation]] differ there only by a [[Horizontal Mirroring|horizontally mirrored]] position from the other side's point of view, with otherwise identical playing options. The NNUE is trained and probed from the side-to-move point of view, where the 180 degree rotation used to flip sides (xor 63 instead of 56) looks rather strange for chess <ref>[https://github.com/official-stockfish/Stockfish/blob/615d98da2447e79ceceae205e0cd4e878115acc3/src/types.h#L323 Stockfish/types.h at 615d98da2447e79ceceae205e0cd4e878115acc3 · official-stockfish/Stockfish · GitHub]</ref>, i.e. color flipping the black king from e8 to d1 rather than e1. Does it consider castling short to the queen side?
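The difference between the two operations can be checked with a few lines of Python, assuming squares are numbered a1 = 0, b1 = 1, ..., h8 = 63. The helper names are hypothetical.

```python
# Sketch comparing vertical flip (xor 56) with 180 degree rotation (xor 63).
# Assumes the square numbering a1 = 0 ... h8 = 63.

FILES = "abcdefgh"


def square(name):
    """Convert a square name such as 'e8' to its index."""
    return (int(name[1]) - 1) * 8 + FILES.index(name[0])


def square_name(sq):
    """Convert a square index back to its name."""
    return FILES[sq & 7] + str((sq >> 3) + 1)


def flip_vertical(sq):
    """Mirror the rank and keep the file: e8 -> e1."""
    return sq ^ 56


def rotate_180(sq):
    """Mirror both rank and file: e8 -> d1."""
    return sq ^ 63


black_king = square("e8")
flipped = square_name(flip_vertical(black_king))   # "e1"
rotated = square_name(rotate_180(black_king))      # "d1"
```

The rotation additionally mirrors the file, which is why a black king on e8 lands on d1 and black's king side is mapped onto white's queen side.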
 
It is somewhat unclear how that rotation rather than flip affects the playing strength <ref>[https://github.com/official-stockfish/Stockfish/issues/3021 NNUE eval rotate vs mirror · Issue #3021 · official-stockfish/Stockfish · GitHub] by [[Terje Kirstihagen]], August 17, 2020</ref> and whether NNUE for chess suffers from [https://en.wikipedia.org/wiki/Associative_visual_agnosia associative visual agnosia].
Maybe Fishtest needs to play many games with color-flipped openings, i.e. 1.e4 e5 and 1.e3 e5 2.e4, to see whether the results differ or not.
In any case, a fix from rotate to flip has to be made on both the producer and the consumer side, and is likely to void some training sessions.
=Suggestions=
