Chessprogramming Wiki, User contributions by Smatovic: Perft, revision of 2024-03-16
<hr />
<div>'''[[Main Page|Home]] * [[Board Representation]] * [[Move Generation]] * Perft'''<br />
<br />
'''Perft''' ('''perf'''ormance '''t'''est, move path enumeration)<br/><br />
a [[Debugging|debugging]] function to walk the move generation tree of strictly [[Legal Move|legal moves]] and count all the [[Leaf Node|leaf nodes]] of a certain [[Depth|depth]]; the counts can be compared to [[Perft Results|predetermined values]] and used to isolate [[Engine Testing#bugs|bugs]]. In perft, nodes are only counted at the end, after the last [[Make Move|makemove]]. Thus "higher" [[Terminal Node|terminal nodes]] (e.g. mate or stalemate) are not counted separately; what is counted is the number of move paths of a certain depth. Perft ignores draws by [[Repetitions|repetition]], by the [[Fifty-move Rule|fifty-move rule]] and by [[Material#InsufficientMaterial|insufficient material]]. By recording the time taken for each iteration, it is possible to compare the performance of different move generators, or of the same generator on different machines, though this must be done with caution since there are variations of perft.<br />
<br />
=Perft function= <br />
A simple perft function in [[C]] looks like the following:<br />
<pre><br />
typedef unsigned long long u64;<br />
<br />
u64 Perft(int depth)<br />
{<br />
  MOVE move_list[256];<br />
  int n_moves, i;<br />
  u64 nodes = 0;<br />
<br />
  if (depth == 0)<br />
    return 1ULL;<br />
<br />
  n_moves = GenerateLegalMoves(move_list);<br />
  for (i = 0; i < n_moves; i++) {<br />
    MakeMove(move_list[i]);<br />
    nodes += Perft(depth - 1);<br />
    UndoMove(move_list[i]);<br />
  }<br />
  return nodes;<br />
}<br />
</pre><br />
<br />
=Speed up=<br />
==<span id="Bulk"></span>Bulk-counting==<br />
The above code assumes a legal move generator. The algorithm is short and simple, but it makes and unmakes moves even at the leaves (the ends of branches). Speed can be improved significantly: instead of counting nodes at "depth 0", a legal move generator can exploit the fact that the number of moves generated at "depth 1" is already the exact Perft value for that branch. The last [[Make Move|makemove]]/[[Unmake Move|undomove]] pair can therefore be skipped, which gives much faster results and is a better indicator of raw move generator speed (versus move generator + make/unmake). However, this can cause some confusion when comparing Perft values, and it makes collecting extra information, such as the number of captures and checks, almost impossible. <br />
<br />
<pre><br />
u64 Perft(int depth /* assuming >= 1 */)<br />
{<br />
  MOVE move_list[256];<br />
  int n_moves, i;<br />
  u64 nodes = 0;<br />
<br />
  n_moves = GenerateLegalMoves(move_list);<br />
<br />
  if (depth == 1)<br />
    return (u64) n_moves;<br />
<br />
  for (i = 0; i < n_moves; i++) {<br />
    MakeMove(move_list[i]);<br />
    nodes += Perft(depth - 1);<br />
    UndoMove(move_list[i]);<br />
  }<br />
  return nodes;<br />
}<br />
</pre><br />
<br />
==Pseudo Legal Moves==<br />
To generate strictly legal moves, some programs have to make each move first, call a function to check whether the position is in check, and then undo the move. That makes the above Perft function make and undo every move twice. The code below avoids that problem and runs much faster:<br />
<pre><br />
u64 Perft(int depth)<br />
{<br />
  MOVE move_list[256];<br />
  int n_moves, i;<br />
  u64 nodes = 0;<br />
<br />
  if (depth == 0)<br />
    return 1ULL;<br />
<br />
  n_moves = GenerateMoves(move_list);<br />
  for (i = 0; i < n_moves; i++) {<br />
    MakeMove(move_list[i]);<br />
    if (!IsIncheck())<br />
      nodes += Perft(depth - 1);<br />
    UndoMove(move_list[i]);<br />
  }<br />
  return nodes;<br />
}<br />
</pre><br />
<br />
==Hashing== <br />
Perft can receive another speed boost by [[Hash Table|hashing]] node counts, with a small chance for inaccurate results. Sometimes this is used as a sanity check to make sure the hash table and keys are working correctly.<br />
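Such a hashed Perft can be sketched as follows; the HashKey, HashProbe and HashStore functions are illustrative placeholders, not from any particular engine. Since a node count is only valid for one remaining depth, the depth must be stored and matched along with the [[Zobrist Hashing|zobrist key]]:<br />
<br />
<pre><br />
u64 Perft(int depth)<br />
{<br />
  MOVE move_list[256];<br />
  int n_moves, i;<br />
  u64 nodes = 0;<br />
<br />
  if (depth == 0)<br />
    return 1ULL;<br />
<br />
  /* a stored count may only be reused at the same remaining depth */<br />
  if (HashProbe(HashKey(), depth, &nodes))<br />
    return nodes;<br />
<br />
  n_moves = GenerateLegalMoves(move_list);<br />
  for (i = 0; i < n_moves; i++) {<br />
    MakeMove(move_list[i]);<br />
    nodes += Perft(depth - 1);<br />
    UndoMove(move_list[i]);<br />
  }<br />
<br />
  HashStore(HashKey(), depth, nodes);<br />
  return nodes;<br />
}<br />
</pre><br />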
<br />
=Divide= <br />
The Divide command is often implemented as a variation of Perft, listing all moves and, for each move, the perft of the decremented depth. Some programs already give such "divided" output for Perft. Below is the output of [[Stockfish]] when computing perft 5 for the start position:<br />
<br />
<pre><br />
go perft 5<br />
a2a3: 181046<br />
b2b3: 215255<br />
c2c3: 222861<br />
d2d3: 328511<br />
e2e3: 402988<br />
f2f3: 178889<br />
g2g3: 217210<br />
h2h3: 181044<br />
a2a4: 217832<br />
b2b4: 216145<br />
c2c4: 240082<br />
d2d4: 361790<br />
e2e4: 405385<br />
f2f4: 198473<br />
g2g4: 214048<br />
h2h4: 218829<br />
b1a3: 198572<br />
b1c3: 234656<br />
g1f3: 233491<br />
g1h3: 198502<br />
<br />
Nodes searched: 4865609<br />
</pre><br />
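A Divide routine can be sketched on top of the Perft function above, reusing the same hypothetical move interface; the MoveToString helper used for printing is likewise an assumed name:<br />
<br />
<pre><br />
void Divide(int depth /* assuming >= 1 */)<br />
{<br />
  MOVE move_list[256];<br />
  int n_moves, i;<br />
  u64 nodes, total = 0;<br />
<br />
  n_moves = GenerateLegalMoves(move_list);<br />
  for (i = 0; i < n_moves; i++) {<br />
    MakeMove(move_list[i]);<br />
    nodes = Perft(depth - 1);<br />
    UndoMove(move_list[i]);<br />
    printf("%s: %llu\n", MoveToString(move_list[i]), nodes);<br />
    total += nodes;<br />
  }<br />
  printf("\nNodes searched: %llu\n", total);<br />
}<br />
</pre><br />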
<br />
=Purposes=<br />
Perft is mostly a debugging aid. It exercises mainly the move generator and the make move and unmake move functions, all of which are basic and vital parts of a chess engine. By comparing Perft results against known values, developers can find out whether those functions work correctly. If they do not, the faulty code can be narrowed down quickly: compare the per-move counts branch by branch, then call Perft with a lower depth on the wrong branches, and repeat until the exact positions that give the wrong result are found.<br />
<br />
Other purposes:<br />
* give a quick indication of how fast the move generator and make/unmake functions are, compared with the speed of other engines<br />
* calculate branching factors<br />
* estimate the complexity of chess variants, by comparing branching factors or Perft results at a given depth for their starting positions<br />
<br />
=History= <br />
Supposedly, perft was first implemented within the [[Cobol]] program [[RSCE-1]] by [[Rolf C. Smith#RCSmith|R.C. Smith]], submitted to the [https://en.wikipedia.org/wiki/United_States_Chess_Federation USCF] for evaluation, and subject of a [[Timeline#1978|1978]] [[Computerworld]] article <ref>[http://news.google.com/newspapers?nid=849&dat=19780417&id=h8lOAAAAIBAJ&sjid=DEoDAAAAIBAJ&pg=6180,1080528 Written in Cobol - Program Written as Chess Buff's Research Aid] by Brad Schultz, [[Computerworld]], April 17, 1978, Page 37</ref>. RSCE-1's purpose was not to play chess games, but position analysis: to find forced [[Checkmate|mates]] and to perform a move path enumeration of up to three [[Ply|plies]], with the [[Perft Results|perft(3) result]] of 8,902 from the [[Initial Position|initial position]] already mentioned <ref>[http://www.talkchess.com/forum/viewtopic.php?t=41373 Perft(3) from 1978, with a twist!] by [[Steven Edwards]], [[CCC]], December 08, 2011</ref>. [[Ken Thompson]] may have calculated perft(3) and perft(4) even earlier with [[Belle]]. [[Steven Edwards]] suggested the move path enumeration in 1995 as implemented in [[Spector]] <ref>[https://groups.google.com/d/msg/rec.games.chess.computer/M8V1AzkfOok/YV9lcfOlfgIJ Re: Speed of Move Generator] by [[Steven Edwards]], [[Computer Chess Forums|rgcc]], August 16, 1995</ref> and has since been actively involved in Perft computations, while the term "Perft" was likely coined by a [[Crafty]] command, although its initial implementation did not conform to the above definition <ref>[https://groups.google.com/d/msg/rec.games.chess.computer/2nqtCdHC-r0/ENqomE2u51kJ Re: complete opening tree stats] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], February 05, 1998</ref>.<br />
<br />
In '''December 2003''', [[Albert Bertilsson]] started a distributed project <ref>[https://www.stmintz.com/ccc/index.php?id=335026 Distributed perft project] by [[Albert Bertilsson]], [[CCC]], December 09, 2003</ref> to calculate perft(11) of the [[Initial Position|initial position]], taking over a week to calculate <ref>[https://web.archive.org/web/20061014115710/http://www.albert.nu/programs/dperft/ Distributed Perft Project] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine])</ref> . Exact Perft numbers have been computed and verified up to a depth of 13 by Edwards and are now available in the [https://en.wikipedia.org/wiki/On-Line_Encyclopedia_of_Integer_Sequences On-Line Encyclopedia of Integer Sequences] <ref>[http://oeis.org/A048987 A048987] from [https://en.wikipedia.org/wiki/On-Line_Encyclopedia_of_Integer_Sequences On-Line Encyclopedia of Integer Sequences] (OEIS)</ref> , and are given under [[Initial Position Summary]]. A so far unverified claim for perft('''14''') of 61,885,021,521,585,529,237 was given by [[Peter Österlund]] in '''April 2013''' <ref>[http://talkchess.com/forum/viewtopic.php?topic_view=threads&p=513308&t=47335 Re: Perft(14) estimates thread] by [[Peter Österlund]], [[CCC]], April 02, 2013</ref>, while [[Daniel Shawul]] proposed Perft estimation applying [[Monte-Carlo Tree Search|Monte carlo methods]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47740&start=2 MC methods] by [[Daniel Shawul]], [[CCC]], April 11, 2013</ref> <ref>[[Daniel Shawul|Daniel S. Abdi]] ('''2013'''). ''Monte carlo methods for estimating game tree size''. [https://dl.dropboxusercontent.com/u/55295461/perft/perft.pdf pdf]</ref>. <br />
<span id="15"></span><br />
In '''August 2017''', [[Ankan Banerjee]], who had already confirmed Peter Österlund's perft('''14''') in September 2016 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61119&start=30 Re: yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], September 09, 2016</ref>, computed perft('''15''') of 2,015,099,950,053,364,471,960 with his [[GPU]] perft program <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref>, running it twice for several days each, with different [[Zobrist Hashing|zobrist keys]], on a cluster of [https://en.wikipedia.org/wiki/Nvidia_DGX-1 Nvidia DGX-1] server systems <ref>[https://www.nvidia.com/en-us/data-center/dgx-1/ DGX-1 for AI Research | NVIDIA]</ref>. His program starts exploring the tree in a [[Depth-First|depth first]] manner on the CPU; when a certain depth is reached, a GPU function (kernel) is launched to compute the perft of the subtree in a [[Best-First|breadth first]] manner <ref>[http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017</ref>. Ankan Banerjee dedicated his computations to [[Steven Edwards]], whose tireless efforts at verifying perft(14) encouraged him to confirm perft(14) and take up the challenge of computing perft(15) <ref>[http://www.talkchess.com/forum/viewtopic.php?t=64983&start=4 Re: perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017</ref>.<br />
<br />
=Quotes=<br />
by [[Robert Hyatt]] in a forum post, June 12, 2020 <ref>[http://talkchess.com/forum3/viewtopic.php?f=7&t=74153 Re: Perft speed and depth questions] by [[Mark Buisseret]], [[Computer Chess Forums|CCRL Discussion Board]], June 12, 2020</ref> :<br />
I believe I was the first to use this. Back in the 80's. We rewrote the move generator in Cray Blitz in assembly language. It was a pain to debug. I decided on the "perft" approach solely to test/debug the move generator. We'd run two versions, one FORTRAN, one assembly, and we tested and debugged until they matched.<br />
I carried this over into Crafty as early versions went through several different approaches on move generation. Starting with the Slate/Atkin approach, then rotated bit boards (which took some time to debug), and the magic. It was really intended solely for that purpose. Then several started to use it as a benchmark for speed. I never followed that path since move generation is a very small part of the overall CPU time burned.<br />
Speed here is not so important. I doubt anyone's move generator takes more than 10% of total search time, which means a 20% improvement in perft numbers is only a 2% overall speed gain. I would not worry about anything but matching the node counts exactly...<br />
<br />
=Results= <br />
* [[Perft Results]]<br />
* [[Chess960 Perft Results]]<br />
* [[Chinese Chess Perft Results]]<br />
<br />
=Publications= <br />
* [[Aart Bik]] ('''2012'''). ''Computing Deep Perft and Divide Numbers for Checkers''. [[ICGA Journal#35_4|ICGA Journal, Vol. 35, No. 4]] » [[Checkers]]<br />
* [[Daniel Shawul|Daniel S. Abdi]] ('''2013'''). ''Monte carlo methods for estimating game tree size''. <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47740&topic_view=flat&start=11 Re: MC methods] by [[Daniel Shawul]], [[CCC]], April 13, 2013</ref> » [[Monte-Carlo Tree Search]]<br />
<br />
=Forum Posts= <br />
==1995 ...==<br />
* [https://groups.google.com/d/msg/rec.games.chess.computer/M8V1AzkfOok/YV9lcfOlfgIJ Re: Speed of Move Generator] by [[Steven Edwards]], [[Computer Chess Forums|rgcc]], August 16, 1995 » [[Spector]] <br />
* [https://groups.google.com/d/msg/rec.games.chess.computer/2nqtCdHC-r0/ENqomE2u51kJ Re: complete opening tree stats] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], February 05, 1998 » [[Crafty]] <br />
==2000 ...== <br />
* [https://www.stmintz.com/ccc/index.php?id=107258 Testing speed of "position visiting"] by [[Tom Kerrigan]], [[CCC]], April 23, 2000<br />
* [https://groups.google.com/d/msg/rec.games.chess.computer/ek5fbFf4ajc/lrPxv2kDHgkJ Experiments with crafty perft command] by Guy Macon, [[Computer Chess Forums|rgcc]], December 10, 2000 <br />
* [https://www.stmintz.com/ccc/index.php?id=198498 Who is the champion in calculating perft?] by [[Uri Blass]], [[CCC]], November 22, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=234025 Perft 5,6 {Fastest program is List}] by [[Dann Corbit]], [[CCC]], June 04, 2002 » [[List (Program)|List]]<br />
* [https://www.stmintz.com/ccc/index.php?id=275133 Perft revisited] by [[Normand M. Blais]], [[CCC]], January 05, 2003<br />
* [https://www.stmintz.com/ccc/index.php?id=276596 perft question] by [[Joel Veness]], [[CCC]], January 12, 2003<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=18&t=41318 Perft] by [[Andreas Herrmann]], [[Computer Chess Forums|Winboard Forum]], February 18, 2003<br />
* [https://www.stmintz.com/ccc/index.php?id=326134 Highest perft for initial position?] by [[Albert Bertilsson]], [[CCC]], November 07, 2003<br />
* [https://www.stmintz.com/ccc/index.php?id=335026 Distributed perft project] by [[Albert Bertilsson]], [[CCC]], December 09, 2003<br />
* [https://www.stmintz.com/ccc/index.php?id=334499 Perft(10) verified] by [[Albert Bertilsson]], [[CCC]], December 09, 2003<br />
* [https://www.stmintz.com/ccc/index.php?id=335527 Distributed perft, current standings and trends] by [[Albert Bertilsson]], [[CCC]], December 12, 2003<br />
* [https://www.stmintz.com/ccc/index.php?id=336985 Hashing in distributed perft] by [[Steffen A. Jakob|Steffen Jakob]], [[CCC]], December 19, 2003<br />
* [https://www.stmintz.com/ccc/index.php?id=386249 perft records] by [[Peter Fendrich]], [[CCC]], September 06, 2004<br />
* [https://www.stmintz.com/ccc/index.php?id=388806 perft results (how accurate is accurate enough ?)] by [[Roman Hartmann]], [[CCC]], September 23, 2004<br />
* [https://www.stmintz.com/ccc/index.php?id=394229 FRC Perft] by Jürgen Suntinger, [[CCC]], November 02, 2004<br />
==2005 ...== <br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=1377 perft question] by [[Sven Schüle]], [[Computer Chess Forums|Winboard Forum]], January 19, 2005<br />
* [https://www.stmintz.com/ccc/index.php?id=466491 Perft vs Search Re: Cache size does matter] by [[Brian Richardson]], [[CCC]], December 03, 2005<br />
* [https://www.stmintz.com/ccc/index.php?id=488816 Perft -- Test position and data] by [[Charles Roberson]], [[CCC]], February 23, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=20834 A perft faster than qperft?!] by [[Allard Siemelink]], [[CCC]], April 24, 2008<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=23634 Perft problems...] by [[Chris Tatham]], [[CCC]], September 10, 2008<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=27334 What is perft(x) exactly meaning?] by [[Jouni Uski]], [[CCC]], April 06, 2009<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=29425 Perft and mate] by [[Stefano Gemma]], [[CCC]], August 16, 2009 » [[Freccia]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=30754 Perft and insufficient material] by [[Sven Schüle]], [[CCC]], November 23, 2009<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=34025 Does perft include underpromotion?] by [[Rasjid Chan|Chan Rasjid]], [[CCC]], April 27, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38235 Perft 12 in progress] by [[Steven Edwards]], [[CCC]], February 27, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38862 Perft(12) count confirmed] by [[Steven Edwards]], [[CCC]], April 25, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39678 Perft(13) betting pool] by [[Steven Edwards]], [[CCC]], July 10, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=40108 Fastest perft] by ethan ara, [[CCC]], August 19, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=41373 Perft(3) from 1978, with a twist!] by [[Steven Edwards]], [[CCC]], December 08, 2011 <ref>[http://news.google.com/newspapers?nid=849&dat=19780417&id=h8lOAAAAIBAJ&sjid=DEoDAAAAIBAJ&pg=6180,1080528 Written in Cobol - Program Written as Chess Buff's Research Aid] by Brad Schultz, [[Computerworld]], April 17, 1978, Page 37</ref><br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42512 estimating the number of possible stalemates in perft(n)] by [[Uri Blass]], [[CCC]], February 18, 2012 » [[Stalemate]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42600 Shatranj perfts] by [[Paul Byrne]], [[CCC]], February 24, 2012 » [[Shatranj]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=45099 Perft and en_passant] by [[Harald Lüßen]], [[CCC]], September 11, 2012 » [[En passant]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46004 about perft, what is the proper way of doing it?] by Fred Piche, [[CCC]], November 14, 2012<br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47318 A few positions to test movegen] by [[Martin Sedlak]], [[CCC]], February 24, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47335 Perft(14) estimates thread] by [[Steven Edwards]], [[CCC]], February 26, 2013<br />
: [http://talkchess.com/forum/viewtopic.php?topic_view=threads&p=513308&t=47335 Re: Perft(14) estimates thread] by [[Peter Österlund]], [[CCC]], April 02, 2013 » 61,885,021,521,585,529,237<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47740 Perft(15) estimates thread] by [[Steven Edwards]], [[CCC]], April 10, 2013<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=47740&start=2 MC methods] by [[Daniel Shawul]], [[CCC]], April 11, 2013 » [[Monte-Carlo Tree Search]]<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=47740&topic_view=flat&start=11 Re: MC methods] by [[Daniel Shawul]], [[CCC]], April 13, 2013<br />
* [http://www.chessprogramming.net/computerchess/is-perft-speed-important/ Is Perft Speed Important?] by [[Steve Maughan]], [http://www.chessprogramming.net/ Computer Chess Programming], April 19, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48217 Perft search speed bottleneck] by Jim Jarvis, [[CCC]], June 07, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[GPU]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48423 A perft() benchmark] by [[Steven Edwards]], [[CCC]], June 26, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48491 gperft] by [[Paul Byrne]], [[CCC]], July 01, 2013<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=52965 perft/divide bug in roce38 and Sharper? [SOLVED]] by thedrunkard, [[Computer Chess Forums|Winboard Forum]], October 16, 2013 » [[ROCE]], [[Sharper]]<br />
'''2014'''<br />
* [http://www.open-chess.org/viewtopic.php?f=5&t=2238 Perft and Captures] by [[Christian Daley|CDaley11]], [[Computer Chess Forums|OpenChess Forum]], January 24, 2013 » [[Captures]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=53224 Perft(14) revisited] by [[Steven Edwards]], [[CCC]], August 08, 2014<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=53406 Perft(14) Weekly Status Report] by [[Steven Edwards]], [[CCC]], August 24, 2014<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=53408 Non recursive perft()] by [[Steven Edwards]], [[CCC]], August 24, 2014 » [[Iterative Search]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=53439 OpenCL perft() Technical Issues] by [[Steven Edwards]], [[CCC]], August 26, 2014 » [[OpenCL]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=53743 A method guaranteed to localize the toughest perft() bugs] by [[Steven Edwards]], [[CCC]], September 18, 2014<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54528 FRC / Chess960 Engine with "Divided" Command] by [[Steve Maughan]], [[CCC]], December 02, 2014 » [[Chess960]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54719 Handling integer overflow for certain perft() calculations] by [[Steven Edwards]], [[CCC]], December 22, 2014<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54767 Perft(14) verification] by [[Steven Edwards]], [[CCC]], December 28, 2014<br />
==2015 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54853 Perft(14) Weekly Status Reports for 2015] by [[Steven Edwards]], [[CCC]], January 04, 2015<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=55195 Perft(15)] by [[Steven Edwards]], [[CCC]], February 09, 2015<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=55274 Some Chess960/FRC positions to be confirmed] by [[Reinhard Scharnagl]], [[CCC]], February 09, 2015 » [[Chess960]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=55896 An MPI perft program] by [[Chao Ma]], [[CCC]], April 05, 2015 » [[Parallel Search]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=56523 Deep split perft()] by [[Steven Edwards]], [[CCC]], May 29, 2015 » [[Thread]]<br />
* [http://www.open-chess.org/viewtopic.php?f=5&t=2855 Please comment on my Perft speeds] by ppyvabw, [[Computer Chess Forums|OpenChess Forum]], July 10, 2015<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=56998 100 easy perft(7) test positions] by [[Steven Edwards]], [[CCC]], July 17, 2015<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=57417 Perft using nullmove] by [[Lasse Hansen]], [[CCC]], August 29, 2015<br />
* [http://www.open-chess.org/viewtopic.php?f=5&t=2913 Perft and hash with legal move generator] by [[Izak Pretorius|Peterpan]], [[Computer Chess Forums|OpenChess Forum]], November 12, 2015 » [[Transposition Table]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=58406 Auriga - distributed and collaborative Perft] by [[Giuseppe Cannella]], [[CCC]], November 28, 2015 <ref>[http://cinnamonchess.altervista.org/auriga Auriga] by [[Giuseppe Cannella]]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=58726 Perft(14) Weekly Status Reports for 2016] by [[Steven Edwards]], [[CCC]], December 29, 2015<br />
'''2016'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59046 Best way to debug perft?] by Meni Rosenfeld, [[CCC]], January 25, 2016 » [[Debugging]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59567 Perft, leaf nodes?] by [[Luis Babboni]], [[CCC]], March 19, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60108 reverse perft] by [[Alexandru Mosoi]], [[CCC]], May 09, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60445 Perft for Xiangqi & Shogi] by [[Patrice Duhamel]], [[CCC]], June 12, 2016 » [[Chinese Chess|Xiangqi]], [[Shogi]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61119 yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], August 13, 2016<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=61119&start=30 Re: yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], September 09, 2016 <br />
'''2017 ...'''<br />
* [http://www.open-chess.org/viewtopic.php?f=5&t=3063 Perft results?] by notachessplayer, [[Computer Chess Forums|OpenChess Forum]], January 01, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983 perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017 » [[Perft#15|Perft(15)]]<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=4 Re: perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017 <br />
: [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 <br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70530 No standard specification for Perft] by [[Michael Sherwin]], [[CCC]], April 19, 2019<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71379 You gotta love Perft... just not too much!] by [[Martin Bryant]], [[CCC]], July 27, 2019<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71550 Shogi Perft numbers] by [[Toni Helminen]], [[CCC]], August 14, 2019 » [[Shogi]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=72669 LastEmperor - Chess960 perft tool] by [[Toni Helminen|JohnWoe]], [[CCC]], December 28, 2019 » [[Chess960 Perft Results]]<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=73493 Where to enter/read position into hash table in perft?] by [[Marcel Vanthoor]], [[CCC]], March 28, 2020 » [[Transposition Table]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=73558 Count the number of nodes of perft(14) and beyond] by [[Marc-Philippe Huget]], [[CCC]], April 04, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=73625 magic bitboard perft] by [[Richard Delorme]], [[CCC]], April 11, 2020 » [[Magic Bitboards]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=73577 Perft speed optimization] by [[Marcel Vanthoor]], [[CCC]], April 06, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=73812 Request for InDoubleCheck PERFTS EPDs] by [[Chris Whittington]], [[CCC]], May 02, 2020 » [[Double Check]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74153 Perft speed and depth questions] by [[Mark Buisseret]], [[CCC]], June 12, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75877 Place to find correct perft result from a fen position] by [[Elias Nilsson]], [[CCC]], November 20, 2020 » [[Perft Results]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76430 Chinese chess Xiangqi perft results] by [[Maksim Korzh]], [[CCC]], January 27, 2021 » [[Chinese Chess Perft Results]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77054 PERFT transposition table funny?!] by [[Martin Bryant]], [[CCC]], April 10, 2021 » [[Transposition Table]], [[Memory]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77069 Perft 7 -> 1.6 trillion moves] by [[Michael Byrne|MikeB]], [[CCC]], April 12, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77350 Being silly with perft and legal move generation] by [[Jakob Progsch]], [[CCC]], May 19, 2021 » [[Move Generation#Legal|Legal Move Generation]], [[En passant]]<br />
* [http://talkchess.com/forum3/viewtopic.php?f=7&t=78119 Perft position to debug check-evasions via en passant capture] by [[Roland Tomasi]], [[CCC]], September 06, 2021<br />
* [http://talkchess.com/forum3/viewtopic.php?f=7&t=78241 Perft test] by [[Pierluigi Meloni]], [[CCC]], September 24, 2021<br />
* [http://talkchess.com/forum3/viewtopic.php?f=7&t=78230 Gigantua: 1.5 Giganodes per Second per Core move generator] by [[Daniel Infuehr]], [[CCC]], September 22, 2021<br />
* [http://talkchess.com/forum3/viewtopic.php?f=7&t=78352 Gigantua: 2 Gigamoves per Second per Core move generator - Sourcecode Release] by [[Daniel Infuehr]], [[CCC]], October 07, 2021<br />
* [http://talkchess.com/forum3/viewtopic.php?f=7&t=80952 My Perft Results] by [[JoAnn Peeler]], [[CCC]], November 04, 2022<br />
* [https://talkchess.com/viewtopic.php?t=83392 Perft(16) estimate after averaging MC samples.] by Ajedrecista, [[CCC]], February 26, 2024<br />
<br />
=External Links= <br />
* [http://oeis.org/A048987 A048987] from [https://en.wikipedia.org/wiki/On-Line_Encyclopedia_of_Integer_Sequences On-Line Encyclopedia of Integer Sequences] (OEIS)<br />
* [https://wismuth.com/chess/statistics-games.html Statistics on chess games] by [[Mathematician#FLabelle|François Labelle]]<br />
* [https://en.wikipedia.org/wiki/Performance_testing Performance testing from Wikipedia]<br />
* [https://web.archive.org/web/20061014115710/http://www.albert.nu/programs/dperft/ Distributed Perft Project] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine])<br />
==Perft in other Games==<br />
* [http://tonyjh.com/chess/technical/ Perft for other forms of Chess] by [[Tony Hecker]]<br />
* [http://checker-board.blogspot.com/2009/02/perft-for-checkers.html Perft for Checkers] by [[Martin Fierz]]<br />
* [http://www.aartbik.com/strategy.php Perft for Checkers and Reversi/Othello] by [[Aart Bik]]<br />
==Implementations==<br />
* [https://home.hccnet.nl/h.g.muller/dwnldpage.html µ-Max Download Page - qperft] by [[Harm Geert Muller]] » [[Micro-Max]]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [http://cinnamonchess.altervista.org/auriga Auriga] by [[Giuseppe Cannella]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=58406 Auriga - distributed and collaborative Perft] by [[Giuseppe Cannella]], [[CCC]], November 28, 2015</ref><br />
* [https://github.com/Mk-Chan/BBPerft BBPerft: A fast, bitboard based chess perft result generator] by [[Manik Charan]] derived from [[WyldChess]]<br />
* [http://brausch.org/home/chess/index.html Chess Engine OliThink - New Move Generator OliPerft (Pre OliThink 5)] by [[Oliver Brausch]] » [[OliThink]]<br />
* [http://www.craftychess.com/documentation/craftydoc.html Crafty Command Documentation] by [[Robert Hyatt]], see perft <depth> » [[Crafty]]<br />
* [https://github.com/abulmo/hqperft hqperft: Chess move generation based on (H)yperbola (Q)uintessence & range attacks] by [[Richard Delorme]] » [[Hyperbola Quintessence]]<br />
* [https://github.com/jniemann66/juddperft GitHub - jniemann66/juddperft: Chess move generation engine] by [[Judd Niemann]]<br />
* [http://www.rocechess.ch/perft.html perft, divide, debugging a move generator] from [[ROCE]] by [[Roman Hartmann]]<br />
* [http://marcelk.net/rookie/nostalgia/v3/perft-random.epd perft-random.epd] by [[Marcel van Kervinck]] » [[Rookie]]<br />
==Video Tutorial== <br />
* A quick overview of the perft process by [[Jonathan Warkentin]], [https://en.wikipedia.org/wiki/YouTube YouTube] Videos<br />
: {{#evu:https://www.youtube.com/watch?v=A0HJbwRwILk|alignment=left|valignment=top}}<br />
* An example of debugging a perft error by [[Jonathan Warkentin]], [https://en.wikipedia.org/wiki/YouTube YouTube] Videos<br />
: {{#evu:https://www.youtube.com/watch?v=bAONObdxF54|alignment=left|valignment=top}}<br />
* Improving the perft speed & debugging tips by [[Jonathan Warkentin]], [https://en.wikipedia.org/wiki/YouTube YouTube] Videos<br />
: {{#evu:https://www.youtube.com/watch?v=BIzAfg5sdqg|alignment=left|valignment=top}}<br />
<br />
=References= <br />
<references /><br />
'''[[Move Generation|Up one level]]'''<br />
[[Category:Videos]]</div>
Smatovic, https://www.chessprogramming.org/index.php?title=Perft_Results&diff=26924, Perft Results, 2024-03-15T16:00:10Z<p>Smatovic: /* 2020 ... */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Board Representation]] * [[Move Generation]] * [[Perft]] * Results'''<br />
<br />
This page contains detailed [[Perft|perft]] results for several positions that are useful for [[Debugging|debugging]], beginning with the start position. [[Captures]], [[Checkmate|checkmates]], and other statistics have been included along with the node counts ([[Leaf Node|leaf nodes]], excluding internal or [[Interior Node|interior nodes]]) or move path enumerations. All move counters refer to the final move into a leaf position only.<br />
<br />
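The statistics in the tables below can be collected with a slightly extended perft. The following C sketch builds on the Perft routine and its helpers (GenerateLegalMoves, MakeMove, UndoMove) from the [[Perft]] page; the classification predicates (IsCapture, IsEnPassant, IsCastle, IsPromotion) and the InCheck/HasLegalMoves tests are placeholders for engine specific code:<br />
<pre><br />
typedef unsigned long long u64;<br />
<br />
typedef struct {<br />
    u64 nodes, captures, eps, castles, promotions;<br />
    u64 checks, checkmates;<br />
} PerftStats;<br />
<br />
void PerftWithStats(int depth, PerftStats *st)<br />
{<br />
    MOVE move_list[256];<br />
    int n_moves, i;<br />
<br />
    if (depth == 0) {<br />
        st->nodes++;<br />
        return;<br />
    }<br />
    n_moves = GenerateLegalMoves(move_list);<br />
    for (i = 0; i < n_moves; i++) {<br />
        MakeMove(move_list[i]);<br />
        if (depth == 1) {<br />
            /* classify the final move into the leaf position */<br />
            st->nodes++;<br />
            if (IsCapture(move_list[i]))   st->captures++;<br />
            if (IsEnPassant(move_list[i])) st->eps++;  /* also counted as a capture */<br />
            if (IsCastle(move_list[i]))    st->castles++;<br />
            if (IsPromotion(move_list[i])) st->promotions++;<br />
            if (InCheck()) {               /* side to move after the move */<br />
                st->checks++;<br />
                if (!HasLegalMoves())      /* in check with no reply: mate */<br />
                    st->checkmates++;<br />
            }<br />
        } else {<br />
            PerftWithStats(depth - 1, st);<br />
        }<br />
        UndoMove(move_list[i]);<br />
    }<br />
}<br />
</pre><br />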
=Initial Position= <br />
Obviously, Perft(1) of the [[Initial Position|initial position]] is 20 and Perft(2) is 400. Data for Perft(10) up to Perft(13) was provided by [[Steven Edwards]], generated by [[Symbolic]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=46055 Perft FEN data] by [[Steven Edwards]], [[CCC]], November 18, 2012</ref>, while Perft(14)<ref>[http://www.talkchess.com/forum/viewtopic.php?t=61119&start=30 Re: yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], September 09, 2016</ref> and Perft(15)<ref>[http://www.talkchess.com/forum/viewtopic.php?t=64983&start=4 Re: perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017</ref> were provided by [[Ankan Banerjee]].<br />
<fentt border="double" style="font-size:24pt">rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR</fentt> <br />
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1 <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth<br />
! Nodes <br />
! Captures <br />
! E.p. <br />
! Castles <br />
! Promotions <br />
! Checks<br />
! Discovery Checks<br />
! Double Checks <br />
! Checkmates <br />
|-<br />
| style="text-align:center;" | 0 <br />
| style="text-align:right;" | 1 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 20 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 400 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 8,902 <br />
| style="text-align:right;" | 34 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 12 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 197,281 <br />
| style="text-align:right;" | 1,576 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 469 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 8 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 4,865,609 <br />
| style="text-align:right;" | 82,719 <br />
| style="text-align:right;" | 258 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 27,351 <br />
| style="text-align:right;" | 6<br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 347 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 119,060,324 <br />
| style="text-align:right;" | 2,812,008 <br />
| style="text-align:right;" | 5,248 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 809,099<br />
| style="text-align:right;" | 329 <br />
| style="text-align:right;" | 46 <br />
| style="text-align:right;" | 10,828 <br />
|-<br />
| style="text-align:center;" | 7 <br />
| style="text-align:right;" | 3,195,901,860 <br />
| style="text-align:right;" | 108,329,926<br />
| style="text-align:right;" | 319,617<br />
| style="text-align:right;" | 883,453<br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 33,103,848<br />
| style="text-align:right;" | 18,026<br />
| style="text-align:right;" | 1,628<br />
| style="text-align:right;" | 435,767<br />
|-<br />
| style="text-align:center;" | 8 <br />
| style="text-align:right;" | 84,998,978,956 <br />
| style="text-align:right;" | 3,523,740,106 <br />
| style="text-align:right;" | 7,187,977 <br />
| style="text-align:right;" | 23,605,205 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 968,981,593 <br />
| style="text-align:right;" | 847,039 <br />
| style="text-align:right;" | 147,215 <br />
| style="text-align:right;" | 9,852,036 <br />
|-<br />
| style="text-align:center;" | 9 <br />
| style="text-align:right;" | 2,439,530,234,167<br />
| style="text-align:right;" | 125,208,536,153 <br />
| style="text-align:right;" | 319,496,827 <br />
| style="text-align:right;" | 1,784,356,000 <br />
| style="text-align:right;" | 17,334,376 <br />
| style="text-align:right;" | 36,095,901,903 <br />
| style="text-align:right;" | 37,101,713 <br />
| style="text-align:right;" | 5,547,231 <br />
| style="text-align:right;" | 400,191,963 <br />
|-<br />
| style="text-align:center;" | 10 <br />
| style="text-align:right;" | 69,352,859,712,417 <br />
|-<br />
| style="text-align:center;" | 11 <br />
| style="text-align:right;" | 2,097,651,003,696,806 <br />
|-<br />
| style="text-align:center;" | 12 <br />
| style="text-align:right;" | 62,854,969,236,701,747 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=38862 Perft(12) count confirmed] by [[Steven Edwards]], [[Computer Chess Forums|CCC]], April 25, 2011</ref> <br />
|-<br />
| style="text-align:center;" | 13 <br />
| style="text-align:right;" | 1,981,066,775,000,396,239 <br />
|-<br />
| style="text-align:center;" | 14 <br />
| style="text-align:right;" | 61,885,021,521,585,529,237 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61119&start=30 Re: yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], September 09, 2016</ref> <br />
|-<br />
| style="text-align:center;" | 15 <br />
| style="text-align:right;" | 2,015,099,950,053,364,471,960 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=64983&start=4 Re: perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017</ref> <br />
|}<br />
<br />
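For regression testing, the node counts above can be embedded directly as reference data. A minimal, self-contained C sketch (initial_perft and first_perft_mismatch are illustrative names, not from any engine); note that Perft(13) still fits in an unsigned 64-bit integer, while Perft(14) and Perft(15) already exceed the 64-bit range:<br />

```c
#include <assert.h>

typedef unsigned long long u64;

/* Reference node counts for the initial position, Perft(0) .. Perft(9),
   copied from the table above. */
static const u64 initial_perft[10] = {
    1ULL, 20ULL, 400ULL, 8902ULL, 197281ULL,
    4865609ULL, 119060324ULL, 3195901860ULL,
    84998978956ULL, 2439530234167ULL
};

/* Compare an engine's perft function against the reference values up to
   max_depth (capped at 9).  Returns the first failing depth, or -1. */
static int first_perft_mismatch(u64 (*perft)(int depth), int max_depth)
{
    int d;
    for (d = 0; d <= max_depth && d <= 9; d++)
        if (perft(d) != initial_perft[d])
            return d;
    return -1;
}
```

With the engine's own u64 Perft(int) from the [[Perft]] page, first_perft_mismatch(Perft, 6) should return -1 for a correct move generator.<br />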
==Perft 10== <br />
* [[Perft(10) 20 draft 9 Positions]]<br />
* [[Perft(10) 400 draft 8 Positions]]<br />
<br />
==Perft 11== <br />
* [[Perft(11) 20 draft 10 Positions]]<br />
* [[Perft(11) 400 draft 9 Positions]]<br />
<br />
==Perft 12== <br />
* [[Perft(12) 20 draft 11 Positions]]<br />
* [[Perft(12) 400 draft 10 Positions]]<br />
<br />
==Perft 13== <br />
* [[Perft(13) 20 draft 12 Positions]]<br />
* [[Perft(13) 400 draft 11 Positions]]<br />
<br />
==Summary== <br />
* [[Initial Position Summary]]<br />
* [[Perft#15|Perft(15)]]<br />
<span id="kiwipete"></span><br />
=Position 2= <br />
also known as '''Kiwipete''' by [[Peter McKenzie]] <ref>[https://www.stmintz.com/ccc/index.php?id=274926 kiwipete perft position] by [[Russell Reagan]], [[CCC]], January 04, 2003</ref>. The number of double checks at depth 5 is discussed on [[CCC|Talkchess]] <ref>[http://talkchess.com/forum3/viewtopic.php?f=7&t=78402 Perft statistics - chessprogrammingwiki] by [[Murat Yirci]], [[CCC]], October 10, 2021</ref> and may be 2645 instead of 2637.<br />
<fentt border="double" style="font-size:24pt">r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R</fentt> <br />
r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
! Captures <br />
! E.p. <br />
! Castles <br />
! Promotions <br />
! Checks <br />
! Discovery Checks<br />
! Double Checks<br />
! Checkmates <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 48 <br />
| style="text-align:right;" | 8 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 2 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 2039 <br />
| style="text-align:right;" | 351 <br />
| style="text-align:right;" | 1 <br />
| style="text-align:right;" | 91 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 3 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 0<br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 97862 <br />
| style="text-align:right;" | 17102 <br />
| style="text-align:right;" | 45 <br />
| style="text-align:right;" | 3162 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 993 <br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 1 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 4085603 <br />
| style="text-align:right;" | 757163 <br />
| style="text-align:right;" | 1929 <br />
| style="text-align:right;" | 128013 <br />
| style="text-align:right;" | 15172 <br />
| style="text-align:right;" | 25523<br />
| style="text-align:right;" | 42<br />
| style="text-align:right;" | 6 <br />
| style="text-align:right;" | 43 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 193690690 <br />
| style="text-align:right;" | 35043416 <br />
| style="text-align:right;" | 73365 <br />
| style="text-align:right;" | 4993637 <br />
| style="text-align:right;" | 8392 <br />
| style="text-align:right;" | 3309887 <br />
| style="text-align:right;" | 19883<br />
| style="text-align:right;" | 2637<br />
| style="text-align:right;" | 30171 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 8031647685<br />
| style="text-align:right;" | 1558445089<br />
| style="text-align:right;" | 3577504<br />
| style="text-align:right;" | 184513607<br />
| style="text-align:right;" | 56627920<br />
| style="text-align:right;" | 92238050<br />
| style="text-align:right;" | 568417<br />
| style="text-align:right;" | 54948<br />
| style="text-align:right;" | 360003<br />
|}<br />
<span id="Position3"></span><br />
<br />
=Position 3= <br />
<fentt border="double" style="font-size:24pt">8/2p5/3p4/KP5r/1R3p1k/8/4P1P1/8</fentt> <br />
8/2p5/3p4/KP5r/1R3p1k/8/4P1P1/8 w - - <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
! Captures <br />
! E.p. <br />
! Castles <br />
! Promotions <br />
! Checks<br />
! Discovery Checks<br />
! Double Checks <br />
! Checkmates <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 14 <br />
| style="text-align:right;" | 1 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 2 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 191 <br />
| style="text-align:right;" | 14 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 10 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 2812 <br />
| style="text-align:right;" | 209 <br />
| style="text-align:right;" | 2 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 267 <br />
| style="text-align:right;" | 3 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 43238 <br />
| style="text-align:right;" | 3348 <br />
| style="text-align:right;" | 123 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 1680 <br />
| style="text-align:right;" | 106 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 17 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 674624 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48811 Impossible perft question] by [[Andy Duplain]], [[CCC]], August 01, 2013</ref> <br />
| style="text-align:right;" | 52051 <br />
| style="text-align:right;" | 1165 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 52950<br />
| style="text-align:right;" | 1292 <br />
| style="text-align:right;" | 3 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 11030083 <br />
| style="text-align:right;" | 940350 <br />
| style="text-align:right;" | 33325 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 7552 <br />
| style="text-align:right;" | 452473 <br />
| style="text-align:right;" | 26067 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 2733 <br />
|-<br />
| style="text-align:center;" | 7 <br />
| style="text-align:right;" | 178633661 <br />
| style="text-align:right;" | 14519036 <br />
| style="text-align:right;" | 294874 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 140024 <br />
| style="text-align:right;" | 12797406 <br />
| style="text-align:right;" | 370630 <br />
| style="text-align:right;" | 3612 <br />
| style="text-align:right;" | 87 <br />
|-<br />
| style="text-align:center;" | 8 <br />
| style="text-align:right;" | 3009794393<br />
| style="text-align:right;" | 267586558<br />
| style="text-align:right;" | 8009239<br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 6578076<br />
| style="text-align:right;" | 135626805<br />
| style="text-align:right;" | 7181487 <br />
| style="text-align:right;" | 1630 <br />
| style="text-align:right;" | 450410<br />
|}<br />
<span id="Position4"></span><br />
=Position 4= <br />
<fentt border="double" style="font-size:24pt">r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1</fentt> <br />
r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1 w kq - 0 1<br />
Or mirrored (with the same perft results):<br />
r2q1rk1/pP1p2pp/Q4n2/bbp1p3/Np6/1B3NBn/pPPP1PPP/R3K2R b KQ - 0 1 <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
! Captures <br />
! E.p. <br />
! Castles <br />
! Promotions <br />
! Checks <br />
! Checkmates <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 6 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 264 <br />
| style="text-align:right;" | 87 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 6 <br />
| style="text-align:right;" | 48 <br />
| style="text-align:right;" | 10 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 9467 <br />
| style="text-align:right;" | 1021 <br />
| style="text-align:right;" | 4 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 120 <br />
| style="text-align:right;" | 38 <br />
| style="text-align:right;" | 22 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 422333 <br />
| style="text-align:right;" | 131393 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 7795 <br />
| style="text-align:right;" | 60032 <br />
| style="text-align:right;" | 15492 <br />
| style="text-align:right;" | 5 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 15833292 <br />
| style="text-align:right;" | 2046173 <br />
| style="text-align:right;" | 6512 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 329464 <br />
| style="text-align:right;" | 200568 <br />
| style="text-align:right;" | 50562 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 706045033 <br />
| style="text-align:right;" | 210369132 <br />
| style="text-align:right;" | 212 <br />
| style="text-align:right;" | 10882006 <br />
| style="text-align:right;" | 81102984 <br />
| style="text-align:right;" | 26973664 <br />
| style="text-align:right;" | 81076 <br />
|}<br />
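The mirrored FEN above can be produced mechanically: reverse the ranks of the board field, swap the case of every piece letter, toggle the side to move, swap the castling rights, and mirror the en passant rank. A self-contained C sketch (mirror_fen and the buffer sizes are illustrative; there is no hardened FEN validation):<br />

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Mirror a FEN vertically: reverse the ranks, swap piece colors, toggle
   the side to move, swap the castling rights and mirror the en passant
   rank.  out must hold at least 128 characters. */
static void mirror_fen(const char *fen, char *out)
{
    char board[96];
    char *ranks[8];
    const char *rest = strchr(fen, ' ');
    const char *q;
    size_t blen = rest ? (size_t)(rest - fen) : strlen(fen);
    int i, n = 0, field = 0;
    char *o = out, *p;

    memcpy(board, fen, blen);
    board[blen] = '\0';

    ranks[n++] = board;                       /* split the board into ranks */
    for (p = board; *p; p++)
        if (*p == '/') { *p = '\0'; ranks[n++] = p + 1; }

    for (i = n - 1; i >= 0; i--) {            /* emit ranks in reverse order */
        for (p = ranks[i]; *p; p++)           /* swap piece colors */
            *o++ = islower((unsigned char)*p) ? (char)toupper((unsigned char)*p)
                 : isupper((unsigned char)*p) ? (char)tolower((unsigned char)*p)
                 : *p;
        if (i > 0)
            *o++ = '/';
    }

    for (q = rest ? rest : ""; *q; q++) {     /* remaining FEN fields */
        char c = *q;
        if (c == ' ') { field++; *o++ = c; continue; }
        if (field == 1)                                   /* side to move */
            c = (c == 'w') ? 'b' : 'w';
        else if (field == 2 && isalpha((unsigned char)c)) /* castling */
            c = islower((unsigned char)c) ? (char)toupper((unsigned char)c)
                                          : (char)tolower((unsigned char)c);
        else if (field == 3 && isdigit((unsigned char)c)) /* e.p. rank */
            c = (char)('1' + '8' - c);
        *o++ = c;
    }
    *o = '\0';
}
```

Running perft on both FENs and comparing all counts is a cheap test for color-asymmetric move generation bugs.<br />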
<span id="Position5"></span><br />
=Position 5= <br />
This position was discussed on [[CCC|Talkchess]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42463 REPORT: wrong perft result by qperft] by [[Jesús Muñoz]], [[CCC]], February 14, 2012</ref> and, at depth 3, caught bugs in engines that were several years old <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=450535&t=42463 Re: REPORT: wrong perft result by qperft] by [[Julien Marcel]], [[CCC]], February 14, 2012</ref>. Its results were also reported incorrectly once <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42463&start=18 Re: Correction of my error] by [[Eugene Kotlov]], [[CCC]], July 17, 2015</ref>, and should now be corrected with the values given by [[Steven Edwards]], July 18, 2015 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42463&start=21 Re: Correction of my error] by [[Steven Edwards]], [[CCC]], July 18, 2015</ref><br />
<fentt border="double" style="font-size:24pt">rnbq1k1r/pp1Pbppp/2p5/8/2B5/8/PPP1NnPP/RNBQK2R</fentt> <br />
rnbq1k1r/pp1Pbppp/2p5/8/2B5/8/PPP1NnPP/RNBQK2R w KQ - 1 8 <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 44 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 1,486 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 62,379 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 2,103,487 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 89,941,194 <br />
|}<br />
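Depth 3 failures like these are typically located with ''divide'', a perft variant that prints the subtotal for each root move (see the [[ROCE]] link under external links). Comparing subtotals against a trusted engine, then descending into the first move that differs, quickly isolates the faulty move. A C sketch in the style of the Perft routine; MoveToString is an assumed helper returning coordinate notation:<br />
<pre><br />
void Divide(int depth)<br />
{<br />
    MOVE move_list[256];<br />
    int n_moves, i;<br />
    u64 total = 0, cnt;<br />
<br />
    n_moves = GenerateLegalMoves(move_list);<br />
    for (i = 0; i < n_moves; i++) {<br />
        MakeMove(move_list[i]);<br />
        cnt = Perft(depth - 1);   /* Perft(0) == 1, so depth 1 works too */<br />
        UndoMove(move_list[i]);<br />
        printf("%s %llu\n", MoveToString(move_list[i]), cnt);<br />
        total += cnt;<br />
    }<br />
    printf("total %llu\n", total);<br />
}<br />
</pre><br />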
<span id="Position6"></span><br />
=Position 6= <br />
An alternative perft test position given by [[Steven Edwards]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48616 An altenative perft() initial FEN] by [[Steven Edwards]], [[CCC]], July 11, 2013</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49118 Some perft() results for that alternative test position] by [[Steven Edwards]], [[CCC]], August 26, 2013</ref><br />
<fentt border="double" style="font-size:24pt">r4rk1/1pp1qppp/p1np1n2/2b1p1B1/2B1P1b1/P1NP1N2/1PP1QPPP/R4RK1</fentt> <br />
r4rk1/1pp1qppp/p1np1n2/2b1p1B1/2B1P1b1/P1NP1N2/1PP1QPPP/R4RK1 w - - 0 10 <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
|-<br />
| style="text-align:center;" | 0 <br />
| style="text-align:right;" | 1 <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 46 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 2,079 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 89,890 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 3,894,594 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 164,075,551 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 6,923,051,137 <br />
|-<br />
| style="text-align:center;" | 7 <br />
| style="text-align:right;" | 287,188,994,746 <br />
|-<br />
| style="text-align:center;" | 8 <br />
| style="text-align:right;" | 11,923,589,843,526 <br />
|-<br />
| style="text-align:center;" | 9 <br />
| style="text-align:right;" | 490,154,852,788,714 <br />
|}<br />
<br />
=See also=<br />
* [[Chess960 Perft Results]]<br />
* [[Chinese Chess Perft Results]]<br />
<br />
=Forum Posts= <br />
==2000 ...== <br />
* [https://www.stmintz.com/ccc/index.php?id=274926 kiwipete perft position] by [[Russell Reagan]], [[CCC]], January 04, 2003 » [[Peter McKenzie]], [[Perft Results#kiwipete|Kiwipete]]<br />
* [https://www.stmintz.com/ccc/index.php?id=388806 perft results (how accurate is accurate enough ?)] by [[Roman Hartmann]], [[CCC]], September 23, 2004<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42463 REPORT: wrong perft result by qperft] by [[Jesús Muñoz]], [[CCC]], February 14, 2012 » [[Perft Results#Position5|Position 5]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=45099 Perft and en_passant] by [[Harald Lüßen]], [[CCC]], September 11, 2012 » [[En passant]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47335 Perft(14) estimates thread] by [[Steven Edwards]], [[CCC]], February 26, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47740 Perft(15) estimates thread] by [[Steven Edwards]], [[CCC]], April 10, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48616 An altenative perft() initial FEN] by [[Steven Edwards]], [[CCC]], July 11, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48811 Impossible perft question] by [[Andy Duplain]], [[CCC]], August 01, 2013 » [[Perft Results#Position3|Position 3]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49000 Wide open perft()] by [[Steven Edwards]], [[CCC]], August 18, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=53224 Perft(14) revisited] by [[Steven Edwards]], [[CCC]], August 08, 2014<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=53406 Perft(14) Weekly Status Report] by [[Steven Edwards]], [[CCC]], August 24, 2014<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54767 Perft(14) verification] by [[Steven Edwards]], [[CCC]], December 28, 2014<br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54853 Perft(14) Weekly Status Reports for 2015] by [[Steven Edwards]], [[CCC]], January 04, 2015<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54995 Perft for various positions] by [[Alexandru Mosoi]], [[CCC]], January 17, 2015<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=55274 Some Chess960/FRC positions to be confirmed] by [[Reinhard Scharnagl]], [[CCC]], February 09, 2015 » [[Chess960]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=55787 kiwipete perft position] by [[Henk van den Belt]], [[CCC]], March 26, 2015 » [[Perft Results#kiwipete|Kiwipete]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=58726 Perft(14) Weekly Status Reports for 2016] by [[Steven Edwards]], [[CCC]], December 29, 2015<br />
'''2016'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59781 A perft(7) challenge position] by [[Steven Edwards]], [[CCC]], April 07, 2016 <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59818 Another perft(7) challenge position] by [[Steven Edwards]], [[CCC]], April 13, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59915 Perft(7) challenge position #3] by [[Steven Edwards]], [[CCC]], April 20, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59957 Perft(7) challenge position #4] by [[Steven Edwards]], [[CCC]], April 25, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59961 Perft(7) challenge position #5] by [[Steven Edwards]], [[CCC]], April 25, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60102 Another perft(7) challenge] by [[Steven Edwards]], [[CCC]], May 08, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60114 Perft(7) challenge position #6] by [[Steven Edwards]], [[CCC]], May 10, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60242 Perft(7) 64 bit hash mismatch set 8] by [[Steven Edwards]], [[CCC]], May 22, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60942 Twenty-nine perft(7) mismatches from work unit 528] by [[Steven Edwards]], [[CCC]], July 25, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61119 yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], August 13, 2016<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=61119&start=30 Re: yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], September 09, 2016 <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61329 Two perft(7) mismatches from work unit 571] by [[Steven Edwards]], [[CCC]], September 04, 2016<br />
'''2017 ...'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983 perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017 » [[Perft#15|Perft(15)]]<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=4 Re: perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017 <br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70543 Contrived position for perft] by [[Michael Sherwin]], [[CCC]], April 21, 2019<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71379 You gotta love Perft... just not too much!] by [[Martin Bryant]], [[CCC]], July 27, 2019<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71841&p=812577 Level 11 Perft statistics] by [[Andreas Øverland]], [[CCC]], September 17, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75877 Place to find correct perft result from a fen position] by [[Elias Nilsson]], [[CCC]], November 20, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76430 Chinese chess Xiangqi perft results] by [[Maksim Korzh]], [[CCC]], January 27, 2021 » [[Chinese Chess Perft Results]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77069 Perft 7 -> 1.6 trillion moves] by [[Michael Byrne|MikeB]], [[CCC]], April 12, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77350 Being silly with perft and legal move generation] by [[Jakob Progsch]], [[CCC]], May 19, 2021 » [[Move Generation#Legal|Legal Move Generation]], [[En passant]]<br />
* [https://talkchess.com/viewtopic.php?t=83392 Perft(16) estimate after averaging MC samples.] by Ajedrecista, [[CCC]], February 26, 2024<br />
<br />
=External Links= <br />
* [https://oeis.org/A048987 A048987] from [https://en.wikipedia.org/wiki/On-Line_Encyclopedia_of_Integer_Sequences On-Line Encyclopedia of Integer Sequences (OEIS)]<br />
* [https://home.hccnet.nl/h.g.muller/dwnldpage.html µ-Max Download Page - qperft] by [[Harm Geert Muller]]<br />
* [https://marcelk.net/rookie/nostalgia/v3/perft-random.epd perft-random.epd] by [[Marcel van Kervinck]]<br />
* [https://craftychess.com/documentation/craftydoc.html Crafty Command Documentation] by [[Robert Hyatt]], see [[Crafty]] perft <depth><br />
* [https://web.archive.org/web/20060430011809/http://www.albert.nu/programs/sharper/perft.htm Sharper - Perft calculation] by [[Albert Bertilsson]] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine])<br />
* [https://web.archive.org/web/20130517080941/http://www.albert.nu/programs/dperft/default.asp Distributed Perft Project] by [[Albert Bertilsson]] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine])<br />
* [http://www.rocechess.ch/perft.html perft, divide, debugging a move generator] from [[ROCE]] by [[Roman Hartmann]]<br />
* [https://sites.google.com/site/numptychess/perft Perft - sample test positions] used by [[Numpty chess]]<br />
* [https://www.stmintz.com/ccc/index.php?terms=perft&search=1 Perft], search the [[Computer Chess Forums|CCC Archives]]<br />
* [https://wismuth.com/chess/statistics-games.html Statistics on chess games] by [[Mathematician#FLabelle|François Labelle]]<br />
* [https://github.com/elcabesa/vajolet/blob/master/tests/perft.txt vajolet/perft.txt at master · elcabesa/vajolet · GitHub] by [[Marco Belli]]<br />
<br />
=References= <br />
<references /><br />
'''[[Perft|Up one level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=Perft_Results&diff=26923Perft Results2024-03-15T15:59:41Z<p>Smatovic: /* 2020 ... */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Board Representation]] * [[Move Generation]] * [[Perft]] * Results'''<br />
<br />
This page contains detailed [[Perft|perft]] results for several positions that are useful for [[Debugging|debugging]], beginning with the start position. [[Captures]], [[Checkmate|checkmates]], and other information have been included along with the node counts ([[Leaf Node|leaf nodes]], excluding internal or [[Interior Node|interior nodes]]) or movepath enumerations. The move counters consider moves to the leaf positions only.<br />
<br />
=Initial Position= <br />
Obviously, Perft(1) of the [[Initial Position|initial position]] is 20, Perft(2) 400. Data of Perft(10) up to Perft(13) was provided by [[Steven Edwards]], generated by [[Symbolic]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=46055 Perft FEN data] by [[Steven Edwards]], [[CCC]], November 18, 2012</ref>, when Perft(14)<ref>[http://www.talkchess.com/forum/viewtopic.php?t=61119&start=30 Re: yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], September 09, 2016</ref> and Perft(15)<ref>[http://www.talkchess.com/forum/viewtopic.php?t=64983&start=4 Re: perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017</ref> were provided by [[Ankan Banerjee]].<br />
<fentt border="double" style="font-size:24pt">rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR</fentt> <br />
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1 <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth<br />
! Nodes <br />
! Captures <br />
! E.p. <br />
! Castles <br />
! Promotions <br />
! Checks<br />
! Discovery Checks<br />
! Double Checks <br />
! Checkmates <br />
|-<br />
| style="text-align:center;" | 0 <br />
| style="text-align:right;" | 1 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 20 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 400 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 8,902 <br />
| style="text-align:right;" | 34 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 12 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 197,281 <br />
| style="text-align:right;" | 1,576 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 469 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 8 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 4,865,609 <br />
| style="text-align:right;" | 82,719 <br />
| style="text-align:right;" | 258 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 27,351 <br />
| style="text-align:right;" | 6<br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 347 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 119,060,324 <br />
| style="text-align:right;" | 2,812,008 <br />
| style="text-align:right;" | 5,248 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 809,099<br />
| style="text-align:right;" | 329 <br />
| style="text-align:right;" | 46 <br />
| style="text-align:right;" | 10,828 <br />
|-<br />
| style="text-align:center;" | 7 <br />
| style="text-align:right;" | 3,195,901,860 <br />
| style="text-align:right;" | 108,329,926<br />
| style="text-align:right;" | 319,617<br />
| style="text-align:right;" | 883,453<br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 33,103,848<br />
| style="text-align:right;" | 18,026<br />
| style="text-align:right;" | 1,628<br />
| style="text-align:right;" | 435,767<br />
|-<br />
| style="text-align:center;" | 8 <br />
| style="text-align:right;" | 84,998,978,956 <br />
| style="text-align:right;" | 3,523,740,106 <br />
| style="text-align:right;" | 7,187,977 <br />
| style="text-align:right;" | 23,605,205 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 968,981,593 <br />
| style="text-align:right;" | 847,039 <br />
| style="text-align:right;" | 147,215 <br />
| style="text-align:right;" | 9,852,036 <br />
|-<br />
| style="text-align:center;" | 9 <br />
| style="text-align:right;" | 2,439,530,234,167<br />
| style="text-align:right;" | 125,208,536,153 <br />
| style="text-align:right;" | 319,496,827 <br />
| style="text-align:right;" | 1,784,356,000 <br />
| style="text-align:right;" | 17,334,376 <br />
| style="text-align:right;" | 36,095,901,903 <br />
| style="text-align:right;" | 37,101,713 <br />
| style="text-align:right;" | 5,547,231 <br />
| style="text-align:right;" | 400,191,963 <br />
|-<br />
| style="text-align:center;" | 10 <br />
| style="text-align:right;" | 69,352,859,712,417 <br />
|-<br />
| style="text-align:center;" | 11 <br />
| style="text-align:right;" | 2,097,651,003,696,806 <br />
|-<br />
| style="text-align:center;" | 12 <br />
| style="text-align:right;" | 62,854,969,236,701,747 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=38862 Perft(12) count confirmed] by [[Steven Edwards]], [[Computer Chess Forums|CCC]], April 25, 2011</ref> <br />
|-<br />
| style="text-align:center;" | 13 <br />
| style="text-align:right;" | 1,981,066,775,000,396,239 <br />
|-<br />
| style="text-align:center;" | 14 <br />
| style="text-align:right;" | 61,885,021,521,585,529,237 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61119&start=30 Re: yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], September 09, 2016</ref> <br />
|-<br />
| style="text-align:center;" | 15 <br />
| style="text-align:right;" | 2,015,099,950,053,364,471,960 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=64983&start=4 Re: perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017</ref> <br />
|}<br />
<br />
==Perft 10== <br />
* [[Perft(10) 20 draft 9 Positions]]<br />
* [[Perft(10) 400 draft 8 Positions]]<br />
<br />
==Perft 11== <br />
* [[Perft(11) 20 draft 10 Positions]]<br />
* [[Perft(11) 400 draft 9 Positions]]<br />
<br />
==Perft 12== <br />
* [[Perft(12) 20 draft 11 Positions]]<br />
* [[Perft(12) 400 draft 10 Positions]]<br />
<br />
==Perft 13== <br />
* [[Perft(13) 20 draft 12 Positions]]<br />
* [[Perft(13) 400 draft 11 Positions]]<br />
<br />
==Summary== <br />
* [[Initial Position Summary]]<br />
* [[Perft#15|Perft(15)]]<br />
<span id="kiwipete"></span><br />
=Position 2= <br />
also known as '''Kiwipete''' by [[Peter McKenzie]] <ref>[https://www.stmintz.com/ccc/index.php?id=274926 kiwipete perft position] by [[Russell Reagan]], [[CCC]], January 04, 2003</ref>. The number of double checks at depth 5 is discussed on [[CCC|Talkchess]] <ref>[http://talkchess.com/forum3/viewtopic.php?f=7&t=78402 Perft statistics - chessprogrammingwiki] by [[Murat Yirci]], [[CCC]], October 10, 2021</ref> and may be 2645 instead of 2637.<br />
<fentt border="double" style="font-size:24pt">r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R</fentt> <br />
r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq - <br />
<br />
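Columns such as captures, castles and checks come from a statistics-gathering variant of perft that classifies moves at the last ply. A hedged sketch, reusing the MOVE/GenerateLegalMoves/MakeMove/UndoMove interface of the Perft example above; IsCapture(), IsCastle() and InCheck() are hypothetical helpers standing in for an engine's own move flags:<br />
<br />
```c
/* Hedged sketch of a statistics perft: besides counting leaf nodes,
   tally move properties at the last ply to reproduce the table
   columns.  IsCapture(), IsCastle() and InCheck() are hypothetical
   helpers; a real engine would query its own move encoding. */
u64 captures, castles, checks;   /* reset to 0 before each run */

u64 PerftStats(int depth)
{
    MOVE move_list[256];
    u64 nodes = 0;
    int n_moves = GenerateLegalMoves(move_list);

    for (int i = 0; i < n_moves; i++) {
        MakeMove(move_list[i]);
        if (depth == 1) {            /* count only at the leaves */
            nodes++;
            if (IsCapture(move_list[i])) captures++;
            if (IsCastle(move_list[i]))  castles++;
            if (InCheck())               checks++;
        } else {
            nodes += PerftStats(depth - 1);
        }
        UndoMove(move_list[i]);
    }
    return nodes;
}
```
<br />
Note that classifying at the last ply matches the convention stated above: only the paths of exactly the requested depth are counted and categorized.<br />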
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
! Captures <br />
! E.p. <br />
! Castles <br />
! Promotions <br />
! Checks <br />
! Discovery Checks<br />
! Double Checks<br />
! Checkmates <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 48 <br />
| style="text-align:right;" | 8 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 2 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 2039 <br />
| style="text-align:right;" | 351 <br />
| style="text-align:right;" | 1 <br />
| style="text-align:right;" | 91 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 3 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 0<br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 97862 <br />
| style="text-align:right;" | 17102 <br />
| style="text-align:right;" | 45 <br />
| style="text-align:right;" | 3162 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 993 <br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 0<br />
| style="text-align:right;" | 1 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 4085603 <br />
| style="text-align:right;" | 757163 <br />
| style="text-align:right;" | 1929 <br />
| style="text-align:right;" | 128013 <br />
| style="text-align:right;" | 15172 <br />
| style="text-align:right;" | 25523<br />
| style="text-align:right;" | 42<br />
| style="text-align:right;" | 6 <br />
| style="text-align:right;" | 43 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 193690690 <br />
| style="text-align:right;" | 35043416 <br />
| style="text-align:right;" | 73365 <br />
| style="text-align:right;" | 4993637 <br />
| style="text-align:right;" | 8392 <br />
| style="text-align:right;" | 3309887 <br />
| style="text-align:right;" | 19883<br />
| style="text-align:right;" | 2637<br />
| style="text-align:right;" | 30171 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 8031647685<br />
| style="text-align:right;" | 1558445089<br />
| style="text-align:right;" | 3577504<br />
| style="text-align:right;" | 184513607<br />
| style="text-align:right;" | 56627920<br />
| style="text-align:right;" | 92238050<br />
| style="text-align:right;" | 568417<br />
| style="text-align:right;" | 54948<br />
| style="text-align:right;" | 360003<br />
|}<br />
<span id="Position3"></span><br />
<br />
=Position 3= <br />
<fentt border="double" style="font-size:24pt">8/2p5/3p4/KP5r/1R3p1k/8/4P1P1/8</fentt> <br />
8/2p5/3p4/KP5r/1R3p1k/8/4P1P1/8 w - - <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
! Captures <br />
! E.p. <br />
! Castles <br />
! Promotions <br />
! Checks<br />
! Discovery Checks<br />
! Double Checks <br />
! Checkmates <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 14 <br />
| style="text-align:right;" | 1 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 2 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 191 <br />
| style="text-align:right;" | 14 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 10 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 2812 <br />
| style="text-align:right;" | 209 <br />
| style="text-align:right;" | 2 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 267 <br />
| style="text-align:right;" | 3 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 43238 <br />
| style="text-align:right;" | 3348 <br />
| style="text-align:right;" | 123 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 1680 <br />
| style="text-align:right;" | 106 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 17 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 674624 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48811 Impossible perft question] by [[Andy Duplain]], [[CCC]], August 01, 2013</ref> <br />
| style="text-align:right;" | 52051 <br />
| style="text-align:right;" | 1165 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 52950<br />
| style="text-align:right;" | 1292 <br />
| style="text-align:right;" | 3 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 11030083 <br />
| style="text-align:right;" | 940350 <br />
| style="text-align:right;" | 33325 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 7552 <br />
| style="text-align:right;" | 452473 <br />
| style="text-align:right;" | 26067 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 2733 <br />
|-<br />
| style="text-align:center;" | 7 <br />
| style="text-align:right;" | 178633661 <br />
| style="text-align:right;" | 14519036 <br />
| style="text-align:right;" | 294874 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 140024 <br />
| style="text-align:right;" | 12797406 <br />
| style="text-align:right;" | 370630 <br />
| style="text-align:right;" | 3612 <br />
| style="text-align:right;" | 87 <br />
|-<br />
| style="text-align:center;" | 8 <br />
| style="text-align:right;" | 3009794393<br />
| style="text-align:right;" | 267586558<br />
| style="text-align:right;" | 8009239<br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 6578076<br />
| style="text-align:right;" | 135626805<br />
| style="text-align:right;" | 7181487 <br />
| style="text-align:right;" | 1630 <br />
| style="text-align:right;" | 450410<br />
|}<br />
<span id="Position4"></span><br />
=Position 4= <br />
<fentt border="double" style="font-size:24pt">r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1</fentt> <br />
r3k2r/Pppp1ppp/1b3nbN/nP6/BBP1P3/q4N2/Pp1P2PP/R2Q1RK1 w kq - 0 1<br />
Or mirrored (with the same perft results):<br />
r2q1rk1/pP1p2pp/Q4n2/bbp1p3/Np6/1B3NBn/pPPP1PPP/R3K2R b KQ - 0 1 <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
! Captures <br />
! E.p. <br />
! Castles <br />
! Promotions <br />
! Checks <br />
! Checkmates <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 6 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 264 <br />
| style="text-align:right;" | 87 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 6 <br />
| style="text-align:right;" | 48 <br />
| style="text-align:right;" | 10 <br />
| style="text-align:right;" | 0 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 9467 <br />
| style="text-align:right;" | 1021 <br />
| style="text-align:right;" | 4 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 120 <br />
| style="text-align:right;" | 38 <br />
| style="text-align:right;" | 22 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 422333 <br />
| style="text-align:right;" | 131393 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 7795 <br />
| style="text-align:right;" | 60032 <br />
| style="text-align:right;" | 15492 <br />
| style="text-align:right;" | 5 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 15833292 <br />
| style="text-align:right;" | 2046173 <br />
| style="text-align:right;" | 6512 <br />
| style="text-align:right;" | 0 <br />
| style="text-align:right;" | 329464 <br />
| style="text-align:right;" | 200568 <br />
| style="text-align:right;" | 50562 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 706045033 <br />
| style="text-align:right;" | 210369132 <br />
| style="text-align:right;" | 212 <br />
| style="text-align:right;" | 10882006 <br />
| style="text-align:right;" | 81102984 <br />
| style="text-align:right;" | 26973664 <br />
| style="text-align:right;" | 81076 <br />
|}<br />
<span id="Position5"></span><br />
=Position 5= <br />
This position was discussed on [[CCC|Talkchess]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42463 REPORT: wrong perft result by qperft] by [[Jesús Muñoz]], [[CCC]], February 14, 2012</ref> and, at depth 3, caught bugs in engines that were several years old <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=450535&t=42463 Re: REPORT: wrong perft result by qperft] by [[Julien Marcel]], [[CCC]], February 14, 2012</ref>. Its results were also reported incorrectly <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42463&start=18 Re: Correction of my error] by [[Eugene Kotlov]], [[CCC]], July 17, 2015</ref>, but have since been corrected with the values given by [[Steven Edwards]], July 18, 2015 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42463&start=21 Re: Correction of my error] by [[Steven Edwards]], [[CCC]], July 18, 2015</ref><br />
<fentt border="double" style="font-size:24pt">rnbq1k1r/pp1Pbppp/2p5/8/2B5/8/PPP1NnPP/RNBQK2R</fentt> <br />
rnbq1k1r/pp1Pbppp/2p5/8/2B5/8/PPP1NnPP/RNBQK2R w KQ - 1 8 <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 44 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 1,486 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 62,379 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 2,103,487 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 89,941,194 <br />
|}<br />
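When an engine's total disagrees with a table like the one above, the usual next step is "divide": print perft(depth-1) for each root move so the mismatch can be traced to a single subtree. A sketch against the same assumed interface as the page's Perft example; PrintMove() is a hypothetical helper that prints a move in coordinate notation:<br />
<br />
```c
/* Hedged sketch of "divide": for each legal root move, print the
   perft count of its subtree, then the grand total.  Uses the same
   assumed MOVE/GenerateLegalMoves/MakeMove/UndoMove interface as
   the Perft() example above; PrintMove() is hypothetical. */
void Divide(int depth)
{
    MOVE move_list[256];
    int n_moves = GenerateLegalMoves(move_list);
    u64 total = 0;

    for (int i = 0; i < n_moves; i++) {
        MakeMove(move_list[i]);
        u64 nodes = Perft(depth - 1);
        UndoMove(move_list[i]);
        PrintMove(move_list[i]);          /* e.g. "e2e4" */
        printf(" %llu\n", nodes);
        total += nodes;
    }
    printf("total %llu\n", total);
}
```
<br />
Comparing the per-move counts against a trusted engine's divide output (e.g. qperft or [[ROCE]]) narrows a bug to one root move, after which the procedure is repeated one ply deeper in that subtree.<br />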
<span id="Position6"></span><br />
=Position 6= <br />
An alternative perft start position given by [[Steven Edwards]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48616 An altenative perft() initial FEN] by [[Steven Edwards]], [[CCC]], July 11, 2013</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49118 Some perft() results for that alternative test position] by [[Steven Edwards]], [[CCC]], August 26, 2013</ref><br />
<fentt border="double" style="font-size:24pt">r4rk1/1pp1qppp/p1np1n2/2b1p1B1/2B1P1b1/P1NP1N2/1PP1QPPP/R4RK1</fentt> <br />
r4rk1/1pp1qppp/p1np1n2/2b1p1B1/2B1P1b1/P1NP1N2/1PP1QPPP/R4RK1 w - - 0 10 <br />
<br />
{| class="wikitable"<br />
|-<br />
! Depth <br />
! Nodes <br />
|-<br />
| style="text-align:center;" | 0 <br />
| style="text-align:right;" | 1 <br />
|-<br />
| style="text-align:center;" | 1 <br />
| style="text-align:right;" | 46 <br />
|-<br />
| style="text-align:center;" | 2 <br />
| style="text-align:right;" | 2,079 <br />
|-<br />
| style="text-align:center;" | 3 <br />
| style="text-align:right;" | 89,890 <br />
|-<br />
| style="text-align:center;" | 4 <br />
| style="text-align:right;" | 3,894,594 <br />
|-<br />
| style="text-align:center;" | 5 <br />
| style="text-align:right;" | 164,075,551 <br />
|-<br />
| style="text-align:center;" | 6 <br />
| style="text-align:right;" | 6,923,051,137 <br />
|-<br />
| style="text-align:center;" | 7 <br />
| style="text-align:right;" | 287,188,994,746 <br />
|-<br />
| style="text-align:center;" | 8 <br />
| style="text-align:right;" | 11,923,589,843,526 <br />
|-<br />
| style="text-align:center;" | 9 <br />
| style="text-align:right;" | 490,154,852,788,714 <br />
|}<br />
<br />
=See also=<br />
* [[Chess960 Perft Results]]<br />
* [[Chinese Chess Perft Results]]<br />
<br />
=Forum Posts= <br />
==2000 ...== <br />
* [https://www.stmintz.com/ccc/index.php?id=274926 kiwipete perft position] by [[Russell Reagan]], [[CCC]], January 04, 2003 » [[Peter McKenzie]], [[Perft Results#kiwipete|Kiwipete]]<br />
* [https://www.stmintz.com/ccc/index.php?id=388806 perft results (how accurate is accurate enough ?)] by [[Roman Hartmann]], [[CCC]], September 23, 2004<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42463 REPORT: wrong perft result by qperft] by [[Jesús Muñoz]], [[CCC]], February 14, 2012 » [[Perft Results#Position5|Position 5]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=45099 Perft and en_passant] by [[Harald Lüßen]], [[CCC]], September 11, 2012 » [[En passant]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47335 Perft(14) estimates thread] by [[Steven Edwards]], [[CCC]], February 26, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47740 Perft(15) estimates thread] by [[Steven Edwards]], [[CCC]], April 10, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48616 An altenative perft() initial FEN] by [[Steven Edwards]], [[CCC]], July 11, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48811 Impossible perft question] by [[Andy Duplain]], [[CCC]], August 01, 2013 » [[Perft Results#Position3|Position 3]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49000 Wide open perft()] by [[Steven Edwards]], [[CCC]], August 18, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=53224 Perft(14) revisited] by [[Steven Edwards]], [[CCC]], August 08, 2014<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=53406 Perft(14) Weekly Status Report] by [[Steven Edwards]], [[CCC]], August 24, 2014<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54767 Perft(14) verification] by [[Steven Edwards]], [[CCC]], December 28, 2014<br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54853 Perft(14) Weekly Status Reports for 2015] by [[Steven Edwards]], [[CCC]], January 04, 2015<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=54995 Perft for various positions] by [[Alexandru Mosoi]], [[CCC]], January 17, 2015<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=55274 Some Chess960/FRC positions to be confirmed] by [[Reinhard Scharnagl]], [[CCC]], February 09, 2015 » [[Chess960]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=55787 kiwipete perft position] by [[Henk van den Belt]], [[CCC]], March 26, 2015 » [[Perft Results#kiwipete|Kiwipete]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=58726 Perft(14) Weekly Status Reports for 2016] by [[Steven Edwards]], [[CCC]], December 29, 2015<br />
'''2016'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59781 A perft(7) challenge position] by [[Steven Edwards]], [[CCC]], April 07, 2016 <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59818 Another perft(7) challenge position] by [[Steven Edwards]], [[CCC]], April 13, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59915 Perft(7) challenge position #3] by [[Steven Edwards]], [[CCC]], April 20, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59957 Perft(7) challenge position #4] by [[Steven Edwards]], [[CCC]], April 25, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59961 Perft(7) challenge position #5] by [[Steven Edwards]], [[CCC]], April 25, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60102 Another perft(7) challenge] by [[Steven Edwards]], [[CCC]], May 08, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60114 Perft(7) challenge position #6] by [[Steven Edwards]], [[CCC]], May 10, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60242 Perft(7) 64 bit hash mismatch set 8] by [[Steven Edwards]], [[CCC]], May 22, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60942 Twenty-nine perft(7) mismatches from work unit 528] by [[Steven Edwards]], [[CCC]], July 25, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61119 yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], August 13, 2016<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=61119&start=30 Re: yet another attempt on Perft(14)] by [[Ankan Banerjee]], [[CCC]], September 09, 2016 <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61329 Two perft(7) mismatches from work unit 571] by [[Steven Edwards]], [[CCC]], September 04, 2016<br />
'''2017 ...'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983 perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017 » [[Perft#15|Perft(15)]]<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=4 Re: perft(15)] by [[Ankan Banerjee]], [[CCC]], August 25, 2017 <br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70543 Contrived position for perft] by [[Michael Sherwin]], [[CCC]], April 21, 2019<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71379 You gotta love Perft... just not too much!] by [[Martin Bryant]], [[CCC]], July 27, 2019<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71841&p=812577 Level 11 Perft statistics] by [[Andreas Øverland]], [[CCC]], September 17, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75877 Place to find correct perft result from a fen position] by [[Elias Nilsson]], [[CCC]], November 20, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76430 Chinese chess Xiangqi perft results] by [[Maksim Korzh]], [[CCC]], January 27, 2021 » [[Chinese Chess Perft Results]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77069 Perft 7 -> 1.6 trillion moves] by [[Michael Byrne|MikeB]], [[CCC]], April 12, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77350 Being silly with perft and legal move generation] by [[Jakob Progsch]], [[CCC]], May 19, 2021 » [[Move Generation#Legal|Legal Move Generation]], [[En passant]]<br />
* [https://talkchess.com/viewtopic.php?t=83392 Perft(16) estimate after averaging MC samples.] by Ajedrecista, [[CCC]], February 26, 2024<br />
<br />
=External Links= <br />
* [https://oeis.org/A048987 A048987] from [https://en.wikipedia.org/wiki/On-Line_Encyclopedia_of_Integer_Sequences On-Line Encyclopedia of Integer Sequences (OEIS)]<br />
* [https://home.hccnet.nl/h.g.muller/dwnldpage.html µ-Max Download Page - qperft] by [[Harm Geert Muller]]<br />
* [https://marcelk.net/rookie/nostalgia/v3/perft-random.epd perft-random.epd] by [[Marcel van Kervinck]]<br />
* [https://craftychess.com/documentation/craftydoc.html Crafty Command Documentation] by [[Robert Hyatt]], see [[Crafty]] perft <depth><br />
* [https://web.archive.org/web/20060430011809/http://www.albert.nu/programs/sharper/perft.htm Sharper - Perft calculation] by [[Albert Bertilsson]] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine])<br />
* [https://web.archive.org/web/20130517080941/http://www.albert.nu/programs/dperft/default.asp Distributed Perft Project] by [[Albert Bertilsson]] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine])<br />
* [http://www.rocechess.ch/perft.html perft, divide, debugging a move generator] from [[ROCE]] by [[Roman Hartmann]]<br />
* [https://sites.google.com/site/numptychess/perft Perft - sample test positions] used by [[Numpty chess]]<br />
* [https://www.stmintz.com/ccc/index.php?terms=perft&search=1 Perft], search the [[Computer Chess Forums|CCC Archives]]<br />
* [https://wismuth.com/chess/statistics-games.html Statistics on chess games] by [[Mathematician#FLabelle|François Labelle]]<br />
* [https://github.com/elcabesa/vajolet/blob/master/tests/perft.txt vajolet/perft.txt at master · elcabesa/vajolet · GitHub] by [[Marco Belli]]<br />
<br />
=References= <br />
<references /><br />
'''[[Perft|Up one level]]'''</div>
Smatovic
https://www.chessprogramming.org/index.php?title=Srdja_Matovic&diff=26922
Srdja Matovic, 2024-03-14T04:03:09Z
<p>Smatovic: /* 2020 ... */</p>
<hr />
<div>'''[[Main Page|Home]] * [[People]] * Srdja Matovic'''<br />
<br />
[[FILE:SrdjaMatovic.jpg|border|right|thumb|220px|link=http://www.inet.haw-hamburg.de/images/srdjamatovic.jpg/view| Srdja Matovic <ref>[http://www.inet.haw-hamburg.de/images/srdjamatovic.jpg/view Portrait Photo Srdja Matovic — Internet Technologies Research Group - INET]</ref> ]] <br />
<br />
'''Srdja Matovic''',<br/><br />
a German-born Montenegrin computer scientist, software developer, and former member of the Internet Technologies Group at the [https://en.wikipedia.org/wiki/Hamburg_University_of_Applied_Sciences Hamburg University of Applied Sciences] <ref>[http://inet.cpt.haw-hamburg.de/members/alumni-1/srdja-matovic Srdja Matovic — Internet Technologies Research Group - INET]</ref>. <br />
As a computer chess programmer, Srdja is the author of the '''Zeta''' family of chess engines: [[Zeta Dva]], a conventional engine written in plain [[C]]; [[Zeta]], written in [[OpenCL]], a language suited for [[GPU|GPUs]]; and the [[6502]] [https://en.wikipedia.org/wiki/Retro_style retro program] [[Zeta Vintage]] <ref>[https://en.wikipedia.org/wiki/Vintage_%28disambiguation%29 Vintage (disambiguation) from Wikipedia]</ref>. <br />
<br />
=Forum Posts=<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=33315 Zeta, a chess engine in OpenCL] by [[Srdja Matovic]], [[CCC]], March 17, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39332 max amount of moves from a position?] by [[Srdja Matovic]], [[CCC]], June 10, 2011 » [[Chess#Maxima|Chess Maxima]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=40493 LIFO stack based parallel processing?] by [[Srdja Matovic]], [[CCC]], September 22, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=40674 Vintage Chess Programming] by [[Srdja Matovic]], [[CCC]], October 08, 2011 » [[6502]] <br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44165 Help with Best-First Select-Formula] by [[Srdja Matovic]], [[CCC]], June 23, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46424 Nikolachess ... pro gpu solution?] by [[Srdja Matovic]], [[CCC]], December 15, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46568 Qsearch Delta Pruning Rate?] by [[Srdja Matovic]], [[CCC]], December 24, 2012 » [[Delta Pruning]]<br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [https://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
<br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=67102 Draw scores in TT] by [[Srdja Matovic]], [[CCC]], April 14, 2018 » [[Draw]], [[Score#DrawScore|Draw Score]], [[Transposition Table]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67121 LC0 - how to catch up?] by [[Srdja Matovic]], [[CCC]], April 16, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[GPU]], [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69842 WIP, Eta - GPGPU ANN based engine, RFC] by [[Srdja Matovic]], [[CCC]], February 06, 2019<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70504 Google's bfloat for neural networks] by [[Srdja Matovic]], [[CCC]], April 16, 2019 » [[Float]], [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=72684 RMO - Randomized Move Order - yet another Lazy SMP derivate] by [[Srdja Matovic]], [[CCC]], December 30, 2019 » [[Lazy SMP]]<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74607 LC0 vs. NNUE - some tech details...] by [[Srdja Matovic]], [[CCC]], July 29, 2020 » [[Leela Chess Zero#Lc0|Lc0]], [[NNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74751 History of Memory Wall in Computer Chess?] by [[Srdja Matovic]], [[CCC]], August 11, 2020 » [[Memory]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[GPU]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75606 Transhuman Chess with NN and RL...] by [[Srdja Matovic]], [[CCC]], October 30, 2020 » [[Neural Networks|NN]], [[Reinforcement Learning|RL]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=7&t=76286 From Esoteric to Transcendental Chess Programming?] by [[Srdja Matovic]], [[CCC]], January 12, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]], [[GPU]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=7&t=80364 NNOM++ - Move Ordering Neural Networks?] by [[Srdja Matovic]], [[CCC]], July 24, 2022 <br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=81858 The Next Big Thing in Computer Chess?] by [[Srdja Matovic]], [[CCC]], April 12, 2023 » [[Artificial Intelligence]], [[Programming]], [[Hardware]]<br />
* [https://talkchess.com/viewtopic.php?t=83267 Fruit fly races on steroids?] by [[Srdja Matovic]], [[CCC]], January 29, 2024<br />
<br />
=External Links=<br />
* [https://gitlab.com/smatovic smatovic · GitLab]<br />
* [https://zeta-chess.app26.de/ Zeta Chess Blog]<br />
* [https://eta-chess.app26.de/ Eta Chess Blog]<br />
* [http://www.inet.haw-hamburg.de/members/alumni-1/srdja-matovic Srdja Matovic — Internet Technologies Research Group - INET]<br />
<br />
=References= <br />
<references /><br />
'''[[People|Up one Level]]'''<br />
[[Category:Chess Programmer|Matovic]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=CCC&diff=26921CCC2024-03-09T07:53:01Z<p>Smatovic: /* 2020 ... */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, TC, TalkChess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established in 1997. <br />
<br />
=Discrimination=<br />
From May 19, 2020 at the latest until February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref> for [https://en.wikipedia.org/wiki/Denial-of-service_attack DDoS] protection. See [http://www.pkoziol.cal24.pl/rodent/talkchess.htm Appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [https://www.talkchess.com/ Computer Chess Club (CCC)]<br />
: [https://talkchess.com/viewforum.php?f=2 CCC - General Topics]<br />
: [https://talkchess.com/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [https://talkchess.com/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by a group of "founders", basically to get away from the [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum used its own code developed by [[Steven Schwartz]]; testing of the forum software had already occurred in September. CCC has been hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref> and was initially located at http://www.icdchess.com. Due to server limits, old contents were frequently removed, saved as backups, and stored on other servers, mostly on the FTP server of [[Robert Hyatt]]. In 2006 the forum was switched to phpBB (a freeware forum package), the server was upgraded, and the forum moved to a new domain at http://www.talkchess.com. However, the old contents could not be converted, and the forum continued with a completely new, empty database. [[Sean T Mintz]] has published the old contents as an archive <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024 the forum is in its third incarnation, led by a second "founders group" and supported by community members <ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28, 1998</ref>, since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802 CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 February 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, February 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], CCC, May 10, 2018<br />
<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
* [https://talkchess.com/viewtopic.php?t=83366 *** TALKCHESS SERVER TO BE SHUT DOWN ***] by [[Harm Geert Muller]], [[CCC]], February 19, 2024<br />
<br />
==2024 ...==<br />
* [https://talkchess.com/viewtopic.php?p=959266#p959266 Re: New forum] by [[Harm Geert Muller]], [[CCC]], March 01, 2024<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=Main_Page&diff=26920Main Page2024-03-06T06:08:15Z<p>Smatovic: /* Up-To-Date Best Practices */</p>
<hr />
<div>The '''Chess Programming Wiki''' is a repository of information about [[Programming|programming]] computers to play [[Chess|chess]]. Our goal is to provide a reference for every aspect of chess programming, with information about [[:Category:Programmer|programmers]], [[:Category:Researcher|researchers]] and [[Engines|engines]]. You'll find different ways to implement [[Late Move Reductions|LMR]] and [[Bitboards|bitboard]] stuff like [[Best Magics so far|best magics]] for the most dense [[Magic Bitboards|magic bitboard]] tables. For didactic purposes, the [[CPW-Engine]] has been developed by some wiki members. You can start browsing using the left-hand navigation bar. All of our content is arranged hierarchically, so you can see every page by following just those links. If you are looking for a specific page or catchword you can use the search box at the top.<br />
<br />
CPW was founded by [[Mark Lefler]] on October 26, 2007 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17344&start=4 Re: community test result web page?] by [[Mark Lefler]], [[CCC]], October 26, 2007</ref>, first hosted on [https://en.wikipedia.org/wiki/Wikispaces Wikispaces] <ref>[http://web.archive.org/web/20180216204915/http://chessprogramming.wikispaces.com/ Wikispaces Chessprogramming - home] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine], February 16, 2018)</ref>. Due to that site's closure <ref>[http://www.talkchess.com/forum/viewtopic.php?t=66573 Chess Programming Wiki] by [[Jon Dart]], [[CCC]], February 12, 2018</ref>, it moved to its present host at '''www.chessprogramming.org'''.<br />
<br />
=Up-To-Date Best Practices=<br />
Over time the computer chess community has moved on from Usenet to bulletin boards and, more recently, to chat groups. Computer chess programming is actively discussed in several Discord channels:<br />
<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82700 Discord Channels] by [[Srdja Matovic]] on [[CCC]], October 11, 2023<br />
<br />
=Hot Topics=<br />
Topics people frequently search for and discuss:<br />
* [[Stockfish]]<br />
* [[NNUE]]<br />
* [[Leela Chess Zero]]<br />
* [[Syzygy Bases]]<br />
<br />
=Basics=<br />
* [[Getting Started]] - if you are new to chess programming<br />
* [[Board Representation]]<br />
* [[Search]]<br />
* [[Evaluation]]<br />
* [[Opening Book]]<br />
* [[Endgame Tablebases]]<br />
<br />
=Principal Topics=<br />
* [[Chess]] <br />
* [[Programming]]<br />
* [[Artificial Intelligence]]<br />
* [[Knowledge]]<br />
* [[Learning]]<br />
* [[Engine Testing|Testing]]<br />
* [[Automated Tuning|Tuning]]<br />
* [[User Interface]]<br />
* [[Protocols]]<br />
<br />
=Lists=<br />
* [[Cartoons]]<br />
* [[Computer Chess Forums]]<br />
* [[Conferences]]<br />
* [[Dictionary]]<br />
* [[Engines]] including the [[CPW-Engine]]<br />
** [[Dedicated Chess Computers]]<br />
** [[Engine Releases]]<br />
* [[Games]], some other [[Artificial Intelligence|AI]] games from which computer chess may borrow some ideas <br />
* [[Hardware]]<br />
* [[History]]<br />
* [[Organizations]]<br />
* [[People]]<br />
* [[Periodical]]<br />
* [[Software]]<br />
* [[Tournaments and Matches]]<br />
<br />
=Miscellaneous=<br />
* [[Acknowledgments]]<br />
* [[Recommended Reading]]<br />
<br />
=Statistics=<br />
* Articles: {{NUMBEROFARTICLES}}<br />
* Pages: {{NUMBEROFPAGES}}<br />
* Files: {{NUMBEROFFILES}}<br />
<br />
=Thanks=<br />
Thanks for visiting our site!<br />
We hope you like the work we have done.<br />
<br />
[[Mark Lefler]] and the rest of the CPW team.<br />
<br />
=References=<br />
<references /><br />
[[Category:Root]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=CCC&diff=26919CCC2024-03-04T18:54:33Z<p>Smatovic: /* CCC History */ typo</p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, TC, TalkChess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established in 1997. <br />
<br />
=Discrimination=<br />
From May 19, 2020 at the latest until February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref> for [https://en.wikipedia.org/wiki/Denial-of-service_attack DDoS] protection. See [http://www.pkoziol.cal24.pl/rodent/talkchess.htm Appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [https://www.talkchess.com/ Computer Chess Club (CCC)]<br />
: [https://talkchess.com/viewforum.php?f=2 CCC - General Topics]<br />
: [https://talkchess.com/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [https://talkchess.com/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by a group of "founders", basically to get away from the [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum used its own code developed by [[Steven Schwartz]]; testing of the forum software had already occurred in September. CCC has been hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref> and was initially located at http://www.icdchess.com. Due to server limits, old contents were frequently removed, saved as backups, and stored on other servers, mostly on the FTP server of [[Robert Hyatt]]. In 2006 the forum was switched to phpBB (a freeware forum package), the server was upgraded, and the forum moved to a new domain at http://www.talkchess.com. However, the old contents could not be converted, and the forum continued with a completely new, empty database. [[Sean T Mintz]] has published the old contents as an archive <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024 the forum is in its third incarnation, led by a second "founders group" and supported by community members <ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28, 1998</ref>, since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802 CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 February 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, February 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder|Ed Schröder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], CCC, May 10, 2018<br />
<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
==2024 ...==<br />
* [https://talkchess.com/viewtopic.php?p=959266#p959266 Re: New forum] by [[Harm Geert Muller]], [[CCC]], March 01, 2024<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=CCC&diff=26918CCC2024-03-04T13:03:18Z<p>Smatovic: /* 2024 ... */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, TC, TalkChess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established since 1997. <br />
<br />
=Discrimination=<br />
From May 19, 2020 at latest to February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP-ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref> for [https://en.wikipedia.org/wiki/Denial-of-service_attack DDoS] protection. See [http://www.pkoziol.cal24.pl/rodent/talkchess.htm Appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [https://www.talkchess.com/ Computer Chess Club (CCC)]<br />
: [https://talkchess.com/viewforum.php?f=2 CCC - General Topics]<br />
: [https://talkchess.com/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [https://talkchess.com/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by the group of "founders", basically to get rid from [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum was used its own code developed by [[Steven Schwartz]]. Testing of the forum software already occurred in September. CCC has been hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref> and firstly located at http://www.icdchess.com. Due to the limit of the server, old contents were frequently removed, saved as backups, and stored in other servers, mostly in the FTP server of [[Robert Hyatt]]. In 2006 the forum was changed to use phpBB code (a freeware forum), upgraded the server, and moved to a new domain at http://www.talkchess.com. However, all old contents could not be converted and the forum was continued with a totally new, empty database. [[Sean T Mintz]] had published old contents as an archive one <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024 the forum is in it's third incarnation, lead by a second "founders group" and supported by the community members<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df|computer-chess club]] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28 1998</ref> , since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802|CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 Februray 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, Februray 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], CCC, May 10, 2018<br />
<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
==2024 ...==<br />
* [https://talkchess.com/viewtopic.php?p=959266#p959266 Re: New forum] by [[Harm Geert Muller]], [[CCC]], March 01, 2024<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>
Smatovic
https://www.chessprogramming.org/index.php?title=CCC&diff=26917
CCC, 2024-03-04T12:55:42Z
<p>Smatovic: /* 2010 ... */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, TC, TalkChess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established in 1997.<br />
<br />
=Discrimination=<br />
From May 19, 2020 at the latest until February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref> as [https://en.wikipedia.org/wiki/Denial-of-service_attack DDoS] protection. See the [http://www.pkoziol.cal24.pl/rodent/talkchess.htm appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [https://www.talkchess.com/ Computer Chess Club (CCC)]<br />
: [https://talkchess.com/viewforum.php?f=2 CCC - General Topics]<br />
: [https://talkchess.com/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [https://talkchess.com/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by a group of "founders", primarily to get away from the [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum used its own code developed by [[Steven Schwartz]]; testing of the forum software had already taken place in September. CCC was hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref>, and initially located at http://www.icdchess.com. Due to server limits, old content was frequently removed, saved as backups, and stored on other servers, mostly on the FTP server of [[Robert Hyatt]]. In 2006 the forum was switched to phpBB (a free forum package), the server was upgraded, and the site moved to a new domain at http://www.talkchess.com. However, the old content could not be converted, and the forum continued with a completely new, empty database. [[Sean T Mintz]] has published the old content as an archive <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024 the forum is in its third incarnation, led by a second "founders group" and supported by community members <ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, at first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28, 1998</ref>, and since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802 CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 February 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, February 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], CCC, May 10, 2018<br />
<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
==2024 ...==<br />
* [https://talkchess.com/viewtopic.php?p=959266#p959266 Re: New forum] by [[Harm Geert Muller]], [[CCC]], March 01, 2024<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>
Smatovic
https://www.chessprogramming.org/index.php?title=CCC&diff=26916
CCC, 2024-03-04T12:55:16Z
<p>Smatovic: /* CCC History */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, TC, TalkChess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established in 1997.<br />
<br />
=Discrimination=<br />
From May 19, 2020 at the latest until February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref> as [https://en.wikipedia.org/wiki/Denial-of-service_attack DDoS] protection. See the [http://www.pkoziol.cal24.pl/rodent/talkchess.htm appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [https://www.talkchess.com/ Computer Chess Club (CCC)]<br />
: [https://talkchess.com/viewforum.php?f=2 CCC - General Topics]<br />
: [https://talkchess.com/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [https://talkchess.com/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by a group of "founders", primarily to get away from the [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum used its own code developed by [[Steven Schwartz]]; testing of the forum software had already taken place in September. CCC was hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref>, and initially located at http://www.icdchess.com. Due to server limits, old content was frequently removed, saved as backups, and stored on other servers, mostly on the FTP server of [[Robert Hyatt]]. In 2006 the forum was switched to phpBB (a free forum package), the server was upgraded, and the site moved to a new domain at http://www.talkchess.com. However, the old content could not be converted, and the forum continued with a completely new, empty database. [[Sean T Mintz]] has published the old content as an archive <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024 the forum is in its third incarnation, led by a second "founders group" and supported by community members <ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, at first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28, 1998</ref>, and since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802 CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 February 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, February 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], [[CCC]], May 10, 2018<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
==2024 ...==<br />
* [https://talkchess.com/viewtopic.php?p=959266#p959266 Re: New forum] by [[Harm Geert Muller]], [[CCC]], March 01, 2024<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=CCC&diff=26915CCC2024-03-04T12:52:44Z<p>Smatovic: /* 2020 ... */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, TC, TalkChess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established in 1997. <br />
<br />
=Discrimination=<br />
From at latest May 19, 2020, until February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP-ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref> as [https://en.wikipedia.org/wiki/Denial-of-service_attack DDoS] protection. See [http://www.pkoziol.cal24.pl/rodent/talkchess.htm Appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [https://www.talkchess.com/ Computer Chess Club (CCC)]<br />
: [https://talkchess.com/viewforum.php?f=2 CCC - General Topics]<br />
: [https://talkchess.com/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [https://talkchess.com/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by a group of "founders", essentially to escape the [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum ran its own code, developed by [[Steven Schwartz]]; testing of the forum software had already taken place in September. CCC has been hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref>, and was first located at http://www.icdchess.com. Due to server limitations, old contents were frequently removed, saved as backups, and stored on other servers, mostly on the FTP server of [[Robert Hyatt]]. In 2006 the forum switched to phpBB (a free forum package), upgraded its server, and moved to a new domain at http://www.talkchess.com. However, the old contents could not be converted, and the forum continued with a completely new, empty database. [[Sean T Mintz]] has published the old contents as an archive <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024, the forum is in its third incarnation, led by a second "founders group" and supported by community members <ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], [[CCC]], March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], [[CCC]], March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, at first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28 1998</ref>, and since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802 CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 February 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, February 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], [[CCC]], May 10, 2018<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
==2024 ...==<br />
* [https://talkchess.com/viewtopic.php?p=959266#p959266 Re: New forum] by [[Harm Geert Muller]], [[CCC]], March 01, 2024<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=CCC&diff=26914CCC2024-03-04T12:45:44Z<p>Smatovic: </p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, TC, TalkChess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established in 1997. <br />
<br />
=Discrimination=<br />
From at latest May 19, 2020, until February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP-ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref> as [https://en.wikipedia.org/wiki/Denial-of-service_attack DDoS] protection. See [http://www.pkoziol.cal24.pl/rodent/talkchess.htm Appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [https://www.talkchess.com/ Computer Chess Club (CCC)]<br />
: [https://talkchess.com/viewforum.php?f=2 CCC - General Topics]<br />
: [https://talkchess.com/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [https://talkchess.com/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by a group of "founders", essentially to escape the [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum ran its own code, developed by [[Steven Schwartz]]; testing of the forum software had already taken place in September. CCC has been hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref>, and was first located at http://www.icdchess.com. Due to server limitations, old contents were frequently removed, saved as backups, and stored on other servers, mostly on the FTP server of [[Robert Hyatt]]. In 2006 the forum switched to phpBB (a free forum package), upgraded its server, and moved to a new domain at http://www.talkchess.com. However, the old contents could not be converted, and the forum continued with a completely new, empty database. [[Sean T Mintz]] has published the old contents as an archive <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024, the forum is in its third incarnation, led by a second "founders group" and supported by community members <ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], [[CCC]], March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], CCC, March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28, 1998</ref>, and since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802 CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 February 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, February 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], [[CCC]], May 10, 2018<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, Talkchess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established since 1997. <br />
<br />
=Discrimination=<br />
From May 19, 2020 at latest to February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP-ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref> for [https://en.wikipedia.org/wiki/Denial-of-service_attack DDoS] protection. See [http://www.pkoziol.cal24.pl/rodent/talkchess.htm Appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [https://www.talkchess.com/ Computer Chess Club (CCC)]<br />
: [https://talkchess.com/viewforum.php?f=2 CCC - General Topics]<br />
: [https://talkchess.com/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [https://talkchess.com/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by the group of "founders", basically to get rid from [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum was used its own code developed by [[Steven Schwartz]]. Testing of the forum software already occurred in September. CCC has been hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref> and firstly located at http://www.icdchess.com. Due to the limit of the server, old contents were frequently removed, saved as backups, and stored in other servers, mostly in the FTP server of [[Robert Hyatt]]. In 2006 the forum was changed to use phpBB code (a freeware forum), upgraded the server, and moved to a new domain at http://www.talkchess.com. However, all old contents could not be converted and the forum was continued with a totally new, empty database. [[Sean T Mintz]] had published old contents as an archive one <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024 the forum is in it's third incarnation, lead by a second "founders group" and supported by the community members<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], [CCC], March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df|computer-chess club]] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], [CCC], March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28 1998</ref> , since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802|CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 Februray 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, Februray 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], [[CCC]], May 10, 2018<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=CCC&diff=26912CCC2024-03-04T12:39:35Z<p>Smatovic: /* Discrimination */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, Talkchess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established in 1997. <br />
<br />
=Discrimination=<br />
From no later than May 19, 2020 until February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref> as [https://en.wikipedia.org/wiki/Denial-of-service_attack DDoS] protection. See the [http://www.pkoziol.cal24.pl/rodent/talkchess.htm Appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [http://www.talkchess.com/forum/index.php Computer Chess Club (CCC)]<br />
: [http://www.talkchess.com/forum/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [http://www.talkchess.com/forum/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by a group of "founders", essentially to escape the [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum ran on its own code developed by [[Steven Schwartz]]; testing of the forum software had already taken place in September. CCC was hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref>, and initially located at http://www.icdchess.com. Due to server limitations, old contents were frequently removed, saved as backups, and stored on other servers, mostly on the FTP server of [[Robert Hyatt]]. In 2006 the forum switched to phpBB (a free forum package), upgraded its server, and moved to a new domain at http://www.talkchess.com. However, the old contents could not be converted, and the forum continued with a completely new, empty database. [[Sean T Mintz]] later published the old contents as an archive <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024 the forum is in its third incarnation, led by a second "founders group" and supported by community members<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], [[CCC]], March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], [[CCC]], March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28, 1998</ref>, and since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802 CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 February 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, February 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], [[CCC]], May 10, 2018<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=CCC&diff=26911CCC2024-03-04T12:34:52Z<p>Smatovic: /* CCC History */ added second founders</p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, Talkchess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established in 1997. <br />
<br />
=Discrimination=<br />
From May 19, 2020 until no later than February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref>. See the [http://www.pkoziol.cal24.pl/rodent/talkchess.htm Appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [http://www.talkchess.com/forum/index.php Computer Chess Club (CCC)]<br />
: [http://www.talkchess.com/forum/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [http://www.talkchess.com/forum/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by a group of "founders", essentially to escape the [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum ran on its own code developed by [[Steven Schwartz]]; testing of the forum software had already taken place in September. CCC was hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref>, and initially located at http://www.icdchess.com. Due to server limitations, old contents were frequently removed, saved as backups, and stored on other servers, mostly on the FTP server of [[Robert Hyatt]]. In 2006 the forum switched to phpBB (a free forum package), upgraded its server, and moved to a new domain at http://www.talkchess.com. However, the old contents could not be converted, and the forum continued with a completely new, empty database. [[Sean T Mintz]] later published the old contents as an archive <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>. Since February 29, 2024 the forum is in its third incarnation, led by a second "founders group" and supported by community members<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], [[CCC]], March 02, 2024</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==<span id="Second Founders"></span>Second Founders== <br />
<ref>[https://talkchess.com/viewtopic.php?p=959283#p959283 domain ownership] by [[Harm Geert Muller]], [[CCC]], March 02, 2024</ref><br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
* [[Harm Geert Muller]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28, 1998</ref>, and since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802 CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 Februray 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, Februray 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], [[CCC]], May 10, 2018<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=CCC&diff=26910CCC2024-03-04T12:17:15Z<p>Smatovic: /* Discrimination */ CCC is not IP blocked anymore</p>
<hr />
<div>'''[[Main Page|Home]] * [[Computer Chess Forums]] * Computer Chess Club (CCC)'''<br />
<br />
The '''Computer Chess Club''' (CCC, Talkchess) is a [https://en.wikipedia.org/wiki/Internet_forum#Moderators moderated] computer chess [https://en.wikipedia.org/wiki/Internet_forum forum], established in 1997. <br />
<br />
=Discrimination=<br />
From May 19, 2020 until no later than February 29, 2024, the forum was [https://en.wikipedia.org/wiki/IP_address_blocking blocked] for several [https://en.wikipedia.org/wiki/IP_address IP ranges] [https://en.wikipedia.org/wiki/Geo-blocking outside the US] <ref>[http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020</ref>. See the [http://www.pkoziol.cal24.pl/rodent/talkchess.htm appeal] by [[Pawel Koziol]].<br />
<br />
=CCC Links= <br />
* [http://www.talkchess.com/forum/index.php Computer Chess Club (CCC)]<br />
: [http://www.talkchess.com/forum/viewforum.php?f=7 CCC - Programming and Technical Discussions]<br />
: [http://www.talkchess.com/forum/viewforum.php?f=6 CCC - Tournaments and Matches]<br />
* [https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013</ref> <ref>[http://www.top-5000.nl/ccc.htm CCC old archives utility] by [[Ed Schroder|Ed Schröder]]</ref> <br />
: [https://www.stmintz.com/ccc/index.php?offset=67575 CCC September/October 1997]<br />
<br />
=CCC History= <br />
CCC was established in October 1997 by a group of "founders", essentially to get away from the [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 flaming] in [[Computer Chess Forums|rgcc]] <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997</ref>. The forum ran on its own code developed by [[Steven Schwartz]]; testing of the forum software had already taken place in September. CCC has been hosted by IcdChess, [[ChessUSA]] <ref>[http://www.chessusa.com/ Chess Sets from America's Largest Chess Store, Chess Pieces, Boards, & More]</ref>, and was first located at http://www.icdchess.com. Due to server capacity limits, old content was frequently removed, saved as backups, and stored on other servers, mostly on the FTP server of [[Robert Hyatt]]. In 2006 the forum switched to phpBB (free forum software), upgraded its server, and moved to a new domain at http://www.talkchess.com. However, the old content could not be converted, and the forum continued with a completely new, empty database. [[Sean T Mintz]] has since published the old content as an archive <ref>[https://www.stmintz.com/ccc/ Computer Chess Club Archives] hosted by [http://www.stmintz.com/ Sean T Mintz]</ref>.<br />
<br />
==<span id="Founders"></span>The Founders== <br />
<ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref><br />
* [[Moritz Berger]]<br />
* [[Thorsten Czub]]<br />
* [[Dirk Frickenschmidt]]<br />
* [[Robert Hyatt|Bob Hyatt]]<br />
* [[Enrique Irazoqui]]<br />
* [[Andreas Mader]]<br />
* [[Bruce Moreland]]<br />
* [[Peter Schreiner]]<br />
* [[Ed Schroder|Ed Schröder]]<br />
* [[Chris Whittington]]<br />
<br />
==Moderators== <br />
Three moderators were periodically elected, first by the founders <ref>[http://groups.google.com/group/rec.games.chess.computer/msg/9a4cf7fa917de624 Re: From a funny German to Dirk Frickenschmidt (Germany)] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], September 28 1998</ref> , since May 1998 by member voting from a pool of nominees <ref>[http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998</ref>.<br />
{| class="wikitable"<br />
! Announcement<br />
!<br />
!<br />
! Moderators<br />
!<br />
|- <br />
| 1997 <br />
| <ref>[http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=454208&t=42810 Re: Past CCC Moderators] by [[Ed Schroder|Ed Schröder]], CCC, March 09, 2012</ref> <br />
| [[Dirk Frickenschmidt]] <br />
| [[Robert Hyatt]]<br />
| [[Enrique Irazoqui]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=19714 June 01, 1998] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=19714 Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 01, 1998</ref><br />
| [[Amir Ban]]<br />
| [[Bruce Moreland]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=37918 December 30, 1998]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=37918 CCC Moderators Elected!!] by [[Steven Schwartz]], CCC, December 30, 1998</ref> <br />
| [[Peter McKenzie]] <br />
| [[Harald Faber]] <br />
| [[Will Singleton]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=57854 June 24, 1999]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=57854 CCC Moderator Voting Results Now Available...] by [[Steven Schwartz]], CCC, June 24, 1999</ref><br />
| [[Bruce Moreland]] <br />
| [[Fernando Villegas]] <br />
| [[KarinsDad]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=61439 July 22, 1999] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=61439 I resign the Post as Moderator] by [[Fernando Villegas]], July 22, 1999</ref> <br />
| [[Bruce Moreland]] <br />
| [[KarinsDad]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=95753 February 08, 2000]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=95753 CCC Moderator Election Results!] by [[Steven Schwartz]], CCC, February 08, 2000</ref><br />
| [[Dave Gomboc]] <br />
| [[Tom Kerrigan]] <br />
| [[Andrew Williams]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=124751 August 16, 2000] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=124751 CCC Elections: Results are in!!!] by [[Steven Schwartz]], CCC, August 16, 2000</ref><br />
| [[Robert Hyatt]] <br />
| [[Đorđe Vidanović]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=160677 March 28, 2001] <br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=160677 Elections are now over and the winners are....] by [[Steven Schwartz]], CCC, March 28, 2001</ref> <br />
| [[Robert Hyatt]] <br />
| [[Bruce Moreland]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=190802 September 28, 2001]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=190802 CCC Moderator Elections over. Results...] by [[Steven Schwartz]], CCC, September 28, 2001</ref> <br />
| [[Ed Schroder|Ed Schröder]]<br />
| [[John Merlino]]<br />
| [[Uri Blass]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=224162 April 17, 2002]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=224162 Elections Over. And the New Moderators Are...] by [[Steven Schwartz]], CCC, April 17, 2002</ref> <br />
| [[Christophe Théron]] <br />
| [[Robert Hyatt]] <br />
| [[Dann Corbit]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=295830 May 07, 2003]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=295830 CCC Moderator Election Results....] by [[Steven Schwartz]], CCC, May 07, 2003</ref><br />
| [[Dan Andersson]]<br />
| [[Tony Hedlund]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=354167 March 12, 2004]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=354167 CCC Moderator Elections Have Ended. The Winners are....] by [[Steven Schwartz]], CCC, March 12, 2004</ref><br />
| [[Michael Byrne]]<br />
| [[Richard Pijl]]<br />
| [[Russell Reagan]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=413012 February 21, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=413012 I'm back.... Elections and stuff] by [[Steven Schwartz]], CCC, February 21, 2005</ref><br />
| [[John Merlino]]<br />
| [[Dann Corbit]]<br />
| [[Michael Byrne]]<br />
|-<br />
| [http://www.stmintz.com/ccc/index.php?id=452247 September 29, 2005]<br />
| <ref>[http://www.stmintz.com/ccc/index.php?id=452247 CCC and CTF Elections are over. And the winners are......] by [[Steven Schwartz]], CCC, September 29, 2005</ref> <br />
| [[Robert Hyatt]]<br />
| [[Graham Banks]]<br />
| [[Peter Skinner]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=15918 August 27, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=15918 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, August 27, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Albert Silver]]<br />
| [[Thorsten Czub]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=17537 November 02, 2007]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17537 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, November 02, 2007</ref><br />
| [[Graham Banks]]<br />
| [[Ryan Benitez]]<br />
| [[Albert Silver]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=21287 May 22, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=21287 ** CCC Moderator Election--Results **] by TCAdmin, CCC, May 22, 2008</ref><br />
| [[Chris Whittington]]<br />
| [[Thorsten Czub]] <br />
| [[Swaminathan Natarajan]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=25006 November 23, 2008]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=25006 ** Moderator Election - Official Poll **] by TCAdmin, CCC, November 23, 2008</ref><br />
| [[Steve Blincoe]] <br />
| [[Zach Wegner]] <br />
| [[Volker Pittlik]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=28414 June 15, 2009]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=28414 ** Moderator Election - Official Poll **] by TCAdmin, CCC, June 15, 2009</ref> <br />
| [[Dann Corbit]]<br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=31783 January 17, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=31783 ** CCC Moderator Election--Member Voting **] by TCAdmin, CCC, January 17, 2010</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=323300&t=31970 Re: Congratulations to the new moderators] by [[Matthias Gemuh]], CCC, January 24, 2010</ref><br />
| [[Graham Banks]]<br />
| [[Swaminathan Natarajan]]<br />
| [[Jeremy Bernstein]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=34840 Goodbye Talkchess] by [[Jeremy Bernstein]], CCC, June 09, 2010</ref><br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=35376 July 11, 2010]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=35376 ** CCC Moderator Election - Results **] by TCAdmin, CCC, July 11, 2010</ref><br />
| [[Steve Blincoe]]<br />
| [[Robert Hyatt]]<br />
| [[Fernando Villegas]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=37566 January 11, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=37566 ** CCC Moderator Election Poll **] by TCAdmin, CCC, January 11, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=39833 July 24, 2011]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=39833 ** CCC Moderator Election Poll **] by TCAdmin, CCC, July 24, 2011</ref><br />
| [[Fernando Villegas]]<br />
| [[Robert Hyatt]]<br />
| [[Roger Brown]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=42362 February 12, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=42362 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, February 06, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Miguel A. Ballicora]]<br />
| [[Don Dailey]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=44655 August 08, 2012]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=44655 *** CCC Moderator Election - Member Voting ***] by TCAdmin, CCC, August 01, 2012</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=47478 March 18, 2013]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47478 *** CCC Moderator Election - Member Voting ***] by Sam Hull, CCC, March 11, 2013</ref><br />
| [[Julien Marcel]]<br />
| [[Adam Hair]]<br />
| [[Miguel A. Ballicora]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=51809 April 07, 2014]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=51809 ** CCC Moderator Election--Member Voting **] by Sam Hull, CCC, March 31, 2014</ref><br />
| [[Roger Brown]]<br />
| [[Fabien Letouzey]]<br />
| [[John Merlino]]<br />
|-<br />
| [http://www.talkchess.com/forum/viewtopic.php?t=60564 July 01, 2016]<br />
| <ref>[http://www.talkchess.com/forum/viewtopic.php?t=60564 ** Moderator Election - Final Results **] by Sam Hull, CCC, June 23, 2016</ref><br />
| [[Robert Hyatt]]<br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
|-<br />
| [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 November 10, 2020]<br />
| <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75540 ** Moderator Election Results **] by Sam Hull, CCC, October 25, 2020</ref><br />
| [[Harm Geert Muller]]<br />
| [[Harvey Williamson]]<br />
| [[Dann Corbit]]<br />
|}<br />
<br />
=Forum Posts= <br />
==1997== <br />
* [http://groups.google.com/group/rec.games.chess.computer/msg/f128c68d914ffe39 Re: It got hijacked by the bores] by [[Andreas Mader]], [[Computer Chess Forums|rgcc]], September 5, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/a5c9eaabca1ea41e a new place to discuss (!)] by [[Rolf Tüschen|Rolf Tueschen]], [[Computer Chess Forums|rgcc]], October 8, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/7c5b47d1a27a42df computer-chess club] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], October 10, 1997<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/05debe01992e594d Computer-Chess Club] by [[Steven Schwartz]], [[Computer Chess Forums|rgcc]], October 17, 1997<br />
==1998== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/41a10a903c89a683 Rebuttal to Chris Whittington] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|rgcc]], March 17, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=16547 Double standards on CCC?] by [[Ed Schroder|Ed Schröder]], CCC, April 05, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/5d9596b0d56323fd Computer-Chess Club (CCC) Our 6 Month Anniversary!] by Computer-Chess Club, [[Computer Chess Forums|rgcc]], April 06, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17234 Anonymous posting] by [[Bruce Moreland]], CCC, April 20, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=17865 Important Message from CCC Founder Group!], CCC, May 01, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=18712 Moderator Nominees, The Vote, My Personal Thoughts...] by [[Steven Schwartz]], CCC, May 15, 1998<br />
* [http://groups.google.com/group/rec.games.chess.misc/browse_frm/thread/a4d26edb548bd78a Talk Chess Computers - Join CCC (Free)] by Computer-Chess Club, [[Computer Chess Forums|rgcm]], June 5, 1998<br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/79a90bb9548486fe Formal Appeal to CCC Moderators] by [[Chris Whittington]], [[Computer Chess Forums|rgcc]], August 16, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=27390 Moderation policy] by [[Don Beal]], CCC, September 21, 1998<br />
* [http://www.stmintz.com/ccc/index.php?id=35795 Moderator questions] by [[Bruce Moreland]], CCC, December 12, 1998<br />
==1999== <br />
* [http://groups.google.com/group/rec.games.chess.computer/browse_frm/thread/1f7be8122a1a672a final post] by [[Robert Hyatt]], [[Computer Chess Forums|rgcc]], March 25, 1999<br />
==2000 ...== <br />
* [http://www.stmintz.com/ccc/index.php?id=117002 Hans Gerber is suspended--thanks for your understanding] by [[Tom Kerrigan]], CCC, June 29, 2000<br />
* [http://www.talkchess.com/forum/viewtopic.php?p=156633 Explaination of the TCadmin account] by Quentin Turner, CCC, November 02, 2007<br />
==2010 ...== <br />
* [http://www.talkchess.com/forum/viewtopic.php?t=35117 A Question for Our Sponsor..IPPO Links OK or Not?] by [[Steve Blincoe|Steve B]], CCC, June 24, 2010 » [[Ippolit]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=49253 CCC old archives utility] by [[Ed Schroder]], CCC, September 05, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=59478 This forum is 10 years old] by James Constance, CCC, March 11, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60247 To MODERATORS: Please STOP moving genuine tournaments!] by [[Torsten Schoop|Dr. Torsten Schoop]], CCC, May 23, 2016 » [[Tournaments and Matches]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66849 Thank You Your Move Chess & Games] by [[Mark Lefler]], CCC, March 16, 2018 » [[ChessUSA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=5&t=67407 Real names] by [[Martin Sedlak]], [[CCC]], May 10, 2018<br />
==2020 ...==<br />
* [http://hiarcs.net/forums/viewtopic.php?t=9998 talkchess not accessible from France] by [[Vivien Clauzon]], [[Computer Chess Forums|Hiarcs Forum]], May 19, 2020<br />
* [http://www.open-chess.org/viewtopic.php?f=3&t=3238 Talkchess forum is inaccessible from India] by P. Kumar, [[Computer Chess Forums|OpenChess Forum]], May 21, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74393 Polish users cut off from TalkChess] by [[Dann Corbit]], [[CCC]], July 06, 2020<br />
* [https://prodeo.actieforum.com/t118-my-statement-from-rodent-homepage My statement from Rodent homepage] by [[Pawel Koziol]], [[Computer Chess Forums|ProDeo Forum]], December 05, 2020<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78120 Task force TalkChess access] by [[Harm Geert Muller]], [[CCC]], September 06, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=78174 On the ownership of TakChess] by [[Harm Geert Muller]], [[CCC]], September 12, 2021<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82146 Crowd funding initiative to get our hands on the talkchess domain name] by [[Rebel]], [[CCC]], June 07, 2023<br />
<br />
=External Links= <br />
* [https://de-de.facebook.com/Talkchess Talkchess | Facebook]<br />
<br />
=References= <br />
<references /><br />
'''[[Computer Chess Forums|Up one level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=Computer_Chess_Forums&diff=26908Computer Chess Forums2024-02-22T08:20:08Z<p>Smatovic: /* Endgame Tablebases */ updated CCRL EGTB</p>
<hr />
<div>'''[[Main Page|Home]] * Computer Chess Forums'''<br />
<br />
[[FILE:Forum romanum 6k (5760x2097).jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Roman_Forum The Forum Romanum] in [https://en.wikipedia.org/wiki/Rome Rome] <ref>HDR panoramic view out of 9 pictures (3 exposures at 3 different angles). [https://commons.wikimedia.org/wiki/File:Forum_romanum_6k_(5760x2097).jpg Picture] taken from the [https://en.wikipedia.org/wiki/Capitoline_Museums Capitoline Museums] by [https://commons.wikimedia.org/wiki/User:BeBo86 BeBo86], June 02, 2012, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
<br />
=Chess Programming Forums= <br />
Some of the more popular computer chess programming related [https://en.wikipedia.org/wiki/Internet_forum forums]:<br />
* [https://banksiagui.com/forums/ BanksiaGUI Forum]<br />
* [[CCC]] Computer Chess Club<br />
* [http://www.open-chess.org/ OpenChess - Independent Computer Chess Discussion Forum]<br />
* [http://www.open-aurec.com/wbforum/ Winboard Forum]<br />
: [http://www.open-aurec.com/wbforum/viewforum.php?f=4 Winboard Forum - Programming and Technical Discussions] (Winboard Programming Forum)<br />
: [http://www.open-aurec.com/wbforum/viewforum.php?f=18 Winboard Forum - Archive (Old Parsimony Forum)]<br />
* [https://www.reddit.com/r/chessprogramming/ Reddit chess programming]<br />
* [https://www.reddit.com/r/ComputerChess/ Reddit computer chess in general]<br />
<br />
=Discord Channels= <br />
Computer chess is actively discussed in several Discord channels:<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82700 Discord Channels] by [[Srdja Matovic]] on [[CCC]], October 11, 2023<br />
<br />
=Engine Forums= <br />
Some chess programs also have forums:<br />
* [https://groups.google.com/forum/#!forum/fishcooking FishCooking - Google Groups] » [[Stockfish]]<br />
* [https://groups.google.com/g/gnu.chess gnu.chess] » [[GNU Chess]]<br />
* [http://www.hiarcs.net/forums/ HIARCS Forum] » [[HIARCS]]<br />
* [https://groups.google.com/forum/#!forum/lczero LCZero – Google Groups] » [[Leela Chess Zero]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/forum_show.pl Rybka Forum] » [[Rybka]] (Closed since September 01, 2021)<br />
* [https://prodeo.actieforum.com/ ProDeo Forum] hosted and moderated by [[Ed Schroder]] » [[ProDeo]]<br />
<br />
=Chess Computers= <br />
* [https://www.schachcomputer.info/forum/ Schachcomputer.info Community] (mostly German)<br />
: [https://www.schachcomputer.info/forum/forumdisplay.php?f=52 Oldie & Retro Schachprogramme]<br />
<br />
=Endgame Tablebases= <br />
* [https://biokirr.com/Chess/CCRL-Forum/viewforum.php?f=6 CCRL Discussion Board - Endgame Tablebases]<br />
<br />
=Rating Lists= <br />
* [https://biokirr.com/Chess/CCRL-Forum/ CCRL] Discussion Board<br />
* [[WBEC|WBEC-Ridderkerk]] hosted by [[Leo Dijksman]]<br />
** [http://wbec-ridderkerk.forumotion.com/ WBEC-Ridderkerk forum]<br />
<br />
=News groups= <br />
Unmoderated [https://en.wikipedia.org/wiki/Google_Groups newsgroups] contain a lot of [https://en.wikipedia.org/wiki/Newsgroup_spam spam], but also a lot of valuable posts from the pre-CCC era in the archives:<br />
* [http://groups.google.com/group/rec.games.chess.computer/topics rec.games.chess.computer] (r.g.c.c, rgcc)<br />
* [http://groups.google.com/group/rec.games.chess.misc/topics rec.games.chess.misc]<br />
* [http://groups.google.com/group/rec.games.chess/topics rec.games.chess archive]<br />
<br />
=Game & AI Forums= <br />
* [https://www.game-ai-forum.org/index.php Game-AI Forum]<br />
* [https://groups.google.com/forum/#!forum/comp.sources.games comp.sources.games]<br />
* [https://groups.google.com/group/rec.games.abstract/topics rec.games.abstract]<br />
* [https://groups.google.com/group/rec.games.board rec.games.board]<br />
* [https://groups.google.com/forum/#!forum/rec.games.chinese-chess rec.games.chinese-chess] » [[Chinese Chess]]<br />
* [https://groups.google.com/forum/#!forum/rec.games.programmer rec.games.programmer]<br />
* [https://groups.google.com/forum/#!forum/shogi-l SHOGI-L]<br />
<br />
=National Forums= <br />
* [http://forum.computerschach.de/ CSS-Forum] (German)<br />
* [http://www.g-sei.org/?post_type=forum Forum « G 6] (Italian)<br />
* [http://kasparovchess.crestbook.com/forums/13/ KasparovChess] (Russian)<br />
* [https://forchess.ru/forumdisplay.php?f=24 Forchess] (Russian)<br />
* [http://www.foro.meca-web.es/index.php Meca Foro] (Spanish)<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Forum Forum from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Internet_forum Internet Forum from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Computer-mediated_communication Computer-mediated communication from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Electronic_mailing_list Electronic mailing list from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Social_networking_service Social networking service from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Usenet_newsgroup Usenet newsgroup from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Google_Groups Google Groups from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PhpBB phpBB from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tapatalk Tapatalk from Wikipedia]<br />
==Social and Ethic Aspects== <br />
* [https://en.wikipedia.org/wiki/Anonymity Anonymity from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Cyberculture Cyberculture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Ethics Ethics from Wikipedia]<br />
* [http://www.jeanweber.com/newsite/?page_id=22 Ethics in scientific and technical communication] by [http://www.oreillynet.com/pub/au/1899 Jean Hollis Weber], [http://www.wisenet-australia.org/WISENET Journal] 38, July 1995, pp. 2-4<br />
* [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 Flaming (Internet) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Forum_spam Forum spam from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Godwin%27s_law Godwin's law from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Newsgroup_spam Newsgroup spam from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Socialization Socialization from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Troll_%28Internet%29 Troll (Internet) from Wikipedia]<br />
<br />
==Language and Rhetoric Aspects== <br />
* [https://en.wikipedia.org/wiki/Ad_hominem Ad hominem from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Fallacy Fallacy from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Internet_slang Internet slang from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Irony Irony from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_fallacies List of fallacies from Wikipedia]<br />
* [http://translationjournal.net/journal/38fallacies.htm Logical Fallacies and Ethics in Everyday Language] by [http://translationjournal.net/journal/38fallacies.htm Elena Sgarbossa]<br />
* [https://en.wikipedia.org/wiki/Metaphor Metaphor from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Non_sequitur_%28logic%29 Non sequitur (logic) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Red_herring Red herring from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Rhetoric Rhetoric from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Rhetorical_device Rhetorical device from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Sarcasm Sarcasm from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Syllogism Syllogism from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Syllogistic_fallacy Syllogistic fallacy from Wikipedia]<br />
<br />
=Forum Posts= <br />
==2020 ...==<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82047 Time to immigrate to another forum?] by [[Rebel]], May 15, 2023<br />
<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82879 The Winboard Forum is down forever] by [[Volker Pittlik]], November 18, 2023<br />
<br />
=References= <br />
<references /><br />
'''[[Main Page|Up one Level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=Computer_Chess_Forums&diff=26907Computer Chess Forums2024-02-22T08:19:23Z<p>Smatovic: /* Rating Lists */ updated CCRL</p>
<hr />
<div>'''[[Main Page|Home]] * Computer Chess Forums'''<br />
<br />
[[FILE:Forum romanum 6k (5760x2097).jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Roman_Forum The Forum Romanum] in [https://en.wikipedia.org/wiki/Rome Rome] <ref>HDR panoramic view out of 9 pictures (3 exposures at 3 different angles). [https://commons.wikimedia.org/wiki/File:Forum_romanum_6k_(5760x2097).jpg Picture] taken from the [https://en.wikipedia.org/wiki/Capitoline_Museums Capitoline Museums] by [https://commons.wikimedia.org/wiki/User:BeBo86 BeBo86], June 02, 2012, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
<br />
=Chess Programming Forums= <br />
Some of the more popular computer chess programming related [https://en.wikipedia.org/wiki/Internet_forum forums]:<br />
* [https://banksiagui.com/forums/ BanksiaGUI Forum]<br />
* [[CCC]] Computer Chess Club<br />
* [http://www.open-chess.org/ OpenChess - Independent Computer Chess Discussion Forum]<br />
* [http://www.open-aurec.com/wbforum/ Winboard Forum]<br />
: [http://www.open-aurec.com/wbforum/viewforum.php?f=4 Winboard Forum - Programming and Technical Discussions] (Winboard Programming Forum)<br />
: [http://www.open-aurec.com/wbforum/viewforum.php?f=18 Winboard Forum - Archive (Old Parsimony Forum)]<br />
* [https://www.reddit.com/r/chessprogramming/ Reddit chess programming]<br />
* [https://www.reddit.com/r/ComputerChess/ Reddit computer chess in general]<br />
<br />
=Discord Channels= <br />
Computer chess is actively discussed in several Discord channels:<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82700 Discord Channels] by [[Srdja Matovic]] on [[CCC]], October 11, 2023<br />
<br />
=Engine Forums= <br />
Some chess programs also have forums:<br />
* [https://groups.google.com/forum/#!forum/fishcooking FishCooking - Google Groups] » [[Stockfish]]<br />
* [https://groups.google.com/g/gnu.chess gnu.chess] » [[GNU Chess]]<br />
* [http://www.hiarcs.net/forums/ HIARCS Forum] » [[HIARCS]]<br />
* [https://groups.google.com/forum/#!forum/lczero LCZero – Google Groups] » [[Leela Chess Zero]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/forum_show.pl Rybka Forum] » [[Rybka]] (Closed since September 01, 2021)<br />
* [https://prodeo.actieforum.com/ ProDeo Forum] hosted and moderated by [[Ed Schroder]] » [[ProDeo]]<br />
<br />
=Chess Computers= <br />
* [https://www.schachcomputer.info/forum/ Schachcomputer.info Community] (mostly German)<br />
: [https://www.schachcomputer.info/forum/forumdisplay.php?f=52 Oldie & Retro Schachprogramme]<br />
<br />
=Endgame Tablebases= <br />
* [http://kirill-kryukov.com/chess/discussion-board/viewforum.php?f=6 CCRL Discussion Board - Endgame Tablebases]<br />
<br />
=Rating Lists= <br />
* [https://biokirr.com/Chess/CCRL-Forum/ CCRL] Discussion Board<br />
* [[WBEC|WBEC-Ridderkerk]] hosted by [[Leo Dijksman]]<br />
** [http://wbec-ridderkerk.forumotion.com/ WBEC-Ridderkerk forum]<br />
<br />
=News groups= <br />
Unmoderated [https://en.wikipedia.org/wiki/Google_Groups newsgroups] contain a lot of [https://en.wikipedia.org/wiki/Newsgroup_spam spam], but also a lot of valuable posts from the pre-CCC era in the archives:<br />
* [http://groups.google.com/group/rec.games.chess.computer/topics rec.games.chess.computer] (r.g.c.c, rgcc)<br />
* [http://groups.google.com/group/rec.games.chess.misc/topics rec.games.chess.misc]<br />
* [http://groups.google.com/group/rec.games.chess/topics rec.games.chess archive]<br />
<br />
=Game & AI Forums= <br />
* [https://www.game-ai-forum.org/index.php Game-AI Forum]<br />
* [https://groups.google.com/forum/#!forum/comp.sources.games comp.sources.games]<br />
* [https://groups.google.com/group/rec.games.abstract/topics rec.games.abstract]<br />
* [https://groups.google.com/group/rec.games.board rec.games.board]<br />
* [https://groups.google.com/forum/#!forum/rec.games.chinese-chess rec.games.chinese-chess] » [[Chinese Chess]]<br />
* [https://groups.google.com/forum/#!forum/rec.games.programmer rec.games.programmer]<br />
* [https://groups.google.com/forum/#!forum/shogi-l SHOGI-L]<br />
<br />
=National Forums= <br />
* [http://forum.computerschach.de/ CSS-Forum] (German)<br />
* [http://www.g-sei.org/?post_type=forum Forum « G 6] (Italian)<br />
* [http://kasparovchess.crestbook.com/forums/13/ KasparovChess] (Russian)<br />
* [https://forchess.ru/forumdisplay.php?f=24 Forchess] (Russian)<br />
* [http://www.foro.meca-web.es/index.php Meca Foro] (Spanish)<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Forum Forum from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Internet_forum Internet Forum from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Computer-mediated_communication Computer-mediated communication from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Electronic_mailing_list Electronic mailing list from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Social_networking_service Social networking service from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Usenet_newsgroup Usenet newsgroup from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Google_Groups Google Groups from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PhpBB phpBB from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tapatalk Tapatalk from Wikipedia]<br />
==Social and Ethic Aspects== <br />
* [https://en.wikipedia.org/wiki/Anonymity Anonymity from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Cyberculture Cyberculture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Ethics Ethics from Wikipedia]<br />
* [http://www.jeanweber.com/newsite/?page_id=22 Ethics in scientific and technical communication] by [http://www.oreillynet.com/pub/au/1899 Jean Hollis Weber], [http://www.wisenet-australia.org/WISENET Journal] 38, July 1995, pp. 2-4<br />
* [https://en.wikipedia.org/wiki/Flaming_%28Internet%29 Flaming (Internet) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Forum_spam Forum spam from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Godwin%27s_law Godwin's law from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Newsgroup_spam Newsgroup spam from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Socialization Socialization from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Troll_%28Internet%29 Troll (Internet) from Wikipedia]<br />
<br />
==Language and Rhetoric Aspects== <br />
* [https://en.wikipedia.org/wiki/Ad_hominem Ad hominem from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Fallacy Fallacy from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Internet_slang Internet slang from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Irony Irony from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_fallacies List of fallacies from Wikipedia]<br />
* [http://translationjournal.net/journal/38fallacies.htm Logical Fallacies and Ethics in Everyday Language] by [http://translationjournal.net/journal/38fallacies.htm Elena Sgarbossa]<br />
* [https://en.wikipedia.org/wiki/Metaphor Metaphor from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Non_sequitur_%28logic%29 Non sequitur (logic) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Red_herring Red herring from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Rhetoric Rhetoric from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Rhetorical_device Rhetorical device from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Sarcasm Sarcasm from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Syllogism Syllogism from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Syllogistic_fallacy Syllogistic fallacy from Wikipedia]<br />
<br />
=Forum Posts= <br />
==2020 ...==<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82047 Time to immigrate to another forum?] by [[Rebel]], May 15, 2023<br />
<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82879 The Winboard Forum is down forever] by [[Volker Pittlik]], November 18, 2023<br />
<br />
=References= <br />
<references /><br />
'''[[Main Page|Up one Level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=Main_Page&diff=26906Main Page2024-02-21T05:27:14Z<p>Smatovic: /* Up-To-Date Best Practices */</p>
<hr />
<div>The '''Chess Programming Wiki''' is a repository of information about [[Programming|programming]] computers to play [[Chess|chess]]. Our goal is to provide a reference for every aspect of chess programming, with information about [[:Category:Programmer|programmers]], [[:Category:Researcher|researchers]] and [[Engines|engines]]. You'll find different ways to implement [[Late Move Reductions|LMR]] and [[Bitboards|bitboard]] stuff like [[Best Magics so far|best magics]] for the most dense [[Magic Bitboards|magic bitboard]] tables. For didactic purposes, the [[CPW-Engine]] has been developed by some wiki members. You can start browsing using the left-hand navigation bar. All of our content is arranged hierarchically, so you can see every page by following just those links. If you are looking for a specific page or catchword, you can use the search box on top.<br />
<br />
CPW was founded by [[Mark Lefler]] on October 26, 2007 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17344&start=4 Re: community test result web page?] by [[Mark Lefler]], [[CCC]], October 26, 2007</ref>, first hosted on [https://en.wikipedia.org/wiki/Wikispaces Wikispaces] <ref>[http://web.archive.org/web/20180216204915/http://chessprogramming.wikispaces.com/ Wikispaces Chessprogramming - home] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine], February 16, 2018)</ref>. Due to that site's closure <ref>[http://www.talkchess.com/forum/viewtopic.php?t=66573 Chess Programming Wiki] by [[Jon Dart]], [[CCC]], February 12, 2018</ref>, it moved to its present host at '''www.chessprogramming.org'''.<br />
<br />
=Up-To-Date Best Practices=<br />
Over time, the computer chess community has moved on from Usenet to bulletin boards and, more recently, to chat groups such as Discord channels. Computer chess programming is actively discussed in several Discord channels:<br />
<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82700 Discord Channels] by [[Srdja Matovic]] on [[CCC]], October 11, 2023<br />
* [https://eta-chess.app26.de/post/discord-channels/ Discord Channels] by [[Srdja Matovic]] on Eta Chess blog, February 21, 2024<br />
<br />
=Hot Topics=<br />
Frequently searched and discussed topics:<br />
* [[Stockfish]]<br />
* [[NNUE]]<br />
* [[Leela Chess Zero]]<br />
* [[Syzygy Bases]]<br />
<br />
=Basics=<br />
* [[Getting Started]] - if you are new to chess programming<br />
* [[Board Representation]]<br />
* [[Search]]<br />
* [[Evaluation]]<br />
* [[Opening Book]]<br />
* [[Endgame Tablebases]]<br />
<br />
=Principal Topics=<br />
* [[Chess]] <br />
* [[Programming]]<br />
* [[Artificial Intelligence]]<br />
* [[Knowledge]]<br />
* [[Learning]]<br />
* [[Engine Testing|Testing]]<br />
* [[Automated Tuning|Tuning]]<br />
* [[User Interface]]<br />
* [[Protocols]]<br />
<br />
=Lists=<br />
* [[Cartoons]]<br />
* [[Computer Chess Forums]]<br />
* [[Conferences]]<br />
* [[Dictionary]]<br />
* [[Engines]] including the [[CPW-Engine]]<br />
** [[Dedicated Chess Computers]]<br />
** [[Engine Releases]]<br />
* [[Games]], some other [[Artificial Intelligence|AI]]-Games, where computer chess may borrow some ideas <br />
* [[Hardware]]<br />
* [[History]]<br />
* [[Organizations]]<br />
* [[People]]<br />
* [[Periodical]]<br />
* [[Software]]<br />
* [[Tournaments and Matches]]<br />
<br />
=Miscellaneous=<br />
* [[Acknowledgments]]<br />
* [[Recommended Reading]]<br />
<br />
=Statistics=<br />
* Articles: {{NUMBEROFARTICLES}}<br />
* Pages: {{NUMBEROFPAGES}}<br />
* Files: {{NUMBEROFFILES}}<br />
<br />
=Thanks=<br />
Thanks for visiting our site!<br />
We hope you like the work we have done.<br />
<br />
[[Mark Lefler]] and the rest of the CPW team.<br />
<br />
=References=<br />
<references /><br />
[[Category:Root]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=Main_Page&diff=26905Main Page2024-02-21T05:26:44Z<p>Smatovic: /* Up-To-Date Best Practices */</p>
<hr />
<div>The '''Chess Programming Wiki''' is a repository of information about [[Programming|programming]] computers to play [[Chess|chess]]. Our goal is to provide a reference for every aspect of chess programming, with information about [[:Category:Programmer|programmers]], [[:Category:Researcher|researchers]] and [[Engines|engines]]. You'll find different ways to implement [[Late Move Reductions|LMR]] and [[Bitboards|bitboard]] stuff like [[Best Magics so far|best magics]] for the most dense [[Magic Bitboards|magic bitboard]] tables. For didactic purposes, the [[CPW-Engine]] has been developed by some wiki members. You can start browsing using the left-hand navigation bar. All of our content is arranged hierarchically, so you can see every page by following just those links. If you are looking for a specific page or catchword, you can use the search box on top.<br />
<br />
CPW was founded by [[Mark Lefler]] on October 26, 2007 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17344&start=4 Re: community test result web page?] by [[Mark Lefler]], [[CCC]], October 26, 2007</ref>, first hosted on [https://en.wikipedia.org/wiki/Wikispaces Wikispaces] <ref>[http://web.archive.org/web/20180216204915/http://chessprogramming.wikispaces.com/ Wikispaces Chessprogramming - home] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine], February 16, 2018)</ref>. Due to that site's closure <ref>[http://www.talkchess.com/forum/viewtopic.php?t=66573 Chess Programming Wiki] by [[Jon Dart]], [[CCC]], February 12, 2018</ref>, it moved to its present host at '''www.chessprogramming.org'''.<br />
<br />
=Up-To-Date Best Practices=<br />
Over time, the computer chess community has moved on from Usenet to bulletin boards and, more recently, to chat groups such as Discord channels. Computer chess programming is actively discussed in several Discord channels:<br />
<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82700 Discord Channels] by [[Srdja Matovic]] on [[CCC]], October 11, 2023<br />
* [https://eta-chess.app26.de/post/discord-channels/ Discord Channels] by [[Srdja Matovic]] on Eta Chess Blog, February 21, 2024<br />
<br />
=Hot Topics=<br />
Frequently searched and discussed topics:<br />
* [[Stockfish]]<br />
* [[NNUE]]<br />
* [[Leela Chess Zero]]<br />
* [[Syzygy Bases]]<br />
<br />
=Basics=<br />
* [[Getting Started]] - if you are new to chess programming<br />
* [[Board Representation]]<br />
* [[Search]]<br />
* [[Evaluation]]<br />
* [[Opening Book]]<br />
* [[Endgame Tablebases]]<br />
<br />
=Principal Topics=<br />
* [[Chess]] <br />
* [[Programming]]<br />
* [[Artificial Intelligence]]<br />
* [[Knowledge]]<br />
* [[Learning]]<br />
* [[Engine Testing|Testing]]<br />
* [[Automated Tuning|Tuning]]<br />
* [[User Interface]]<br />
* [[Protocols]]<br />
<br />
=Lists=<br />
* [[Cartoons]]<br />
* [[Computer Chess Forums]]<br />
* [[Conferences]]<br />
* [[Dictionary]]<br />
* [[Engines]] including the [[CPW-Engine]]<br />
** [[Dedicated Chess Computers]]<br />
** [[Engine Releases]]<br />
* [[Games]], some other [[Artificial Intelligence|AI]]-Games, where computer chess may borrow some ideas <br />
* [[Hardware]]<br />
* [[History]]<br />
* [[Organizations]]<br />
* [[People]]<br />
* [[Periodical]]<br />
* [[Software]]<br />
* [[Tournaments and Matches]]<br />
<br />
=Miscellaneous=<br />
* [[Acknowledgments]]<br />
* [[Recommended Reading]]<br />
<br />
=Statistics=<br />
* Articles: {{NUMBEROFARTICLES}}<br />
* Pages: {{NUMBEROFPAGES}}<br />
* Files: {{NUMBEROFFILES}}<br />
<br />
=Thanks=<br />
Thanks for visiting our site!<br />
We hope you like the work we have done.<br />
<br />
[[Mark Lefler]] and the rest of the CPW team.<br />
<br />
=References=<br />
<references /><br />
[[Category:Root]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26904GPU2024-01-24T08:34:03Z<p>Smatovic: /* 2020 ... */ link to Chinese gpus</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may have more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel programming model. [[Leela Chess Zero]] has proven that a [[Best-First|best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) combined with [[Deep Learning|deep learning]] works well on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s, RAM was expensive and home computers used custom graphics chips that operated directly on registers/memory without a dedicated frame buffer or texture buffer, like [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's combined 2D/3D chips like [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are four main ways to use a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch Stockfish NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training]<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]] specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group] is widely adopted across all kind of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP, C++ AMP and with OpenMP offload directives. It offers with [https://rocmdocs.amd.com/en/latest/ ROCm] its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal] is recommended by [[Apple]].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures and the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit, to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, and multiple SIMD units are coupled to a compute unit, with up to hundreds of compute units present on a discrete GPU. The actual SIMD units may have architecture-dependent numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities: floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its concrete classification as a [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC (matrix-multiply-accumulate) units are used to speed up neural networks further.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront (AMD) resp. Warp (Nvidia) size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 8<br />
* Maximum number of resident warps per multiprocessor: 48<br />
* Maximum number of resident threads per multiprocessor: 1536<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOPS]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on the architecture and the operation, 32-bit integer throughput can be lower than 32-bit floating-point or 24-bit integer throughput.<br />
<br />
* INT64<br />
: In general, [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, e.g. quadrupled INT8 or octupled INT4 throughput.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused-multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general have a lower double-precision (64-bit) floating-point throughput ratio (FP64:FP32) than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretic ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretic ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series, TensorCores were introduced: FP16xFP16+FP32 matrix-multiply-accumulate units used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020, AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, a.k.a. kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs. The former is the most important by quantity, the latter by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads, each brand offering different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumers, Radeon Pro for professionals and Radeon Instinct for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December, 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both with a multi-chip module design. It features Matrix Cores with support for a broad range of precisions, such as INT8, FP8, BF16, FP16, TF32, FP32 and FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators. <br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set Architecture]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set Architecture]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf GCN3/4 Instruction Set Architecture]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Island cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support is offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumers, Quadro for professionals and Tesla for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August, 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing] features, and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archiv.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since Series5 SGX, OpenCL support is available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. Since the Adreno 300 series, OpenCL support is offered.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=7&t=72566&p=955538#p955538 Re: China boosts in silicon...] by [[Srdja Matovic]], [[CCC]], January 13, 2024<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26903GPU2024-01-24T08:15:45Z<p>Smatovic: /* GPU in Computer Chess */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with a [[Deep Learning|deep learning]] methodology works well on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s, RAM was expensive and home computers used custom graphics chips that operated directly on registers/memory without a dedicated frame or texture buffer, like the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, the [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards designed specifically to accelerate 3D math emerged, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] chip used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like the [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are four main ways to use a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain depth on the CPU and offload the sub-trees to the GPU<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch Stockfish NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training]<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX]. These were followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally by [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]], specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group], is widely adopted across all kinds of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as offload directives via OpenMP. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures, and offers the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (Khronos Group's successor to OpenGL)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled to a compute unit, and up to hundreds of compute units are present on a discrete GPU. The actual SIMD units may have different numbers of cores depending on the architecture (SIMD8, SIMD16, SIMD32), and different computation abilities - floating-point and/or integer with specific bit-widths of the FPU/ALU and registers. Note the difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Different architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its concrete classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN)]<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront resp. Warp size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) execute the kernel to be computed and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across the work-items of a work-group resp. the threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi)] <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, the 32-bit integer performance can be lower than the 32-bit FLOP or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput at lower precision, e.g. quadrupled INT8 or octupled INT4 throughput compared to INT32.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused-multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general offer a lower double-precision (64-bit) floating-point throughput relative to FP32 than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretic ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretic ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: TensorCores were introduced with the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series. They offer FP16xFP16+FP32 matrix-multiplication-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Wikipedia - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020, AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, a.k.a. kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumers, Radeon Pro for professionals and Radeon Instinct for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both in a multi-chip module design. It features Matrix Cores with support for a broad range of precisions, such as INT8, FP8, BF16, FP16, TF32, FP32 and FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
The RDNA3 architecture in the Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
The CDNA2 architecture in the MI200 HPC-GPU, with optimized FP64 throughput (matrix and vector), a multi-chip-module design and Infinity Fabric, was unveiled in November 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
The CDNA architecture in the MI100 HPC-GPU with Matrix Cores was unveiled in November 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set Architecture]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set Architecture]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf GCN3/4 Instruction Set Architecture]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Islands cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
ARM Mali GPU variants can be found in various systems on a chip (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support is offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumers, Quadro for professionals and Tesla for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since the Series5 SGX, OpenCL support is available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs of various types as a component of its Snapdragon SoCs. Since the Adreno 300 series, OpenCL support is offered.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>
Smatovic
https://www.chessprogramming.org/index.php?title=Talk:GPU&diff=26902 Talk:GPU (2024-01-14T16:22:15Z) <p>Smatovic: /* AMD architectures */</p>
<hr />
<div>== AMD architectures ==<br />
<br />
My own conclusions are:<br />
<br />
* TeraScale has a VLIW design.<br />
* GCN has a 16 wide SIMD, executing a Wavefront of 64 threads over 4 cycles.<br />
* RDNA has a 32 wide SIMD, executing a Wavefront:32 over 1 cycle and a Wavefront:64 over 2 cycles.<br />
* CDNA is an advanced GCN.<br />
<br />
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 17:22, 14 January 2024 (CET)<br />
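The cycle counts above all follow from one division: wavefront size over SIMD width. A minimal sketch of that arithmetic (an illustration of the observations on this page, not vendor-confirmed specifications; the function name is my own):<br />

```python
def issue_cycles(wavefront_size: int, simd_width: int) -> int:
    """Cycles to execute one wavefront/warp on a single SIMD unit,
    assuming one SIMD-width slice of threads is issued per cycle."""
    assert wavefront_size % simd_width == 0
    return wavefront_size // simd_width

# GCN: Wavefront of 64 threads on a 16 wide SIMD -> 4 cycles
print(issue_cycles(64, 16))
# RDNA: Wavefront:32 -> 1 cycle, Wavefront:64 -> 2 cycles on a 32 wide SIMD
print(issue_cycles(32, 32))
print(issue_cycles(64, 32))
```

The same formula reproduces the Nvidia conclusions below, e.g. a Tesla warp of 32 threads on an 8 wide SIMD takes 4 cycles.<br />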
<br />
== Nvidia architectures ==<br />
<br />
Afaik Nvidia never officially mentioned SIMD as a hardware architecture in their papers; with Tesla they only referred to it as SIMT.<br />
<br />
Nevertheless, my own conclusions are:<br />
<br />
* Tesla has 8 wide SIMD, executing a Warp of 32 threads over 4 cycles.<br />
<br />
* Fermi has 16 wide SIMD, executing a Warp of 32 threads over 2 cycles.<br />
<br />
* Kepler is somewhat odd; I am not sure how its compute units are partitioned.<br />
<br />
* Maxwell and Pascal have 32 wide SIMD, executing a Warp of 32 threads over 1 cycle.<br />
<br />
* Volta and Turing seem to have 16 wide FPU SIMDs, but my own experiments show a 32 wide VALU.<br />
<br />
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 10:17, 22 April 2021 (CEST)<br />
<br />
== SIMD + Scalar Unit ==<br />
<br />
It seems that on GPU architectures every SIMD unit has one scalar unit, which executes control flow (branches, loops) or special functions the SIMD ALUs are not capable of.<br />
<br />
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 15:18, 4 January 2023 (CET)<br />
<br />
== embedded CPU controller ==<br />
<br />
It is not documented in the whitepapers, but it seems that every discrete GPU has an embedded CPU controller (e.g. Nvidia Falcon) which (speculation) launches the kernels.<br />
<br />
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 10:36, 22 April 2021 (CEST)<br />
<br />
== GPUs and Duncan's taxonomy ==<br />
It is not clear to me how different vendors realize the underlying hardware of the GPU SIMD units in unified shader architectures: there is the concept of bit-sliced ALUs, the concept of pipelined vector processors, and the concept of SIMD units with fixed bit-width ALUs. The white papers from the different vendors leave room for speculation, as do the different instruction throughputs for higher and lower precision; what is left to the programmer is to do microbenchmarking and draw conclusions on their own.<br />
<br />
https://en.wikipedia.org/wiki/Duncan%27s_taxonomy<br />
<br />
https://en.wikipedia.org/wiki/Flynn%27s_taxonomy<br />
<br />
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 13:58, 16 December 2021 (CET)<br />
<br />
== CPW GPU article ==<br />
<br />
A suggestion of mine: keep this GPU article as a generalized overview of GPUs, with incremental updates for different frameworks and architectures. GPUs and GPGPU are a moving target, with different platforms offering new feature sets; better to open separate articles for things like GPGPU, SIMT, CUDA, ROCm, oneAPI or Metal, or simply link to Wikipedia, which contains the newest specs and infos.<br />
<br />
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 21:29, 27 April 2021 (CEST)<br />
<br />
== GPGPU architectures ==<br />
Regarding GPGPU architectures or frameworks, links to the architecture white paper, instruction set architecture, and programming guide, plus a link to Wikipedia with a list of the concrete models and their specs, would be nice, if available.<br />
<br />
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 09:21, 25 October 2021 (CEST)<br />
<br />
== Legacy GPGPU ==<br />
<br />
This article does not cover legacy, pre-2007 GPGPU methods, that is, how to use pixel, vertex, geometry, tessellation and compute shaders via OpenGL or DirectX for GPGPU. I can imagine it is possible to backport a neural network Lc0 backend to a certain DirectX/OpenGL API, but I doubt it has real contemporary relevance (running Lc0 on an SGI Indy or alike).<br />
<br />
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 14:09, 14 November 2022 (CET)<br />
<br />
== Alternative Architectures ==<br />
<br />
There was for example the IBM PowerXCell 8i, used in the IBM Roadrunner super-computer from 2008, the first heterogeneous petaFLOP system; a smaller version ran in the PlayStation 3:<br />
<br />
https://en.wikipedia.org/wiki/Cell_%28processor%29#PowerXCell_8i<br />
<br />
There was the Intel Larrabee project from 2010, many simple x64 cores each with an AVX-512 vector unit, later released as the Xeon Phi accelerator:<br />
<br />
https://en.wikipedia.org/wiki/Larrabee_%28microarchitecture%29<br />
<br />
https://en.wikipedia.org/wiki/Xeon_Phi<br />
<br />
There is still the NEC SX-Aurora (>=2017), a vector processor on a PCIe card, a descendant of the NEC SX super-computer series as used e.g. in the Earth Simulator super-computer:<br />
<br />
https://en.wikipedia.org/wiki/NEC_SX-Aurora_TSUBASA<br />
<br />
There is the Chinese Matrix 2000/3000 many-core accelerator (>=2017), used in the Tianhe super-computer:<br />
<br />
https://en.wikichip.org/wiki/nudt/matrix-2000<br />
<br />
AFAIK, none of the above was used to play computer chess... on the other hand:<br />
<br />
IBM Deep Blue used ASICs:<br />
https://www.chessprogramming.org/Deep_Blue<br />
<br />
Hydra used FPGAs:<br />
https://www.chessprogramming.org/Hydra<br />
<br />
AlphaZero used TPUs:<br />
https://www.chessprogramming.org/AlphaZero<br />
<br />
<br />
[[User:Smatovic|Smatovic]] ([[User talk:Smatovic|talk]]) 08:04, 22 September 2023 (CEST)</div>Smatovichttps://www.chessprogramming.org/index.php?title=NNUE&diff=26901NNUE2024-01-14T14:34:01Z<p>Smatovic: /* 2023 ... */ date typo</p>
<hr />
<div>'''[[Main Page|Home]] * [[Learning]] * [[Neural Networks]] * NNUE'''<br />
<br />
[[FILE:SekienNue.jpg|border|right|thumb|250px| [[:Category:Toriyama Sekien|Toriyama Sekien]] - Nue (鵺) <ref>[https://en.wikipedia.org/wiki/Nue Nue] (鵺) from the [https://en.wikipedia.org/wiki/Konjaku_Gazu_Zoku_Hyakki Konjaku Gazu Zoku Hyakki] (今昔画図続百鬼) by [[:Category:Toriyama Sekien|Toriyama Sekien]], circa 1779, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74611&start=2 Re: What does NNUE actually mean] by [[ Henk Drost]], [[CCC]], July 29, 2020</ref> ]] <br />
<br />
'''NNUE''', (&#398;U&#1048;&#1048; Efficiently Updatable Neural Networks)<br/> <br />
a Neural Network architecture intended to replace the [[Evaluation|evaluation]] of [[Shogi]], [[Chess|chess]] and other board game playing [[Alpha-Beta|alpha-beta]] searchers running on a CPU. Inspired by [[Kunihito Hoki|Kunihito Hoki's]] approach of [[Piece-Square Tables|piece-square tables]] indexed by king location, and further two-piece locations and side to move as applied in his Shogi engine [[Bonanza]] <ref>[http://yaneuraou.yaneu.com/2020/05/03/%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E3%82%A8%E3%83%B3%E3%82%B8%E3%83%8B%E3%82%A2%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E5%B0%86%E6%A3%8Bai%E9%96%8B%E7%99%BA%E5%85%A5%E9%96%80%E3%81%9D%E3%81%AE1/ 機械学習エンジニアのための将棋AI開発入門その1 Introduction to Shogi AI development for machine learning engineers Part 1], May 03, 2020 (Japanese)</ref>, '''NNUE''' was introduced in 2018 by [[Yu Nasu]] <ref>[[Yu Nasu]] ('''2018'''). ''&#398;U&#1048;&#1048; Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi''. Ziosoft Computer Shogi Club, [https://github.com/ynasu87/nnue/blob/master/docs/nnue.pdf pdf] (Japanese with English abstract)</ref>, and was used in Shogi adaptations of [[Stockfish]] such as [[YaneuraOu]] <ref>[https://github.com/yaneurao/YaneuraOu GitHub - yaneurao/YaneuraOu: YaneuraOu is the World's Strongest Shogi engine(AI player), WCSC29 1st winner, educational and USI compliant engine]</ref>,<br />
and [[Kristallweizen]] <ref>[https://github.com/Tama4649/Kristallweizen/ GitHub - Tama4649/Kristallweizen: 第29回世界コンピュータ将棋選手権 準優勝のKristallweizenです。]</ref>, apparently with [[AlphaZero]] strength <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72754 The Stockfish of shogi] by [[Larry Kaufman]], [[CCC]], January 07, 2020</ref>. <br />
<br />
=[[Stockfish NNUE]]=<br />
As reported by [[Henk Drost]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74059 Stockfish NN release (NNUE)] by [[Henk Drost]], [[CCC]], May 31, 2020</ref>, <br />
[[Hisayori Noda|Nodchip]] incorporated NNUE into the chess-playing [[Stockfish]] 10 as a proof of concept.<br />
[[Stockfish NNUE]] was born, and in summer 2020 the computer chess community burst out enthusiastically over its rapidly rising [[Playing Strength|playing strength]] with different networks trained using a mixture of [[Supervised Learning|supervised]] and [[Reinforcement Learning|reinforcement learning]] methods.<br />
Despite the approximately halved search speed, it became stronger than its original <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74484 Can the sardine! NNUE clobbers SF] by [[Henk Drost]], [[CCC]], July 16, 2020</ref>, and was ultimately responsible for the huge [[Playing Strength|strength]] improvement of '''Stockfish 12'''.<br />
<br />
=NNUE Engines=<br />
''see [[:Category:NNUE]]''<br />
<br />
Tempted by the success of [[Stockfish NNUE]] and attracted by how easy the method is and how small the code, many engine developers have started testing and applying [[NNUE]]. For quick trials and evaluation before going into serious development, some of them borrowed and/or rewrote NNUE code and used networks from Stockfish NNUE. Most of them reported positive results, such as [[David Carteau]] with [[Orion]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74828 Orion 0.7 : NNUE experiment] by [[David Carteau]], [[CCC]], August 19, 2020</ref>, [[Ehsan Rashid]] with [[DON]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72613&start=320#p856640 Re: New engine releases 2020...Don NNUE 2020?] by supersharp77, [[CCC]], August 19, 2020</ref>, various [[Stockfish#Derivatives|Stockfish derivatives]] by [[Michael Byrne]] <ref>[http://talkchess.com/forum3/viewtopic.php?f=2&t=74825 ... the last shall be first ...] by [[Michael Byrne|MikeB]], [[CCC]], 19 Aug 2020</ref>, and [[Volodymyr Shcherbyna]] with [[Igel]] <ref>[http://talkchess.com/forum3/viewtopic.php?f=2&t=67890&start=10#p856742 Introducing Igel chess engine] by [[Volodymyr Shcherbyna]], [[CCC]], 20 Aug 2020</ref> using the ''Night Nurse'' NNUE net by [[Dietrich Kappe]] <ref>[http://talkchess.com/forum3/viewtopic.php?f=2&t=74837 Night Nurse 0.2] by [[Dietrich Kappe]], [[CCC]], August 19, 2020</ref>. [[Daniel Shawul]] added NNUE support à la [[CFish]] to his [[Scorpio#Bitbases|egbbdll]] probing library of [[Scorpio]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75400&start=22 Re: Hacking around CFish NNUE] by [[Daniel Shawul]], [[CCC]], October 15, 2020</ref> <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75415&start=3 Re: How to scale stockfish NNUE score?] by [[Daniel Shawul]], [[CCC]], October 17, 2020</ref>, making it even easier to use NNUE. 
The promising engines [[Halogen]] 7 and 8 by [[Kieren Pearson]], and [[Seer]] by [[Connor McMonigle]] came with their own, distinct NNUE implementations, and on November 10, 2020, the commercial [[Dragon by Komodo Chess]] aka [[Komodo]] NNUE appeared <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75748 Dragon by Komodo Chess] by [[Larry Kaufman]], [[CCC]], November 10, 2020</ref>, trying to close the gap to Stockfish NNUE. The commercial [[Fat Fritz#Fat Fritz 2|Fat Fritz 2.0]], based on a slightly modified Stockfish 12 using a customized, double sized network, was released by [[ChessBase]] in February 2021.<br />
<br />
=NN Structure=<br />
The neural network of Stockfish NNUE consists of four layers, W1 through W4. The input layer W1 is heavily overparametrized, feeding in the [[Board Representation|board representation]] for various king configurations.<br />
The efficiency of the net is due to [[Incremental Updates|incremental update]] of W1 in [[Make Move|make]] and [[Unmake Move|unmake move]],<br />
where only a fraction of its neurons need to be recalculated. The remaining three layers with 32x2x256, 32x32 and 32x1 weights are computationally less expensive, and are best calculated using appropriate [[SIMD and SWAR Techniques|SIMD instructions]] like [[AVX2]] on [[x86-64]] or, if available, [[AVX-512]].<br />
<br />
[[FILE:NNUE.jpg|none|border|text-bottom]] <br />
NNUE structure with [[Incremental Updates|incremental update]] <ref>Image from [[Yu Nasu]] ('''2018'''). ''&#398;U&#1048;&#1048; Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi''. Ziosoft Computer Shogi Club, [https://github.com/ynasu87/nnue/blob/master/docs/nnue.pdf pdf] (Japanese with English abstract)</ref><br />
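The incremental update of the first layer can be sketched in [[C]]. This is a toy sketch with made-up dimensions and illustrative names (<code>refresh</code>, <code>update</code>), not actual Stockfish code; real HalfKP networks use 41024 input features and a 256-wide accumulator per perspective:<br />

```c
#include <stdint.h>
#include <string.h>

#define N_FEATURES 64  /* toy value; HalfKP uses 41024 input features */
#define N_HIDDEN    8  /* toy value; Stockfish NNUE uses 256 per perspective */

/* weights[f][h]: contribution of active input feature f to hidden neuron h */
static int16_t weights[N_FEATURES][N_HIDDEN];
static int16_t bias[N_HIDDEN];

/* Full refresh: re-sum all active features, needed e.g. when the own king
   moves and the whole feature indexing changes. */
void refresh(int16_t acc[N_HIDDEN], const int *active, int n_active)
{
    memcpy(acc, bias, sizeof(int16_t) * N_HIDDEN);
    for (int i = 0; i < n_active; i++)
        for (int h = 0; h < N_HIDDEN; h++)
            acc[h] += weights[active[i]][h];
}

/* Incremental update: a quiet move deactivates one feature (piece on the
   from-square) and activates another (piece on the to-square). */
void update(int16_t acc[N_HIDDEN], int removed, int added)
{
    for (int h = 0; h < N_HIDDEN; h++)
        acc[h] += (int16_t)(weights[added][h] - weights[removed][h]);
}
```

On a quiet move <code>update</code> touches just two weight rows instead of re-summing all active features, which is what makes the network "efficiently updatable" inside [[Make Move|make]] and [[Unmake Move|unmake move]].<br />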
<br />
=See also=<br />
* [[Cerebrum]]<br />
* [[David E. Moriarty#SANE|SANE]]<br />
* [[Stockfish NNUE#HalfKA|Stockfish HalfKAv2]]<br />
<br />
=Publications=<br />
* [[Yu Nasu]] ('''2018'''). ''&#398;U&#1048;&#1048; Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi''. Ziosoft Computer Shogi Club, [https://github.com/ynasu87/nnue/blob/master/docs/nnue.pdf pdf], [https://www.apply.computer-shogi.org/wcsc28/appeal/the_end_of_genesis_T.N.K.evolution_turbo_type_D/nnue.pdf pdf] (Japanese with English abstract) [https://github.com/asdfjkl/nnue GitHub - asdfjkl/nnue translation] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76250 Translation of Yu Nasu's NNUE paper] by [[Dominik Klein]], [[CCC]], January 07, 2021</ref><br />
* [[Dominik Klein]] ('''2021'''). ''[https://github.com/asdfjkl/neural_network_chess Neural Networks For Chess]''. [https://github.com/asdfjkl/neural_network_chess/releases/tag/v1.1 Release Version 1.1 · GitHub] <ref>[https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78283 Book about Neural Networks for Chess] by dkl, [[CCC]], September 29, 2021</ref><br />
<br />
=Forum Posts=<br />
==2020==<br />
===January ...===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72754 The Stockfish of shogi] by [[Larry Kaufman]], [[CCC]], January 07, 2020 » [[Stockfish]], [[Shogi]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72754&start=18 Re: The Stockfish of shogi] by [[Gian-Carlo Pascutto]], [[CCC]], January 18, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74058 Stockfish NNUE] by [[Henk Drost]], [[CCC]], May 31, 2020 » [[Stockfish]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74059 Stockfish NN release (NNUE)] by [[Henk Drost]], [[CCC]], May 31, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74148 NNUE shared library and tools] by [[Adam Treat]], [[CCC]], June 10, 2020<br />
===July===<br />
* [http://talkchess.com/forum3/viewtopic.php?t=74480 Lizard-NNUE Experiment NOT bad with NNUE Net Evaluation...] by Nancy M Pichardo, [[CCC]], July 15, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74484 Can the sardine! NNUE clobbers SF] by [[Henk Drost]], [[CCC]], July 16, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531 NNUE accessible explanation] by [[Martin Fierz]], [[CCC]], July 21, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=1 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], July 23, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=5 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], July 24, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=8 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], August 03, 2020<br />
* [https://groups.google.com/d/msg/fishcooking/Wpk-9COzk64/ez643VTkAAAJ BrainLearn NNUE 1.0] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], July 25, 2020 » [[BrainLearn]]<br />
* [https://groups.google.com/d/msg/fishcooking/yWtpz_FY5_Y/RMTG56fkAAAJ ShashChess NNUE 1.0] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], July 25, 2020 » [[ShashChess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74607 LC0 vs. NNUE - some tech details...] by [[Srdja Matovic]], [[CCC]], July 29, 2020 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74611 What does NNUE actually mean] by Paloma, [[CCC]], July 29, 2020<br />
===August===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74705 What happens with my hyperthreading?] by [[Kai Laskos]], [[CCC]], August 06, 2020 » [[Stockfish NNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=73521&start=59 Re: Minic version 2] by [[Vivien Clauzon]], [[CCC]], August 08, 2020 » [[Minic]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74777 Neural Networks weights type] by [[Fabio Gobbato]], [[CCC]], August 13, 2020 » [[Stockfish NNUE]] <br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67890&start=17 Re: Introducing Igel chess engine - Igel and NNUE] by [[Volodymyr Shcherbyna]], [[CCC]], August 19, 2020 » [[Igel]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74828 Orion 0.7 : NNUE experiment] by [[David Carteau]], [[CCC]], August 19, 2020 » [[Orion]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74837 Night Nurse 0.2] by [[Dietrich Kappe]], [[CCC]], August 19, 2020 » [[A0lite]], [[Igel]]<br />
* [http://laatste.info/bb3/viewtopic.php?f=53&t=8298 NNUE] by [[Bert Tuyt]], [http://laatste.info/bb3/viewforum.php?f=53 World Draughts Forum], August 19, 2020 » [[Draughts]]<br />
===September===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74955 Train a neural network evaluation] by [[Fabio Gobbato]], [[CCC]], September 01, 2020 » [[Automated Tuning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75016 RubiChess NNUE player implemented] by [[Andreas Matthies]], [[CCC]], September 06, 2020 » [[RubiChess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75027 Toga III 0.4 NNUE] by [[Dietrich Kappe]], [[CCC]], September 07, 2020 » [[Toga]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75042 Neural network quantization] by [[Fabio Gobbato]], [[CCC]], September 08, 2020 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75049 AVX-512 and NNUE] by [[Gian-Carlo Pascutto]], [[CCC]], September 08, 2020 » [[AVX-512]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75190 First success with neural nets] by [[Jonathan Kreuzer]], [[CCC]], September 23, 2020 » [[Neural Networks]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75190&start=21 Re: First success with neural nets] by [[Jonathan Kreuzer]], [[CCC]], November 11, 2020 » [[Checkers]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75241 Nemorino 6 (NNUE)] by [[Christian Günther|Florentino]], [[CCC]], September 28, 2020 » [[Nemorino]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75247 A Crossroad in Computer Chess; Or Desperate Flailing for Relevance] by [[Andrew Grant]], [[CCC]], September 29, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75248 NNUE variation] by [[Ed Schroder|Ed Schröder]], [[CCC]], September 29, 2020<br />
===October===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75296 BONA_PIECE_ZERO] by [[Marco Belli|elcabesa]], [[CCC]], October 04, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75335&start=91 Re: Final Release of Ethereal, V12.75] by [[Andrew Grant]], [[CCC]], October 09, 2020 » [[Ethereal]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75345 Request for someone to train an NNUE for Ethereal] by [[Andrew Grant]], [[CCC]], October 09, 2020 <br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75350 Ethereal Tuning - Data Dump] by [[Andrew Grant]], [[CCC]], October 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75358 Dangerous turn] by [[Dann Corbit]], [[CCC]], October 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75393 Black crushing white, weird ?] by [[Vivien Clauzon]], [[CCC]], October 14, 2020 » [[Minic#MinicNNUE|MinicNNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75400 Hacking around CFish NNUE] by [[Maksim Korzh]], [[CCC]], October 15, 2020 » [[CFish]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75400&start=22 Re: Hacking around CFish NNUE] by [[Daniel Shawul]], [[CCC]], October 15, 2020 » [[Scorpio]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75415 How to scale stockfish NNUE score?] by [[Maksim Korzh]], [[CCC]], October 17, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75415&start=3 Re: How to scale stockfish NNUE score?] by [[Daniel Shawul]], [[CCC]], October 17, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75418 Embedding Stockfish NNUE to ANY CHESS ENGINE: YouTube series] by [[Maksim Korzh]], [[CCC]], October 17, 2020 » [[BBC]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75433 Seer] by [[Gerd Isenberg]], [[CCC]], October 18, 2020 » [[Seer]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75482 BBC 1.3 + Stockfish NNUE released!] by [[Maksim Korzh]], [[CCC]], October 21, 2020 » [[BBC]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75500 Mayhem NNUE - New NN engine] by [[Toni Helminen|JohnWoe]], [[CCC]], October 22, 2020 » [[Mayhem]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75501 Centipawns vs Millipawns with NNUE] by Madeleine Birchfield, [[CCC]], October 23, 2020 » [[Centipawns]], [[Millipawns]]<br />
* <span id="KingPlacements"></span>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75506 NNUE Question - King Placements] by [[Andrew Grant]], [[CCC]], October 23, 2020 » [[Stockfish NNUE#NNUE Structure|Stockfish NNUE - NNUE Structure]]<br />
: [[#KingPlacementsCont|July 01, 2021 continuation]]<br />
===November===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75651 Komodo 14.1 Release and Dragon Announcement] by [[Larry Kaufman]], [[CCC]], November 02, 2020 » [[Komodo]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75653 NNUE outer product vs tensor product] by Madeleine Birchfield, [[CCC]], November 02, 2020 <ref>[https://en.wikipedia.org/wiki/Outer_product Outer product from Wikipedia]</ref> <ref>[https://en.wikipedia.org/wiki/Tensor_product Tensor product from Wikipedia]</ref><br />
* <span id="Training"></span>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020 <ref>[https://en.wikipedia.org/wiki/PyTorch PyTorch from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75725 TucaNNo: neural network research] by [[Alcides Schulz]], [[CCC]], November 08, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75748 Dragon by Komodo Chess] by [[Larry Kaufman]], [[CCC]], November 10, 2020 » [[Dragon by Komodo Chess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75751 Tensorflow NNUE training] by [[Daniel Shawul]], [[CCC]], November 10, 2020 <ref>[https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75890 Speculations about NNUE development (was New engine releases 2020)] by Madeleine Birchfield, [[CCC]], November 11, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75890&start=6 Re: Speculations about NNUE development (was New engine releases 2020)] by [[Connor McMonigle]], [[CCC]], November 12, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75890&start=9 Re: Speculations about NNUE development (was New engine releases 2020)] by [[Connor McMonigle]], [[CCC]], November 12, 2020 » [[Dragon by Komodo Chess]], [[Seer]], [[Halogen]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75335&start=134 Re: Final Release of Ethereal, V12.75] by [[Andrew Grant]], [[CCC]], November 12, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75797 Maybe not the best diversity of strongest chess engines under development] by [[Kai Laskos]], [[CCC]], November 14, 2020 » [[Engine Similarity]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75862 CPU Vector Unit, the new jam for NNs...] by [[Srdja Matovic]], [[CCC]], November 18, 2020 » [[SIMD and SWAR Techniques|SIMD]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75870 You've trained a brilliant NN(UE) King-Piece Network. Now what?] by [[Andrew Grant]], [[CCC]], November 19, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75925 Pawn King Neural Network] by [[Tamás Kuzmics]], [[CCC]], November 26, 2020<br />
===December===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75953 Orion 0.8 + The Cerebrum release] by [[David Carteau]], [[CCC]], December 01, 2020 » [[Orion]], [[Cerebrum]]<br />
* [https://prodeo.actieforum.com/t104-the-nnue-split-programmers-are-in The NNUE split programmers are in] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|ProDeo Forum]], December 02, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76006 Introducing the "Cerebrum" library (NNUE-like trainer and inference code)] by [[David Carteau]], [[CCC]], December 07, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76190 Dispelling the Myth of NNUE with LazySMP: An Analysis] by [[Andrew Grant]], [[CCC]], December 30, 2020 » [[Lazy SMP]]<br />
==2021==<br />
===January===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76250 Translation of Yu Nasu's NNUE paper] by [[Dominik Klein]], [[CCC]], January 07, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724&start=60 Re: Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], January 09, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76263 More experiments with neural nets] by [[Jonathan Kreuzer]], [[CCC]], January 09, 2021 » [[Slow Chess]]<br />
* [https://groups.google.com/g/fishcooking/c/cad1MGSdpU4/m/Ury4iBqSBgAJ Shouldn't positional attributes drive SF's NNUE input features (rather than king position)?] by [[Nick Pelling]], [[Computer Chess Forums|FishCooking]], January 10, 2021 » [[Stockfish NNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76285 HalfKP Structure in NNUE] by Roger Stephenson, [[CCC]], January 12, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76346 Andscacs nnue 0.1] by [[Daniel José Queraltó]], [[CCC]], January 17, 2021 » [[Andscacs]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76353 It's NNUE era (sharing my thoughts)] by Basti Dangca, [[CCC]], January 18, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76356 NNUE and game phase] by [[Dann Corbit]], [[CCC]], January 18, 2021 » [[Game Phases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76382 correspondence chess in the age of NNUE] by [[Larry Kaufman]], [[CCC]], January 21, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76386 One for Andrew Grant et al. - NNUE?] by [[Srdja Matovic]], [[CCC]], January 21, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76437 256 in NNUE?] by Ted Wong, [[CCC]], January 28, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76446 So what do we miss in the traditional evaluation?] by [[Ferdinand Mosca]], [[CCC]], January 29, 2021 » [[Evaluation]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76447 Latest Night Nurse released] by [[Dietrich Kappe]], [[CCC]], January 29, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76456 None-GPL NNUE probing code] by [[Daniel Shawul]], [[CCC]], January 31, 2021<br />
===February===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76537 Fat Fritz 2] by [[Jouni Uski]], [[CCC]], February 09, 2021 » [[Fat Fritz#Fat Fritz 2|Fat Fritz 2.0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76552 How much work is it to train an NNUE?] by [[Gabor Szots]], [[CCC]], February 11, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76556 HCE and NNUE and vectorisation] by [[Vivien Clauzon]], [[CCC]], February 11, 2021 » [[Minic]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76570 nnue reading code] by [[Jon Dart]], [[CCC]], February 13, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76648 New net: The White Rose] by [[Dietrich Kappe]], [[CCC]], February 20, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76664 Are neural nets (the weights file) copyrightable?] by [[Adam Treat]], [[CCC]], February 21, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76731 My first NNUE nn-f0c1c3cbf2f1.nnue] by [[Michael Byrne|MikeB]], [[CCC]], February 27, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76742 How to make a double-sized net as good as SF NNUE in a few easy steps] by [[Chris Whittington]], [[CCC]], February 28, 2021 » [[Fat Fritz#Fat Fritz 2|Fat Fritz 2.0]]<br />
===March===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76790 A random walk down NNUE street …] by [[Michael Byrne|MikeB]], [[CCC]], March 06, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76833 NNUE Research Project] by [[Ed Schroder|Ed Schröder]], [[CCC]], March 10, 2021 » [[Engine Similarity]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76840 Simex including NNUE] by jjoshua2, [[CCC]], March 11, 2021 » [[Engine Similarity]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76844 NNUE ranking] by Jim Logan, [[CCC]], March 12, 2021 » [[Stockfish NNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76892 FEN compression] by lucasart, [[CCC]], March 17, 2021 » [[BMI2#FEN Compression|FEN Compression]], [[#Training|NNUE Training]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76917 Mabigat - hyperparameter optimizer for NNUE net] by [[Ferdinand Mosca]], [[CCC]], March 22, 2021 » [[Automated Tuning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76964 nnue-trainer] by [[Jon Dart]], [[CCC]], March 27, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[GPU]]<br />
===April===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77157 Rubichess NN questions] by [[Jon Dart]], [[CCC]], April 23, 2021 » [[RubiChess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77200 Crafty NNUE Chess Engine?] by supersharp77, [[CCC]], April 29, 2021 » [[Crafty]], [[Vafra]] <ref>[http://www.jurjevic.org.uk/chess/vafra/index.html Vafra] by [[Robert Jurjević]]</ref><br />
===May===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77244 Komodo Dragon 2 released] by [[Larry Kaufman]], [[CCC]], May 04, 2021 » [[Dragon by Komodo Chess|Komodo Dragon]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77344 Stockfish with new NNUE architecture and bigger net released] by [[Stefan Pohl]], [[CCC]], May 19, 2021 » [[Stockfish]], [[Stockfish NNUE]] <ref>[https://github.com/official-stockfish/Stockfish/pull/3474 Update default net to nn-8a08400ed089.nnue by Sopel97 · Pull Request #3474 · official-stockfish/Stockfish · GitHub] by [[Tomasz Sobczyk]]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77348 NNUE scoring (egbb lib)] by [[Michael Hoffmann|Desperado]], [[CCC]], May 19, 2021 » [[Scorpio#ScorpioNNUE|Scorpio NNUE]]<br />
===June===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77243&start=20 Re: Booot progress] by [[Alex Morozov]], [[CCC]], June 01, 2021 » [[Booot]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77438 Commercial Release of Ethereal 13.00 (NNUE) for AVX2 Systems] by [[Andrew Grant]], [[CCC]], June 04, 2021 » [[Ethereal#Ethereal 13 (NNUE)|Ethereal 13.00 (NNUE)]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77438&start=17 Re: Commercial Release of Ethereal 13.00 (NNUE) for AVX2 Systems] by [[Andrew Grant]], [[CCC]], June 04, 2021 » [[Stockfish NNUE#NNUE Structure|HalfKP]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77467 Dark Horse Update] by [[Dietrich Kappe]], [[CCC]], June 11, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77492 Some more experiments with neural nets] by [[Jonathan Kreuzer]], [[CCC]], June 15, 2021 » [[Slow Chess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77503&start=55 Re: will Tcec allow Stockfish with a Leela net to play?] by [[Connor McMonigle]], [[CCC]], June 17, 2021 » [[Stockfish]], [[Dragon by Komodo Chess|Komodo Dragon]], [[Ethereal]], [[Seer]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77503&start=63 Re: will Tcec allow Stockfish with a Leela net to play?] by [[Daniel Shawul]], [[CCC]], June 18, 2021 » [[Scorpio]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77503&start=68 Re: will Tcec allow Stockfish with a Leela net to play?] by [[Vivien Clauzon]], [[CCC]], June 18, 2021 » [[Minic]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77571 I declare that HCE is dead...] by [[Andrew Grant]], [[CCC]], June 29, 2021 » [[Ethereal]], [[Evaluation|HCE]]<br />
===July===<br />
* <span id="KingPlacementsCont"></span>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75506&start=39 Re: NNUE Question - King Placements] by [[Tomasz Sobczyk]], [[CCC]], July 01, 2021 » [[#KingPlacements|NNUE Question]]<br />
: [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=75506&start=40 Re: NNUE Question - King Placements] by [[Daniel Shawul]], July 01, 2021 » [[Scorpio#ScorpioNNUE|ScorpioNNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77602 Before things become more messy than they already are] by [[Ed Schroder|Ed Schröder]], [[CCC]], July 02, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77606 NNUE training set generation] by [[Edsel Apostol]], [[CCC]], July 03, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77612 Time to rethink what Albert Silver has done?] by [[Srdja Matovic]], [[CCC]], July 03, 2021 » [[Fat Fritz#Fat Fritz 2|Fat Fritz 2]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77639 Would the ICGA have accepted today's NNUE engines?] by Madeleine Birchfield, [[CCC]], July 05, 2021 » [[ICGA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77664 Koivisto 5.0] by [[Finn Eggers]], [[CCC]], July 07, 2021 » [[Koivisto]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77681 NNUE one year retrospective] by Madeleine Birchfield, [[CCC]], July 09, 2021<br />
===August ...===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77869 Basic NNUE questions] by [[Amanj Sherwany]], [[CCC]], August 04, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=78109 Alternatives to King-Pawn, King-All NNUE encoding] by [[Andrew Grant]], [[CCC]], September 05, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=78394 NNUE - Efficiently Updatable Network - understanding] by [[Daniel Infuehr]], [[CCC]], October 11, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78497 NNUE - only from own engine?] by [[Ed Schroder|Rebel]], [[CCC]], October 25, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78588 Regarding AVX2] by [[Ed Schroder|Rebel]], [[CCC]], November 03, 2021 » [[AVX2]]<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78855 Mantissa 3.0.0] by [[Jeremy Wright]], [[CCC]], December 10, 2021 » [[Mantissa]]<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78979 Are NNUE Nets Specific to Chess Engines or They Universal to All Engines?] by daniel71, [[CCC]], December 26, 2021 <br />
==2022 ...==<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79020 Why NNUE trainer requires an online qsearch on each training position?] by [[Chao Ma]], [[CCC]], January 01, 2022<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=79107 Rebel 14] by [[Ed Schroder|Ed Schröder]], [[CCC]], January 12, 2022 » [[Rebel#14|Rebel 14]]<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=79523 Koivisto 8.0] by [[Finn Eggers]], [[CCC]], March 15, 2022 » [[Koivisto]]<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79742 NNUE + Pawn-King Network] by Alvin Peng, [[CCC]], April 22, 2022<br />
==2024 ...==<br />
* [https://talkchess.com/forum3/viewtopic.php?f=7&t=83170 How to get started with NNUE] by Arjun Basandrai, [[CCC]], January 12, 2024<br />
<br />
=External Links=<br />
==NNUE==<br />
* [https://en.wikipedia.org/wiki/Efficiently_updatable_neural_network Efficiently updatable neural network | Wikipedia]<br />
* [http://qhapaq.hatenablog.com/entry/2018/06/02/221612 次世代の将棋思考エンジン、NNUE関数を学ぼう(その1.ネットワーク構造編) - コンピュータ将棋 Qhapaq], June 02, 2018 (Japanese)<br />
: Learn Next Generation Shogi Thinking Engine, NNUE Function (Part 1. Network Structure) - Computer Shogi<br />
* [http://qhapaq.hatenablog.com/entry/2018/07/08/193316 次世代の将棋思考エンジン、NNUE関数を学ぼう(その2.改造/学習編) - コンピュータ将棋 Qhapaq], July 08, 2018 (Japanese)<br />
: Let's Learn Next Generation Shogi Thinking Engine, NNUE Function (Part 2. Remodeling/Learning) - Computer Shogi<br />
* [http://yaneuraou.yaneu.com/2020/06/19/stockfish-nnue-the-complete-guide/ Stockfish NNUE – The Complete Guide], June 19, 2020 (Japanese and English)<br />
* [http://yaneuraou.yaneu.com/2020/08/21/3-technologies-in-shogi-ai-that-could-be-used-for-chess-ai/ 3 technologies in shogi AI that could be used for chess AI] by [[Motohiro Isozaki]], August 21, 2020 » [[Stockfish NNUE]]<br />
* [https://www.qhapaq.org/shogi/shogiwiki/stockfish-nnue/ Stockfish NNUE Wiki]<br />
* [http://rebel13.nl/home/nnue.html nnue | Home of the Dutch Rebel] by [[Ed Schroder|Ed Schröder]]<br />
* [https://github.com/glinscott/nnue-pytorch/blob/master/docs/nnue.md NNUE Guide (nnue-pytorch/nnue.md at master · glinscott/nnue-pytorch · GitHub)] hosted by [[Gary Linscott]]<br />
==NNUE libraries==<br />
Some developers have extracted and rewritten the Stockfish NNUE code into independent libraries, which are much easier to embed into other chess engines.<br />
* [https://github.com/david-carteau/cerebrum GitHub - david-carteau/cerebrum: The Cerebrum library] by [[David Carteau]] » [[Cerebrum]]<br />
* [https://github.com/dshawul/nncpu-probe GitHub - dshawul/nncpu-probe] by [[Daniel Shawul]]<br />
* [https://github.com/jdart1/nnue GitHub - jdart1/nnue: NNUE reading code for chess] by [[Jon Dart]]<br />
==Source Code==<br />
* [https://github.com/yaneurao/YaneuraOu GitHub - yaneurao/YaneuraOu: YaneuraOu is the World's Strongest Shogi engine(AI player), WCSC29 1st winner, educational and USI compliant engine]<br />
* [https://github.com/Tama4649/Kristallweizen/ GitHub - Tama4649/Kristallweizen: 第29回世界コンピュータ将棋選手権 準優勝のKristallweizenです。]<br />
: Kristallweizen, runner-up of the 29th World Computer Shogi Championship<br />
* [https://github.com/nodchip/Stockfish GitHub - nodchip/Stockfish: UCI chess engine] ([[Stockfish NNUE]] by [[Hisayori Noda|Nodchip]])<br />
* [https://github.com/dkappe/leela-chess-weights/wiki/A-Leela-NNUE%3F-Night-Nurse-and-Others A Leela NNUE? Night Nurse and Others · dkappe/leela-chess-weights Wiki · GitHub] by [[Dietrich Kappe]]<br />
* [https://github.com/DanielUranga/TensorFlowNNUE GitHub - DanielUranga/TensorFlowNNUE] by [[Daniel Uranga]]<br />
* [https://github.com/glinscott/nnue-pytorch GitHub - glinscott/nnue-pytorch: NNUE (Chess evaluation) trainer in Pytorch] by [[Gary Linscott]] <br />
* [https://github.com/connormcmonigle/seer-nnue GitHub - connormcmonigle/seer-nnue: UCI chess engine using neural networks for position evaluation] by [[Connor McMonigle]] » [[Seer]]<br />
* [https://github.com/bmdanielsson/nnue-trainer GitHub - bmdanielsson/nnue-trainer: PyTorch trainer for NNUE style neural networks] by [[Martin Danielsson]] » [[Marvin]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76964 nnue-trainer] by [[Jon Dart]], [[CCC]], March 27, 2021</ref><br />
* [https://github.com/fsmosca/Mabigat GitHub - fsmosca/Mabigat: NNUE parameter optimizer] by [[Ferdinand Mosca]] » [[Automated Tuning]]<br />
==Misc==<br />
* [https://software.intel.com/content/www/us/en/develop/articles/lower-numerical-precision-deep-learning-inference-and-training.html Lower Numerical Precision Deep Learning Inference and Training] by [https://community.intel.com/t5/user/viewprofilepage/user-id/134067 Andres Rodriguez] et al., [[Intel]], January 19, 2018 » [[AVX-512]]<br />
* [https://en.wikipedia.org/wiki/Nue Nue from Wikipedia]<br />
* [[:Category:Hiromi Uehara|Hiromi]] - [https://en.wikipedia.org/wiki/Spectrum_(Hiromi_album) Spectrum], 2019, [https://en.wikipedia.org/wiki/YouTube YouTube] Video<br />
: {{#evu:https://www.youtube.com/watch?v=A8RCz_RoefM|alignment=left|valignment=top}} <br />
<br />
=References= <br />
<references /><br />
'''[[Neural Networks|Up one Level]]'''<br />
[[Category:Toriyama Sekien]]<br />
[[Category:Hiromi Uehara]]</div>
Smatovic
https://www.chessprogramming.org/index.php?title=NNUE&diff=26900
NNUE
2024-01-13T16:27:08Z
<p>Smatovic: /* Forum Posts */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Learning]] * [[Neural Networks]] * NNUE'''<br />
<br />
[[FILE:SekienNue.jpg|border|right|thumb|250px| [[:Category:Toriyama Sekien|Toriyama Sekien]] - Nue (鵺) <ref>[https://en.wikipedia.org/wiki/Nue Nue] (鵺) from the [https://en.wikipedia.org/wiki/Konjaku_Gazu_Zoku_Hyakki Konjaku Gazu Zoku Hyakki] (今昔画図続百鬼) by [[:Category:Toriyama Sekien|Toriyama Sekien]], circa 1779, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74611&start=2 Re: What does NNUE actually mean] by [[Henk Drost]], [[CCC]], July 29, 2020</ref> ]] <br />
<br />
'''NNUE''', (&#398;U&#1048;&#1048; Efficiently Updatable Neural Networks)<br/> <br />
a neural network architecture intended to replace the handcrafted [[Evaluation|evaluation]] of [[Shogi]], [[Chess|chess]] and other board game playing [[Alpha-Beta|alpha-beta]] searchers running on a CPU. Inspired by [[Kunihito Hoki|Kunihito Hoki's]] approach of [[Piece-Square Tables|piece-square tables]] indexed by king location, two-piece locations and side to move, as applied in his Shogi engine [[Bonanza]] <ref>[http://yaneuraou.yaneu.com/2020/05/03/%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E3%82%A8%E3%83%B3%E3%82%B8%E3%83%8B%E3%82%A2%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E5%B0%86%E6%A3%8Bai%E9%96%8B%E7%99%BA%E5%85%A5%E9%96%80%E3%81%9D%E3%81%AE1/ 機械学習エンジニアのための将棋AI開発入門その1 Introduction to Shogi AI development for machine learning engineers Part 1], May 03, 2020 (Japanese)</ref>, '''NNUE''' was introduced in 2018 by [[Yu Nasu]] <ref>[[Yu Nasu]] ('''2018'''). ''&#398;U&#1048;&#1048; Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi''. Ziosoft Computer Shogi Club, [https://github.com/ynasu87/nnue/blob/master/docs/nnue.pdf pdf] (Japanese with English abstract)</ref>, and was used in Shogi adaptations of [[Stockfish]] such as [[YaneuraOu]] <ref>[https://github.com/yaneurao/YaneuraOu GitHub - yaneurao/YaneuraOu: YaneuraOu is the World's Strongest Shogi engine(AI player), WCSC29 1st winner, educational and USI compliant engine]</ref>,<br />
and [[Kristallweizen]] <ref>[https://github.com/Tama4649/Kristallweizen/ GitHub - Tama4649/Kristallweizen: 第29回世界コンピュータ将棋選手権 準優勝のKristallweizenです。]</ref>, apparently with [[AlphaZero]] strength <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72754 The Stockfish of shogi] by [[Larry Kaufman]], [[CCC]], January 07, 2020</ref>. <br />
<br />
=[[Stockfish NNUE]]=<br />
As reported by [[Henk Drost]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74059 Stockfish NN release (NNUE)] by [[Henk Drost]], [[CCC]], May 31, 2020</ref>, <br />
[[Hisayori Noda|Nodchip]] incorporated NNUE into the chess playing [[Stockfish]] 10 as a proof of concept.<br />
[[Stockfish NNUE]] was born, and in summer 2020 the computer chess community reacted enthusiastically to its rapidly rising [[Playing Strength|playing strength]], achieved with different networks trained using a mixture of [[Supervised Learning|supervised]] and [[Reinforcement Learning|reinforcement learning]] methods.<br />
Despite an approximately halved search speed, it became stronger than the original <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74484 Can the sardine! NNUE clobbers SF] by [[Henk Drost]], [[CCC]], July 16, 2020</ref> and was ultimately responsible for the huge [[Playing Strength|strength]] improvement of '''Stockfish 12'''.<br />
<br />
=NNUE Engines=<br />
''see [[:Category:NNUE]]''<br />
<br />
Tempted by the success of [[Stockfish NNUE]] and attracted by how simple the method is and how small the code base, many engine developers have started testing and applying [[NNUE]]. For quick trials and evaluation before going into serious development, some of them borrowed and/or rewrote NNUE code and used networks from Stockfish NNUE. Most of them reported positive results, such as [[David Carteau]] with [[Orion]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74828 Orion 0.7 : NNUE experiment] by [[David Carteau]], [[CCC]], August 19, 2020</ref>, [[Ehsan Rashid]] with [[DON]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72613&start=320#p856640 Re: New engine releases 2020...Don NNUE 2020?] by supersharp77, [[CCC]], August 19, 2020</ref>, various [[Stockfish#Derivatives|Stockfish derivatives]] by [[Michael Byrne]] <ref>[http://talkchess.com/forum3/viewtopic.php?f=2&t=74825 ... the last shall be first ...] by [[Michael Byrne|MikeB]], [[CCC]], 19 Aug 2020</ref>, and [[Volodymyr Shcherbyna]] with [[Igel]] <ref>[http://talkchess.com/forum3/viewtopic.php?f=2&t=67890&start=10#p856742 Introducing Igel chess engine] by [[Volodymyr Shcherbyna]], [[CCC]], 20 Aug 2020</ref> using the ''Night Nurse'' NNUE net by [[Dietrich Kappe]] <ref>[http://talkchess.com/forum3/viewtopic.php?f=2&t=74837 Night Nurse 0.2] by [[Dietrich Kappe]], [[CCC]], August 19, 2020</ref>. [[Daniel Shawul]] added NNUE support à la [[CFish]] into his [[Scorpio#Bitbases|egbbdll]] probing library of [[Scorpio]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75400&start=22 Re: Hacking around CFish NNUE] by [[Daniel Shawul]], [[CCC]], October 15, 2020</ref> <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75415&start=3 Re: How to scale stockfish NNUE score?] by [[Daniel Shawul]], [[CCC]], October 17, 2020</ref>, making it even easier to use NNUE. 
The promising engines [[Halogen]] 7 and 8 by [[Kieren Pearson]], and [[Seer]] by [[Connor McMonigle]] came with their own, distinct NNUE implementations, and on November 10, 2020, the commercial [[Dragon by Komodo Chess]] aka [[Komodo]] NNUE appeared <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75748 Dragon by Komodo Chess] by [[Larry Kaufman]], [[CCC]], November 10, 2020</ref>, trying to close the gap to Stockfish NNUE. The commercial [[Fat Fritz#Fat Fritz 2|Fat Fritz 2.0]], based on a slightly modified Stockfish 12 using a customized, double sized network, was released by [[ChessBase]] in February 2021.<br />
<br />
=NN Structure=<br />
The neural network of Stockfish NNUE consists of four layers, W1 through W4. The input layer W1 is heavily overparametrized, feeding in the [[Board Representation|board representation]] for various king configurations.<br />
The efficiency of the net is due to [[Incremental Updates|incremental update]] of W1 in [[Make Move|make]] and [[Unmake Move|unmake move]],<br />
where only a fraction of its neurons need to be recalculated. The remaining three layers with 32x2x256, 32x32 and 32x1 weights are computationally less expensive, and are best calculated using appropriate [[SIMD and SWAR Techniques|SIMD instructions]] like [[AVX2]] on [[x86-64]] or, if available, [[AVX-512]].<br />
<br />
[[FILE:NNUE.jpg|none|border|text-bottom]] <br />
NNUE structure with [[Incremental Updates|incremental update]] <ref>Image from [[Yu Nasu]] ('''2018'''). ''&#398;U&#1048;&#1048; Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi''. Ziosoft Computer Shogi Club, [https://github.com/ynasu87/nnue/blob/master/docs/nnue.pdf pdf] (Japanese with English abstract)</ref><br />
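The incremental update of the first layer can be sketched as follows. This is an illustrative simplification, not the actual Stockfish or YaneuraOu code: the function names, the tiny dimensions and the plain-list arithmetic are assumptions made for clarity, whereas real implementations use quantized integer weights and SIMD instructions.<br />
<pre>
# Illustrative sketch of NNUE's incrementally updated first layer.
# Names, sizes and the feature encoding are simplified assumptions,
# not the actual Stockfish/YaneuraOu implementation (which uses
# quantized int16 weights and SIMD instructions such as AVX2).

HIDDEN = 4  # per-perspective first-layer size; real HalfKP nets use 256

def refresh(bias, W1, active_features):
    """Full recomputation of one perspective's accumulator:
    bias plus the weight rows of all active (king, piece, square)
    features. Needed after that side's king moves, because the king
    square is part of every feature index."""
    acc = list(bias)
    for f in active_features:
        for i in range(HIDDEN):
            acc[i] += W1[f][i]
    return acc

def update(acc, W1, added, removed):
    """Incremental update on make/unmake move: only the handful of
    features changed by the move (moved or captured pieces) are
    subtracted and added, instead of re-summing all active features."""
    for f in removed:
        for i in range(HIDDEN):
            acc[i] -= W1[f][i]
    for f in added:
        for i in range(HIDDEN):
            acc[i] += W1[f][i]
    return acc
</pre>
Only this first layer is maintained incrementally; the small remaining layers are evaluated from scratch for every position, which is cheap because of their size.<br />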
<br />
=See also=<br />
* [[Cerebrum]]<br />
* [[David E. Moriarty#SANE|SANE]]<br />
* [[Stockfish NNUE#HalfKA|Stockfish HalfKAv2]]<br />
<br />
=Publications=<br />
* [[Yu Nasu]] ('''2018'''). ''&#398;U&#1048;&#1048; Efficiently Updatable Neural-Network based Evaluation Functions for Computer Shogi''. Ziosoft Computer Shogi Club, [https://github.com/ynasu87/nnue/blob/master/docs/nnue.pdf pdf], [https://www.apply.computer-shogi.org/wcsc28/appeal/the_end_of_genesis_T.N.K.evolution_turbo_type_D/nnue.pdf pdf] (Japanese with English abstract) [https://github.com/asdfjkl/nnue GitHub - asdfjkl/nnue translation] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76250 Translation of Yu Nasu's NNUE paper] by [[Dominik Klein]], [[CCC]], January 07, 2021</ref><br />
* [[Dominik Klein]] ('''2021'''). ''[https://github.com/asdfjkl/neural_network_chess Neural Networks For Chess]''. [https://github.com/asdfjkl/neural_network_chess/releases/tag/v1.1 Release Version 1.1 · GitHub] <ref>[https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78283 Book about Neural Networks for Chess] by dkl, [[CCC]], September 29, 2021</ref><br />
<br />
=Forum Posts=<br />
==2020==<br />
===January ...===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72754 The Stockfish of shogi] by [[Larry Kaufman]], [[CCC]], January 07, 2020 » [[Stockfish]], [[Shogi]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72754&start=18 Re: The Stockfish of shogi] by [[Gian-Carlo Pascutto]], [[CCC]], January 18, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74058 Stockfish NNUE] by [[Henk Drost]], [[CCC]], May 31, 2020 » [[Stockfish]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74059 Stockfish NN release (NNUE)] by [[Henk Drost]], [[CCC]], May 31, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74148 NNUE shared library and tools] by [[Adam Treat]], [[CCC]], June 10, 2020<br />
===July===<br />
* [http://talkchess.com/forum3/viewtopic.php?t=74480 Lizard-NNUE Experiment NOT bad with NNUE Net Evaluation...] by Nancy M Pichardo, [[CCC]], July 15, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74484 Can the sardine! NNUE clobbers SF] by [[Henk Drost]], [[CCC]], July 16, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531 NNUE accessible explanation] by [[Martin Fierz]], [[CCC]], July 21, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=1 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], July 23, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=5 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], July 24, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74531&start=8 Re: NNUE accessible explanation] by [[Jonathan Rosenthal]], [[CCC]], August 03, 2020<br />
* [https://groups.google.com/d/msg/fishcooking/Wpk-9COzk64/ez643VTkAAAJ BrainLearn NNUE 1.0] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], July 25, 2020 » [[BrainLearn]]<br />
* [https://groups.google.com/d/msg/fishcooking/yWtpz_FY5_Y/RMTG56fkAAAJ ShashChess NNUE 1.0] by [[Andrea Manzo]], [[Computer Chess Forums|FishCooking]], July 25, 2020 » [[ShashChess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74607 LC0 vs. NNUE - some tech details...] by [[Srdja Matovic]], [[CCC]], July 29, 2020 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74611 What does NNUE actually mean] by Paloma, [[CCC]], July 29, 2020<br />
===August===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74705 What happens with my hyperthreading?] by [[Kai Laskos]], [[CCC]], August 06, 2020 » [[Stockfish NNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=73521&start=59 Re: Minic version 2] by [[Vivien Clauzon]], [[CCC]], August 08, 2020 » [[Minic]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74777 Neural Networks weights type] by [[Fabio Gobbato]], [[CCC]], August 13, 2020 » [[Stockfish NNUE]] <br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67890&start=17 Re: Introducing Igel chess engine - Igel and NNUE] by [[Volodymyr Shcherbyna]], [[CCC]], August 19, 2020 » [[Igel]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74828 Orion 0.7 : NNUE experiment] by [[David Carteau]], [[CCC]], August 19, 2020 » [[Orion]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=74837 Night Nurse 0.2] by [[Dietrich Kappe]], [[CCC]], August 19, 2020 » [[A0lite]], [[Igel]]<br />
* [http://laatste.info/bb3/viewtopic.php?f=53&t=8298 NNUE] by [[Bert Tuyt]], [http://laatste.info/bb3/viewforum.php?f=53 World Draughts Forum], August 19, 2020 » [[Draughts]]<br />
===September===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74955 Train a neural network evaluation] by [[Fabio Gobbato]], [[CCC]], September 01, 2020 » [[Automated Tuning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75016 RubiChess NNUE player implemented] by [[Andreas Matthies]], [[CCC]], September 06, 2020 » [[RubiChess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75027 Toga III 0.4 NNUE] by [[Dietrich Kappe]], [[CCC]], September 07, 2020 » [[Toga]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75042 Neural network quantization] by [[Fabio Gobbato]], [[CCC]], September 08, 2020 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75049 AVX-512 and NNUE] by [[Gian-Carlo Pascutto]], [[CCC]], September 08, 2020 » [[AVX-512]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75190 First success with neural nets] by [[Jonathan Kreuzer]], [[CCC]], September 23, 2020 » [[Neural Networks]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75190&start=21 Re: First success with neural nets] by [[Jonathan Kreuzer]], [[CCC]], November 11, 2020 » [[Checkers]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75241 Nemorino 6 (NNUE)] by [[Christian Günther|Florentino]], [[CCC]], September 28, 2020 » [[Nemorino]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75247 A Crossroad in Computer Chess; Or Desperate Flailing for Relevance] by [[Andrew Grant]], [[CCC]], September 29, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75248 NNUE variation] by [[Ed Schroder|Ed Schröder]], [[CCC]], September 29, 2020<br />
===October===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75296 BONA_PIECE_ZERO] by [[Marco Belli|elcabesa]], [[CCC]], October 04, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75335&start=91 Re: Final Release of Ethereal, V12.75] by [[Andrew Grant]], [[CCC]], October 09, 2020 » [[Ethereal]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75345 Request for someone to train an NNUE for Ethereal] by [[Andrew Grant]], [[CCC]], October 09, 2020 <br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75350 Ethereal Tuning - Data Dump] by [[Andrew Grant]], [[CCC]], October 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75358 Dangerous turn] by [[Dann Corbit]], [[CCC]], October 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75393 Black crushing white, weird ?] by [[Vivien Clauzon]], [[CCC]], October 14, 2020 » [[Minic#MinicNNUE|MinicNNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75400 Hacking around CFish NNUE] by [[Maksim Korzh]], [[CCC]], October 15, 2020 » [[CFish]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75400&start=22 Re: Hacking around CFish NNUE] by [[Daniel Shawul]], [[CCC]], October 15, 2020 » [[Scorpio]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75415 How to scale stockfish NNUE score?] by [[Maksim Korzh]], [[CCC]], October 17, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75415&start=3 Re: How to scale stockfish NNUE score?] by [[Daniel Shawul]], [[CCC]], October 17, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75418 Embedding Stockfish NNUE to ANY CHESS ENGINE: YouTube series] by [[Maksim Korzh]], [[CCC]], October 17, 2020 » [[BBC]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75433 Seer] by [[Gerd Isenberg]], [[CCC]], October 18, 2020 » [[Seer]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75482 BBC 1.3 + Stockfish NNUE released!] by [[Maksim Korzh]], [[CCC]], October 21, 2020 » [[BBC]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75500 Mayhem NNUE - New NN engine] by [[Toni Helminen|JohnWoe]], [[CCC]], October 22, 2020 » [[Mayhem]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75501 Centipawns vs Millipawns with NNUE] by Madeleine Birchfield, [[CCC]], October 23, 2020 » [[Centipawns]], [[Millipawns]]<br />
* <span id="KingPlacements"></span>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75506 NNUE Question - King Placements] by [[Andrew Grant]], [[CCC]], October 23, 2020 » [[Stockfish NNUE#NNUE Structure|Stockfish NNUE - NNUE Structure]]<br />
: [[#KingPlacementsCont|July 01, 2021 continuation]]<br />
===November===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75651 Komodo 14.1 Release and Dragon Announcement] by [[Larry Kaufman]], [[CCC]], November 02, 2020 » [[Komodo]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75653 NNUE outer product vs tensor product] by Madeleine Birchfield, [[CCC]], November 02, 2020 <ref>[https://en.wikipedia.org/wiki/Outer_product Outer product from Wikipedia]</ref> <ref>[https://en.wikipedia.org/wiki/Tensor_product Tensor product from Wikipedia]</ref><br />
* <span id="Training"></span>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020 <ref>[https://en.wikipedia.org/wiki/PyTorch PyTorch from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75725 TucaNNo: neural network research] by [[Alcides Schulz]], [[CCC]], November 08, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75748 Dragon by Komodo Chess] by [[Larry Kaufman]], [[CCC]], November 10, 2020 » [[Dragon by Komodo Chess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75751 Tensorflow NNUE training] by [[Daniel Shawul]], [[CCC]], November 10, 2020 <ref>[https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75890 Speculations about NNUE development (was New engine releases 2020)] by Madeleine Birchfield, [[CCC]], November 11, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75890&start=6 Re: Speculations about NNUE development (was New engine releases 2020)] by [[Connor McMonigle]], [[CCC]], November 12, 2020<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75890&start=9 Re: Speculations about NNUE development (was New engine releases 2020)] by [[Connor McMonigle]], [[CCC]], November 12, 2020 » [[Dragon by Komodo Chess]], [[Seer]], [[Halogen]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75335&start=134 Re: Final Release of Ethereal, V12.75] by [[Andrew Grant]], [[CCC]], November 12, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75797 Maybe not the best diversity of strongest chess engines under development] by [[Kai Laskos]], [[CCC]], November 14, 2020 » [[Engine Similarity]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75862 CPU Vector Unit, the new jam for NNs...] by [[Srdja Matovic]], [[CCC]], November 18, 2020 » [[SIMD and SWAR Techniques|SIMD]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75870 You've trained a brilliant NN(UE) King-Piece Network. Now what?] by [[Andrew Grant]], [[CCC]], November 19, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75925 Pawn King Neural Network] by [[Tamás Kuzmics]], [[CCC]], November 26, 2020<br />
===December===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75953 Orion 0.8 + The Cerebrum release] by [[David Carteau]], [[CCC]], December 01, 2020 » [[Orion]], [[Cerebrum]]<br />
* [https://prodeo.actieforum.com/t104-the-nnue-split-programmers-are-in The NNUE split programmers are in] by [[Ed Schroder|Ed Schröder]], [[Computer Chess Forums|ProDeo Forum]], December 02, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76006 Introducing the "Cerebrum" library (NNUE-like trainer and inference code)] by [[David Carteau]], [[CCC]], December 07, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76190 Dispelling the Myth of NNUE with LazySMP: An Analysis] by [[Andrew Grant]], [[CCC]], December 30, 2020 » [[Lazy SMP]]<br />
==2021==<br />
===January===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76250 Translation of Yu Nasu's NNUE paper] by [[Dominik Klein]], [[CCC]], January 07, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724&start=60 Re: Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], January 09, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76263 More experiments with neural nets] by [[Jonathan Kreuzer]], [[CCC]], January 09, 2021 » [[Slow Chess]]<br />
* [https://groups.google.com/g/fishcooking/c/cad1MGSdpU4/m/Ury4iBqSBgAJ Shouldn't positional attributes drive SF's NNUE input features (rather than king position)?] by [[Nick Pelling]], [[Computer Chess Forums|FishCooking]], January 10, 2021 » [[Stockfish NNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76285 HalfKP Structure in NNUE] by Roger Stephenson, [[CCC]], January 12, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76346 Andscacs nnue 0.1] by [[Daniel José Queraltó]], [[CCC]], January 17, 2021 » [[Andscacs]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76353 It's NNUE era (sharing my thoughts)] by Basti Dangca, [[CCC]], January 18, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76356 NNUE and game phase] by [[Dann Corbit]], [[CCC]], January 18, 2021 » [[Game Phases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76382 correspondence chess in the age of NNUE] by [[Larry Kaufman]], [[CCC]], January 21, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76386 One for Andrew Grant et al. - NNUE?] by [[Srdja Matovic]], [[CCC]], January 21, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76437 256 in NNUE?] by Ted Wong, [[CCC]], January 28, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76446 So what do we miss in the traditional evaluation?] by [[Ferdinand Mosca]], [[CCC]], January 29, 2021 » [[Evaluation]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76447 Latest Night Nurse released] by [[Dietrich Kappe]], [[CCC]], January 29, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76456 None-GPL NNUE probing code] by [[Daniel Shawul]], [[CCC]], January 31, 2021<br />
===February===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76537 Fat Fritz 2] by [[Jouni Uski]], [[CCC]], February 09, 2021 » [[Fat Fritz#Fat Fritz 2|Fat Fritz 2.0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76552 How much work is it to train an NNUE?] by [[Gabor Szots]], [[CCC]], February 11, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76556 HCE and NNUE and vectorisation] by [[Vivien Clauzon]], [[CCC]], February 11, 2021 » [[Minic]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76570 nnue reading code] by [[Jon Dart]], [[CCC]], February 13, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76648 New net: The White Rose] by [[Dietrich Kappe]], [[CCC]], February 20, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76664 Are neural nets (the weights file) copyrightable?] by [[Adam Treat]], [[CCC]], February 21, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76731 My first NNUE nn-f0c1c3cbf2f1.nnue] by [[Michael Byrne|MikeB]], [[CCC]], February 27, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76742 How to make a double-sized net as good as SF NNUE in a few easy steps] by [[Chris Whittington]], [[CCC]], February 28, 2021 » [[Fat Fritz#Fat Fritz 2|Fat Fritz 2.0]]<br />
===March===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76790 A random walk down NNUE street …] by [[Michael Byrne|MikeB]], [[CCC]], March 06, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76833 NNUE Research Project] by [[Ed Schroder|Ed Schröder]], [[CCC]], March 10, 2021 » [[Engine Similarity]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76840 Simex including NNUE] by jjoshua2, [[CCC]], March 11, 2021 » [[Engine Similarity]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76844 NNUE ranking] by Jim Logan, [[CCC]], March 12, 2021 » [[Stockfish NNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76892 FEN compression] by lucasart, [[CCC]], March 17, 2021 » [[BMI2#FEN Compression|FEN Compression]], [[#Training|NNUE Training]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=76917 Mabigat - hyperparameter optimizer for NNUE net] by [[Ferdinand Mosca]], [[CCC]], March 22, 2021 » [[Automated Tuning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76964 nnue-trainer] by [[Jon Dart]], [[CCC]], March 27, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[GPU]]<br />
===April===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77157 Rubichess NN questions] by [[Jon Dart]], [[CCC]], April 23, 2021 » [[RubiChess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77200 Crafty NNUE Chess Engine?] by supersharp77, [[CCC]], April 29, 2021 » [[Crafty]], [[Vafra]] <ref>[http://www.jurjevic.org.uk/chess/vafra/index.html Vafra] by [[Robert Jurjević]]</ref><br />
===May===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77244 Komodo Dragon 2 released] by [[Larry Kaufman]], [[CCC]], May 04, 2021 » [[Dragon by Komodo Chess|Komodo Dragon]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77344 Stockfish with new NNUE architecture and bigger net released] by [[Stefan Pohl]], [[CCC]], May 19, 2021 » [[Stockfish]], [[Stockfish NNUE]] <ref>[https://github.com/official-stockfish/Stockfish/pull/3474 Update default net to nn-8a08400ed089.nnue by Sopel97 · Pull Request #3474 · official-stockfish/Stockfish · GitHub] by [[Tomasz Sobczyk]]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77348 NNUE scoring (egbb lib)] by [[Michael Hoffmann|Desperado]], [[CCC]], May 19, 2021 » [[Scorpio#ScorpioNNUE|Scorpio NNUE]]<br />
===June===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77243&start=20 Re: Booot progress] by [[Alex Morozov]], [[CCC]], June 01, 2021 » [[Booot]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77438 Commercial Release of Ethereal 13.00 (NNUE) for AVX2 Systems] by [[Andrew Grant]], [[CCC]], June 04, 2021 » [[Ethereal#Ethereal 13 (NNUE)|Ethereal 13.00 (NNUE)]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77438&start=17 Re: Commercial Release of Ethereal 13.00 (NNUE) for AVX2 Systems] by [[Andrew Grant]], [[CCC]], June 04, 2021 » [[Stockfish NNUE#NNUE Structure|HalfKP]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77467 Dark Horse Update] by [[Dietrich Kappe]], [[CCC]], June 11, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77492 Some more experiments with neural nets] by [[Jonathan Kreuzer]], [[CCC]], June 15, 2021 » [[Slow Chess]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77503&start=55 Re: will Tcec allow Stockfish with a Leela net to play?] by [[Connor McMonigle]], [[CCC]], June 17, 2021 » [[Stockfish]], [[Dragon by Komodo Chess|Komodo Dragon]], [[Ethereal]], [[Seer]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77503&start=63 Re: will Tcec allow Stockfish with a Leela net to play?] by [[Daniel Shawul]], [[CCC]], June 18, 2021 » [[Scorpio]]<br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77503&start=68 Re: will Tcec allow Stockfish with a Leela net to play?] by [[Vivien Clauzon]], [[CCC]], June 18, 2021 » [[Minic]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77571 I declare that HCE is dead...] by [[Andrew Grant]], [[CCC]], June 29, 2021 » [[Ethereal]], [[Evaluation|HCE]]<br />
===July===<br />
* <span id="KingPlacementsCont"></span>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75506&start=39 Re: NNUE Question - King Placements] by [[Tomasz Sobczyk]], [[CCC]], July 01, 2021 » [[#KingPlacements|NNUE Question]]<br />
: [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=75506&start=40 Re: NNUE Question - King Placements] by [[Daniel Shawul]], July 01, 2021 » [[Scorpio#ScorpioNNUE|ScorpioNNUE]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77602 Before things become more messy than they already are] by [[Ed Schroder|Ed Schröder]], [[CCC]], July 02, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77606 NNUE training set generation] by [[Edsel Apostol]], [[CCC]], July 03, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77612 Time to rethink what Albert Silver has done?] by [[Srdja Matovic]], [[CCC]], July 03, 2021 » [[Fat Fritz#Fat Fritz 2|Fat Fritz 2]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77639 Would the ICGA have accepted today's NNUE engines?] by Madeleine Birchfield, [[CCC]], July 05, 2021 » [[ICGA]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77664 Koivisto 5.0] by [[Finn Eggers]], [[CCC]], July 07, 2021 » [[Koivisto]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=77681 NNUE one year retrospective] by Madeleine Birchfield, [[CCC]], July 09, 2021<br />
===August ...===<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=77869 Basic NNUE questions] by [[Amanj Sherwany]], [[CCC]], August 04, 2021<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=78109 Alternatives to King-Pawn, King-All NNUE encoding] by [[Andrew Grant]], [[CCC]], September 05, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=78394 NNUE - Efficiently Updatable Network - understanding] by [[Daniel Infuehr]], [[CCC]], October 11, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78497 NNUE - only from own engine?] by [[Ed Schroder|Rebel]], October 25, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78588 Regarding AVX2] by [[Ed Schroder|Rebel]], [[CCC]], November 03, 2021 » [[AVX2]]<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78855 Mantissa 3.0.0] by [[Jeremy Wright]], [[CCC]], December 10, 2021 » [[Mantissa]]<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=78979 Are NNUE Nets Specific to Chess Engines or They Universal to All Engines?] by daniel71, [[CCC]], December 26, 2021 <br />
==2022 ...==<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79020 Why NNUE trainer requires an online qsearch on each training position?] by [[Chao Ma]], [[CCC]], January 01, 2022<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=79107 Rebel 14] by [[Ed Schroder|Ed Schröder]], [[CCC]], January 12, 2022 » [[Rebel#14|Rebel 14]]<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=2&t=79523 Koivisto 8.0] by [[Finn Eggers]], [[CCC]], March 15, 2022 » [[Koivisto]]<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79742 NNUE + Pawn-King Network] by Alvin Peng, [[CCC]], April 22, 2022<br />
==2023 ...==<br />
* [https://talkchess.com/forum3/viewtopic.php?f=7&t=83170 How to get started with NNUE] by Arjun Basandrai, [[CCC]], October 26, 2023<br />
<br />
=External Links=<br />
==NNUE==<br />
* [https://en.wikipedia.org/wiki/Efficiently_updatable_neural_network Efficiently updatable neural network | Wikipedia]<br />
* [http://qhapaq.hatenablog.com/entry/2018/06/02/221612 次世代の将棋思考エンジン、NNUE関数を学ぼう(その1.ネットワーク構造編) - コンピュータ将棋 Qhapaq], June 02, 2018 (Japanese)<br />
: Learn Next Generation Shogi Thinking Engine, NNUE Function (Part 1. Network Structure) - Computer Shogi<br />
* [http://qhapaq.hatenablog.com/entry/2018/07/08/193316 次世代の将棋思考エンジン、NNUE関数を学ぼう(その2.改造/学習編) - コンピュータ将棋 Qhapaq], July 08, 2018 (Japanese)<br />
: Let's Learn Next Generation Shogi Thinking Engine, NNUE Function (Part 2. Remodeling/Learning) - Computer Shogi<br />
* [http://yaneuraou.yaneu.com/2020/06/19/stockfish-nnue-the-complete-guide/ Stockfish NNUE – The Complete Guide], June 19, 2020 (Japanese and English)<br />
* [http://yaneuraou.yaneu.com/2020/08/21/3-technologies-in-shogi-ai-that-could-be-used-for-chess-ai/ 3 technologies in shogi AI that could be used for chess AI] by [[Motohiro Isozaki]], August 21, 2020 » [[Stockfish NNUE]]<br />
* [https://www.qhapaq.org/shogi/shogiwiki/stockfish-nnue/ Stockfish NNUE Wiki]<br />
* [http://rebel13.nl/home/nnue.html nnue | Home of the Dutch Rebel] by [[Ed Schroder|Ed Schröder]]<br />
* [https://github.com/glinscott/nnue-pytorch/blob/master/docs/nnue.md NNUE Guide (nnue-pytorch/nnue.md at master · glinscott/nnue-pytorch · GitHub)] hosted by [[Gary Linscott]]<br />
==NNUE libraries==<br />
Some developers extract and rewrite the Stockfish NNUE code into independent libraries which are much easier to embed into other chess engines.<br />
* [https://github.com/david-carteau/cerebrum GitHub - david-carteau/cerebrum: The Cerebrum library] by [[David Carteau]] » [[Cerebrum]]<br />
* [https://github.com/dshawul/nncpu-probe GitHub - dshawul/nncpu-probe] by [[Daniel Shawul]]<br />
* [https://github.com/jdart1/nnue GitHub - jdart1/nnue: NNUE reading code for chess] by [[Jon Dart]]<br />
==Source Code==<br />
* [https://github.com/yaneurao/YaneuraOu GitHub - yaneurao/YaneuraOu: YaneuraOu is the World's Strongest Shogi engine(AI player), WCSC29 1st winner, educational and USI compliant engine]<br />
* [https://github.com/Tama4649/Kristallweizen/ GitHub - Tama4649/Kristallweizen: 第29回世界コンピュータ将棋選手権 準優勝のKristallweizenです。]<br />
: Kristallweizen, runner-up of the 29th World Computer Shogi Championship<br />
* [https://github.com/nodchip/Stockfish GitHub - nodchip/Stockfish: UCI chess engine] ([[Stockfish NNUE]] by [[Hisayori Noda|Nodchip]])<br />
* [https://github.com/dkappe/leela-chess-weights/wiki/A-Leela-NNUE%3F-Night-Nurse-and-Others A Leela NNUE? Night Nurse and Others · dkappe/leela-chess-weights Wiki · GitHub] by [[Dietrich Kappe]]<br />
* [https://github.com/DanielUranga/TensorFlowNNUE GitHub - DanielUranga/TensorFlowNNUE] by [[Daniel Uranga]]<br />
* [https://github.com/glinscott/nnue-pytorch GitHub - glinscott/nnue-pytorch: NNUE (Chess evaluation) trainer in Pytorch] by [[Gary Linscott]] <br />
* [https://github.com/connormcmonigle/seer-nnue GitHub - connormcmonigle/seer-nnue: UCI chess engine using neural networks for position evaluation] by [[Connor McMonigle]] » [[Seer]]<br />
* [https://github.com/bmdanielsson/nnue-trainer GitHub - bmdanielsson/nnue-trainer: PyTorch trainer for NNUE style neural networks] by [[Martin Danielsson]] » [[Marvin]] <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76964 nnue-trainer] by [[Jon Dart]], [[CCC]], March 27, 2021</ref><br />
* [https://github.com/fsmosca/Mabigat GitHub - fsmosca/Mabigat: NNUE parameter optimizer] by [[Ferdinand Mosca]] » [[Automated Tuning]]<br />
==Misc==<br />
* [https://software.intel.com/content/www/us/en/develop/articles/lower-numerical-precision-deep-learning-inference-and-training.html Lower Numerical Precision Deep Learning Inference and Training] by [https://community.intel.com/t5/user/viewprofilepage/user-id/134067 Andres Rodriguez] et al., [[Intel]], January 19, 2018 » [[AVX-512]]<br />
* [https://en.wikipedia.org/wiki/Nue Nue from Wikipedia]<br />
* [[:Category:Hiromi Uehara|Hiromi]] - [https://en.wikipedia.org/wiki/Spectrum_(Hiromi_album) Spectrum], 2019, [https://en.wikipedia.org/wiki/YouTube YouTube] Video<br />
: {{#evu:https://www.youtube.com/watch?v=A8RCz_RoefM|alignment=left|valignment=top}} <br />
<br />
=References= <br />
<references /><br />
'''[[Neural Networks|Up one Level]]'''<br />
[[Category:Toriyama Sekien]]<br />
[[Category:Hiromi Uehara]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=Main_Page&diff=26899Main Page2024-01-13T08:12:20Z<p>Smatovic: removed "daily"</p>
<hr />
<div>The '''Chess Programming Wiki''' is a repository of information about [[Programming|programming]] computers to play [[Chess|chess]]. Our goal is to provide a reference for every aspect of chess-programming, information about [[:Category:Programmer|programmers]], [[:Category:Researcher|researchers]] and [[Engines|engines]]. You'll find different ways to implement [[Late Move Reductions|LMR]] and [[Bitboards|bitboard]] stuff like [[Best Magics so far|best magics]] for most dense [[Magic Bitboards|magic bitboard]] tables. For didactic purposes, the [[CPW-Engine]] has been developed by some wiki members. You can start browsing using the left-hand navigation bar. All of our content is arranged hierarchically, so you can see every page by following just those links. If you are looking for a specific page or catchword you can use the search box on top.<br />
<br />
CPW was founded by [[Mark Lefler]] on October 26, 2007 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=17344&start=4 Re: community test result web page?] by [[Mark Lefler]], [[CCC]], October 26, 2007</ref>, first hosted on [https://en.wikipedia.org/wiki/Wikispaces Wikispaces] <ref>[http://web.archive.org/web/20180216204915/http://chessprogramming.wikispaces.com/ Wikispaces Chessprogramming - home] ([https://en.wikipedia.org/wiki/Wayback_Machine Wayback Machine], February 16, 2018)</ref>. Due to that site closure <ref>[http://www.talkchess.com/forum/viewtopic.php?t=66573 Chess Programming Wiki] by [[Jon Dart]], [[CCC]], February 12, 2018</ref>, it moved to its present new host at '''www.chessprogramming.org'''.<br />
<br />
=Up-To-Date Best Practices=<br />
Over time the Zeitgeist of the computer chess community moved on from Usenet via bulletin boards to chat groups like Discord channels. Computer chess programming is actively discussed in several Discord channels:<br />
<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=82700 Discord Channels] by [[Srdja Matovic]] on [[CCC]], October 11, 2023<br />
<br />
=Hot Topics=<br />
Topics people search/discuss much<br />
* [[Stockfish]]<br />
* [[NNUE]]<br />
* [[Leela Chess Zero]]<br />
* [[Syzygy Bases]]<br />
<br />
=Basics=<br />
* [[Getting Started]] - if you are new to chess programming<br />
* [[Board Representation]]<br />
* [[Search]]<br />
* [[Evaluation]]<br />
* [[Opening Book]]<br />
* [[Endgame Tablebases]]<br />
<br />
=Principal Topics=<br />
* [[Chess]] <br />
* [[Programming]]<br />
* [[Artificial Intelligence]]<br />
* [[Knowledge]]<br />
* [[Learning]]<br />
* [[Engine Testing|Testing]]<br />
* [[Automated Tuning|Tuning]]<br />
* [[User Interface]]<br />
* [[Protocols]]<br />
<br />
=Lists=<br />
* [[Cartoons]]<br />
* [[Computer Chess Forums]]<br />
* [[Conferences]]<br />
* [[Dictionary]]<br />
* [[Engines]] including the [[CPW-Engine]]<br />
** [[Dedicated Chess Computers]]<br />
** [[Engine Releases]]<br />
* [[Games]], some other [[Artificial Intelligence|AI]]-Games, where computer chess may borrow some ideas <br />
* [[Hardware]]<br />
* [[History]]<br />
* [[Organizations]]<br />
* [[People]]<br />
* [[Periodical]]<br />
* [[Software]]<br />
* [[Tournaments and Matches]]<br />
<br />
=Miscellaneous=<br />
* [[Acknowledgments]]<br />
* [[Recommended Reading]]<br />
<br />
=Statistics=<br />
* Articles: {{NUMBEROFARTICLES}}<br />
* Pages: {{NUMBEROFPAGES}}<br />
* Files: {{NUMBEROFFILES}}<br />
<br />
=Thanks=<br />
Thanks for visiting our site!<br />
We hope you like the work we have done.<br />
<br />
[[Mark Lefler]] and the rest of the CPW team.<br />
<br />
=References=<br />
<references /><br />
[[Category:Root]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26897GPU2024-01-06T20:26:04Z<p>Smatovic: /* Vega GCN 5th gen */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may have more raw computing power than general purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs] but need a specialized and parallelized way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|Best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology will work with GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive and home computers used custom graphics chips to operate directly on registers/memory without a dedicated frame buffer resp. texture buffer, like [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s would make 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics-workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]] for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training], using GPU resources to efficiently train networks<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook] and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]] specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group] is widely adopted across all kind of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP, C++ AMP and with OpenMP offload directives. It offers with [https://rocmdocs.amd.com/en/latest/ ROCm] its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal] is recommended by [[Apple]].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures and the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled to a compute unit, with up to hundreds of compute units present on a discrete GPU. The actual SIMD units may have architecture-dependent numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer with specific bit-width of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Different architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and the concrete classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to speed up neural networks further.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN)]<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront resp. Warp size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or with libraries and offload-directives also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) execute the kernel to be computed and are coupled to a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized and have access to the same scratch-pad memory, with an architecture limit of how many work-items a work-group can hold and how many threads can run in total concurrently on the device.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi)] <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device; however, depending on architecture, vendor, framework and operating system, a unified address space accessible by both CPU and GPU may be offered.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, 32-bit integer throughput can be lower than 32-bit floating-point or 24-bit integer throughput.<br />
<br />
* INT64<br />
: In general, [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer-brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, quadrupling INT8 or octupling INT4 throughput compared to INT32.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general offer a lower double-precision (64-bit) floating-point throughput relative to FP32 (a higher FP32:FP64 ratio) than server-brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer-brand GPUs for neural-network-based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server-brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix multiplications via Winograd transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series TensorCores were introduced. They offer FP16xFP16+FP32 matrix-multiply-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8, INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, a.k.a. kernel-launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels, ranging from 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both in a multi-chip module design. It features Matrix Cores with support for a broad range of precisions, such as INT8, FP8, BF16, FP16, TF32, FP32 and FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set Architecture]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set Architecture]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf GCN3/4 Instruction Set Architecture]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Island cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support has been offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
[https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archiv.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since the Series5 SGX, OpenCL support has been available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. OpenCL support has been offered since the Adreno 300 series.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE#Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
'''2016'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: "Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26896GPU2024-01-06T20:25:46Z<p>Smatovic: /* Navi RDNA */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may have more raw computing power than general purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs] but need a specialized and parallelized way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|Best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology will work with GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s, RAM was expensive and home computers used custom graphics chips that operated directly on registers/memory without a dedicated frame buffer or texture buffer, such as the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards designed specifically to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] chip used in the PlayStation (1994) and Nvidia's combined 2D/3D chips like the [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], as in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are some main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training], using GPU resources to efficiently train networks<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]] specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group] is widely adopted across all kind of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends such as OpenCL, HIP, and C++ AMP, as well as OpenMP offload directives. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for its different GPU architectures, and offers the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as its frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends such as C, C++, Fortran and OpenCL, as well as offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with a unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit, to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled to a compute unit, and up to hundreds of compute units are present on a discrete GPU. The actual SIMD units may have architecture-dependent numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its exact classification as a [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC (matrix-multiply-accumulate) units are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront respectively Warp size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload-directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 8<br />
* Maximum number of resident warps per multiprocessor: 48<br />
* Maximum number of resident threads per multiprocessor: 1536<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: The 32-bit integer performance can, depending on architecture and operation, be lower than the 32-bit FLOP or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, e.g. quadruple INT8 or octuple INT4 throughput compared to INT32.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general have a lower double-precision (64-bit) floating-point throughput - a lower FP64:FP32 ratio - than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: TensorCores were introduced with the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series. They are matrix-multiply-accumulate units computing FP16xFP16+FP32, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds optimized FP64 throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, a.k.a. kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]], AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the former being the most important by quantity, the latter by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both featuring a multi-chip module design. It features Matrix Cores with support for a broad range of precisions - INT8, FP8, BF16, FP16, TF32, FP32, FP64 - as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set Architecture]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf GCN3/4 Instruction Set Architecture]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Island cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. OpenCL support has been offered since Midgard (2012), which introduced a unified shader model.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023. It combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archiv.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since the Series5 SGX, OpenCL support has been available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. OpenCL support has been offered since the Adreno 300 series.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26895GPU2024-01-06T20:24:46Z<p>Smatovic: /* Polaris GCN 4th gen */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel programming model. [[Leela Chess Zero]] has proven that a [[Best-First|best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) combined with [[Deep Learning|deep learning]] works well on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive, and home computers used custom graphics chips that operated directly on registers/memory without a dedicated frame or texture buffer, like the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math emerged, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] chip used in the PlayStation (1994) and Nvidia's 2D/3D combo chips like the [https://en.wikipedia.org/wiki/NV1 NV1] (1995) established the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], as in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training], using GPU resources to efficiently train networks<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]], specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group], is widely adopted across all kinds of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as OpenMP offload directives. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 (Mojave), [[Apple]] recommends transitioning from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures, as well as the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as its frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and to schedule a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) form a SIMD unit, multiple SIMD units are coupled into a compute unit, and up to hundreds of compute units are present on a discrete GPU. The SIMD units may have an architecture-dependent number of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. Note the difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC (matrix-multiply-accumulate) units are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the wavefront (AMD) or warp (Nvidia) size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or - with libraries and offload directives - also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly parallel]. Single GPU threads (work-items in OpenCL) execute the kernel to be computed and are grouped into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but depending on architecture, vendor, framework and operating system, a unified address space accessible to both CPU and GPU may be offered.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: The 32-bit integer performance can, depending on architecture and operation, be lower than the 32-bit floating-point or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general, [https://en.wikipedia.org/wiki/Processor_register registers] and vector [https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer-brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput at lower precision, quadrupling INT8 or octupling INT4 throughput.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs generally have a lower double-precision (64-bit) floating-point throughput (FP32:FP64 ratio) than server-brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretic ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretic ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: TensorCores were introduced with the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series: matrix-multiply-accumulate units offering FP16xFP16+FP32 operations, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8, INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture on Wikipedia]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64-optimized throughput for matrix operations, AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration, and AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, aka kernel-launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much better suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom. This also affected game-playing programs combining CNNs with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and by the open source project [[Leela Zero]], headed by [[Gian-Carlo Pascutto]], for [[Go]] and its [[Leela Chess Zero]] adaptation.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs - the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in drivers, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumers, Radeon Pro for professionals and Radeon Instinct for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023 with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both multi-chip module designs. It features Matrix Cores supporting a broad range of precisions - INT8, FP8, BF16, FP16, TF32, FP32, FP64 - as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf GCN3/4 Instruction Set Architecture]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Island cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support is offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumers, Quadro for professionals and Tesla for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archiv.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system-on-a-chip (SoC) designs. Since the Series5 SGX, OpenCL support is available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. Since the Adreno 300 series, OpenCL support is offered.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, massively parallel way of programming. [[Leela Chess Zero]] has demonstrated that [[Best-First|best-first]] [[Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology works well on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive and home computers used custom graphics chips that operated directly on registers/memory without a dedicated frame or texture buffer, like the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE], used in the PlayStation (1994), and Nvidia's 2D/3D combi chips like the [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are some main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training], using GPU resources to efficiently train networks<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] and [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL]].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]], specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group], is widely adopted across all kinds of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as OpenMP offload directives, and offers its own parallel compute platform, [https://rocmdocs.amd.com/en/latest/ ROCm].<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures, and offers the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as its frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran and OpenCL, as well as offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor from the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and to keep a multitude of SIMT waves resident on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled to a compute unit, and up to hundreds of compute units are present on a discrete GPU. Depending on the architecture, the SIMD units may have different numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. Note the difference between a vector processor with variable bit-width and SIMD units with fixed-bit-width cores. The architecture white papers of the various vendors leave room for speculation about the concrete underlying hardware implementation and its exact classification as a [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC (matrix-multiply-accumulate) units are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the wavefront (AMD) or warp (Nvidia) size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly parallel]. Single GPU threads (work-items in OpenCL) execute the kernel and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, subject to architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
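As an illustration of this model, the following plain-C sketch emulates an NDRange sequentially on the CPU; all names (run_ndrange, square_kernel) are hypothetical and not part of any GPGPU API:<br />

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "kernel": each work-item squares one element, identified
   by its global id, as an OpenCL kernel would via get_global_id(0). */
static void square_kernel(size_t global_id, const int *in, int *out) {
    out[global_id] = in[global_id] * in[global_id];
}

/* The host enqueues an NDRange of num_groups work-groups, each holding
   local_size work-items; here the grid is simply walked sequentially. */
static void run_ndrange(size_t num_groups, size_t local_size,
                        const int *in, int *out) {
    for (size_t group_id = 0; group_id < num_groups; group_id++)
        for (size_t local_id = 0; local_id < local_size; local_id++) {
            size_t global_id = group_id * local_size + local_id;
            square_kernel(global_id, in, out);
        }
}
```

On a real device the work-items of a group run concurrently on one compute unit rather than in a loop; only the indexing scheme carries over.<br />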
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
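A rough CPU analogue of these address spaces, as a hedged sketch (all names hypothetical): local variables stand in for __private registers, a per-group scratch array for __local memory, and the input/output arrays for __global buffers:<br />

```c
#include <stddef.h>

#define GROUP_SIZE 4  /* illustrative work-group size */

/* CPU sketch of a work-group sum reduction: "scratch" stands in for
   __local memory shared by the group, "in"/"out" for __global buffers,
   and "sum"/"l" for __private registers of a work-item. */
static void group_sum(const int *in, int *out, size_t group_id) {
    int scratch[GROUP_SIZE];                  /* __local analogue */
    for (size_t l = 0; l < GROUP_SIZE; l++)   /* each work-item loads one value */
        scratch[l] = in[group_id * GROUP_SIZE + l];
    /* on a real GPU, barrier(CLK_LOCAL_MEM_FENCE) would synchronize here */
    int sum = 0;                              /* __private analogue */
    for (size_t l = 0; l < GROUP_SIZE; l++)
        sum += scratch[l];
    out[group_id] = sum;                      /* __global analogue */
}
```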
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors, with different frameworks on different operating systems, may offer a unified address space accessible to both CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, 32-bit integer throughput can be lower than the 32-bit floating-point or 24-bit integer throughput.<br />
<br />
* INT64<br />
: In general, [https://en.wikipedia.org/wiki/Processor_register registers] and vector [https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer-brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, quadrupling the INT8 or octupling the INT4 throughput.<br />
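How such 64-bit emulation works can be sketched in C: a 64-bit addition on 32-bit registers becomes two 32-bit additions plus carry propagation (the function name is illustrative):<br />

```c
#include <stdint.h>

/* Illustrative sketch of 64-bit addition on a 32-bit ALU: add the low
   halves, derive the carry from unsigned wrap-around, and fold it into
   the high-half addition. */
static uint64_t add64_via32(uint32_t a_lo, uint32_t a_hi,
                            uint32_t b_lo, uint32_t b_hi) {
    uint32_t lo    = a_lo + b_lo;         /* low 32-bit add (may wrap) */
    uint32_t carry = (lo < a_lo);         /* wrap-around implies carry out */
    uint32_t hi    = a_hi + b_hi + carry; /* high 32-bit add plus carry */
    return ((uint64_t)hi << 32) | lo;
}
```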
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general have lower double-precision (64-bit) floating-point throughput relative to FP32 than server-brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Maximum theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Maximum theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
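The peak figures above are simple products of operations per clock, number of units and clock rate; a small illustrative helper (not from any SDK) reproduces both numbers:<br />

```c
/* Peak throughput in GigaOps/s: operations per clock cycle per unit,
   times number of units, times clock in MHz (1 op * 1 MHz = 1e6 ops/s,
   i.e. 1e-3 GigaOps/s). ops_per_cycle may be fractional, e.g. 1/4 for
   MAD on a GCN processing element. */
static double peak_gigaops(double ops_per_cycle, double units, double mhz) {
    return ops_per_cycle * units * mhz / 1000.0;
}
```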
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer-brand GPUs for neural-network-based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server-brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series, TensorCores were introduced. They offer FP16xFP16+FP32 matrix-multiply-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd-gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture in 2020, with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, also known as kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of about 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
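The effect of batching is plain arithmetic; with the 5 microsecond null-kernel figure cited above, an illustrative helper (not from any API) shows how the fixed launch cost is amortized:<br />

```c
/* Amortized launch overhead per task: one kernel launch of fixed cost
   is shared by all tasks coupled into the batch. */
static double overhead_per_task_us(double launch_overhead_us,
                                   int tasks_per_batch) {
    return launch_overhead_us / tasks_per_batch;
}
```

With 1000 tasks per batch, the per-task share of a 5 microsecond launch drops to a few nanoseconds, which is why batching makes GPU offloading viable despite the overhead.<br />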
<br />
=Deep Learning=<br />
GPUs are much better suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore largely responsible for the [[Deep Learning|deep learning]] boom. This also affected game-playing programs combining CNNs with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and by the open source projects [[Leela Zero]], headed by [[Gian-Carlo Pascutto]], for [[Go]], and its [[Leela Chess Zero]] adaptation.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both in multi-chip module designs. It features Matrix Cores with support for a broad range of precisions (INT8, FP8, BF16, FP16, TF32, FP32, FP64) as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
The CDNA2 architecture in the MI200 HPC-GPU, with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric, was unveiled in November 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
The CDNA architecture in the MI100 HPC-GPU with Matrix Cores was unveiled in November 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf AMD GCN3 Instruction Set Architecture]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Island cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found in various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support has been offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance gaming), Xe-HP (high-performance) and Xe-HPC (high-performance computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archiv.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since Series5 SGX, OpenCL support has been available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. Since the Adreno 300 series, OpenCL support is offered.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010 ...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler” by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26893GPU2024-01-06T20:23:24Z<p>Smatovic: /* Vega GCN 5th gen */ link update</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel programming model. [[Leela Chess Zero]] has shown that [[Best-First|best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) combined with [[Deep Learning|deep learning]] works well on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive, and home computers used custom graphics chips operating directly on registers/memory without a dedicated frame buffer or texture buffer, like the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. In the 1990s, 3D graphics and 3D modeling became more popular, especially in video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's combined 2D/3D chips like the [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training], using GPU resources to efficiently train networks<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX]. These were followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] and [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]], specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group], is widely adopted across all kinds of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as OpenMP offload directives. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures, and the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as its frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (Khronos Group's OpenGL successor)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit, to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled into a compute unit, and up to hundreds of compute units are present on a discrete GPU. The actual SIMD units may have architecture-dependent numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities: floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its concrete classification as a [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to speed up neural networks further.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the wavefront (AMD) or warp (Nvidia) size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload-directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item (thread).<br />
* __local - scratch-pad memory shared across work-items of a work-group (threads of a block).<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items (threads).<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G, Compute Capabilities</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between the CPU host and a discrete GPU device, but some combinations of architecture, vendor framework and operating system offer a unified address space accessible by both CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, 32-bit integer performance can be lower than 32-bit floating-point or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, quadrupling INT8 or octupling INT4 throughput.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general have lower double-precision (64-bit) floating-point throughput relative to FP32 than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
{| class="wikitable" style="margin:auto"<br />
|-<br />
! Operation !! Ops per clock cycle<br />
|-<br />
| MAD || 16<br />
|-<br />
| MUL || 16<br />
|-<br />
| ADD || 32<br />
|-<br />
| Bit-shift || 16<br />
|-<br />
| Bitwise XOR || 32<br />
|}<br />
<br />
Maximum theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
{| class="wikitable" style="margin:auto"<br />
|-<br />
! Operation !! Ops per clock cycle<br />
|-<br />
| MAD || 1/4<br />
|-<br />
| MUL || 1/4<br />
|-<br />
| ADD || 1<br />
|-<br />
| Bit-shift || 1<br />
|-<br />
| Bitwise XOR || 1<br />
|}<br />
<br />
Maximum theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series, TensorCores were introduced. They offer FP16xFP16+FP32 matrix-multiply-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd-gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Wikipedia - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores, which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, also known as kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much better suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom. This also affected game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]], headed by [[Gian-Carlo Pascutto]], for [[Go]] and its [[Leela Chess Zero]] adaptation.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs. The former is the most important by quantity, the latter by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumers, Radeon Pro for professionals and Radeon Instinct for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023 with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both with multi-chip module design. It features Matrix Cores supporting a broad range of precisions, such as INT8, FP8, BF16, FP16, TF32, FP32 and FP64, as well as sparse matrix data (sparsity). It is supported by AMD's ROCm open software stack for AMD Instinct accelerators. <br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
The CDNA2 architecture in the MI200 HPC-GPU, with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric, was unveiled in November 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
The CDNA architecture in the MI100 HPC-GPU with Matrix Cores was unveiled in November 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf AMD GCN3 Instruction Set Architecture]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Islands cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found in various systems on chips (SoCs) from different vendors. Since Midgard (2012), which introduced a unified shader model, OpenCL support is offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumers, Quadro for professionals and Tesla for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system-on-a-chip (SoC) designs. OpenCL support has been available via licensees since the Series5 SGX.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various versions as a component of their Snapdragon SoCs. OpenCL support has been offered since the Adreno 300 series.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010 ...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE#Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26892GPU2024-01-06T20:22:28Z<p>Smatovic: /* Polaris GCN 4th gen */ link update</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may have more raw computing power than general purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs] but need a specialized and parallelized way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|Best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology will work with GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive and Home Computers used custom graphics chips to operate directly on registers/memory without a dedicated frame buffer or texture buffer, like [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics-workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD-capabilities]] of CPUs such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]] for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training], using GPU resources to efficiently train networks<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook] and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]], specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group], is widely adopted across all kinds of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as OpenMP offload directives. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal] is recommended by [[Apple]].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures and the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled to a compute unit, with up to hundreds of compute units present on a discrete GPU. The actual SIMD units may have architecture-dependent numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer with specific bit-width of the FPU/ALU and registers. There is a difference between a vector-processor with variable bit-width and SIMD units with fixed bit-width cores. Different architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and the concrete classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to speed up neural networks further.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront (AMD) or Warp (Nvidia) size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or with libraries and offload-directives also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled to a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized and have access to the same scratch-pad memory, with an architecture limit on how many work-items a work-group can hold and how many threads can run in total concurrently on the device.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
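The index-space decomposition above can be emulated in plain C. The sketch below (hypothetical helper name, not an actual OpenCL API) mirrors how a 1D global id is derived from the work-group id and local id for a uniform NDRange, as OpenCL's get_global_id(0) would return it:<br />

```c
#include <assert.h>
#include <stddef.h>

/* 1D index space: global id = work-group id * work-group size + local id,
   mirroring OpenCL's get_global_id(0) for a uniform NDRange. */
static size_t global_id(size_t group_id, size_t local_size, size_t local_id)
{
    return group_id * local_size + local_id;
}
```

For example, work-item 5 of work-group 2 with a work-group size of 64 gets global id 133.<br />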
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
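As an illustration of how these address-space qualifiers appear in a kernel, the sketch below stubs them with macros so the kernel body compiles and runs as plain host C (the stubs, the simulated get_global_id() and the kernel name are illustrative assumptions; real OpenCL C is compiled by the OpenCL runtime instead):<br />

```c
#include <assert.h>
#include <stddef.h>

/* Stub the OpenCL C address-space qualifiers so this kernel sketch
   compiles as plain host C; in real OpenCL C they select the memory
   regions listed above. */
#define __kernel
#define __global
#define __constant const
#define __private

static size_t sim_id;  /* simulated work-item id, stands in for the runtime */
static size_t get_global_id(int dim) { (void)dim; return sim_id; }

/* Each work-item scales one element: __global buffers live in VRAM,
   __constant is read-only, __private ends up in registers. */
__kernel void scale(__global const float *in, __global float *out,
                    __constant float *factor)
{
    __private size_t i = get_global_id(0);
    out[i] = in[i] * factor[0];
}
```

Looping sim_id over 0..N-1 emulates a work-group executing the kernel once per work-item.<br />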
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: The 32-bit integer performance can, depending on architecture and operation, be lower than the 32-bit FLOP or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, quadrupling INT8 or octupling INT4 throughput relative to INT32.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs have in general a lower ratio (FP32:FP64) for double-precision (64-bit) floating-point operations throughput than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretic ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretic ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
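Both peak-throughput calculations above follow the same formula, sketched here in C (hypothetical helper name; arguments are ops per clock per unit, number of units, and clock in MHz, the result is in GigaOps/sec):<br />

```c
#include <assert.h>
#include <math.h>

/* Peak throughput = ops per clock per unit x number of units x clock (MHz),
   divided by 1000 to convert MegaOps/sec into GigaOps/sec. */
static double peak_gigaops(double ops_per_clock, double units, double mhz)
{
    return ops_per_clock * units * mhz / 1000.0;
}
```

peak_gigaops(32, 16, 1544) reproduces the 790.528 GigaOps/sec of the GTX 580, and peak_gigaops(1, 2048, 925) the 1894.4 GigaOps/sec of the HD 7970.<br />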
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
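The basic operation such units implement, D = A x B + C, can be written out as a scalar reference in plain C (an illustrative sketch, not the fused hardware path; the tile size is chosen arbitrarily):<br />

```c
#include <assert.h>

/* Matrix-multiply-accumulate on a TILExTILE tile: D = A x B + C.
   Tensor units fuse this per tile (e.g. 4x4 FP16 inputs with FP32
   accumulation) into a single hardware instruction. */
#define TILE 2
static void mmac(float A[TILE][TILE], float B[TILE][TILE],
                 float C[TILE][TILE], float D[TILE][TILE])
{
    for (int i = 0; i < TILE; i++)
        for (int j = 0; j < TILE; j++) {
            float acc = C[i][j];                 /* accumulate into C */
            for (int k = 0; k < TILE; k++)
                acc += A[i][k] * B[k][j];
            D[i][j] = acc;
        }
}
```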
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series TensorCores were introduced. They offer FP16xFP16+FP32 matrix-multiply-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8, INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Wikipedia - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16, FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, aka kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]], AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
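A simple cost model (a hypothetical sketch using the 5 microsecond figure from above) shows why batching amortizes the launch overhead:<br />

```c
#include <assert.h>
#include <math.h>

/* Average microseconds per task when `batch` tasks share one kernel
   launch: the fixed launch overhead is paid once per batch, the
   per-task work is paid each time. */
static double us_per_task(double launch_overhead_us, double task_us, int batch)
{
    return launch_overhead_us / batch + task_us;
}
```

With 5 microseconds of overhead and 0.1 microseconds of work per task, a single task costs 5.1 microseconds on average, while a batch of 1000 tasks brings the average down to 0.105 microseconds.<br />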
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server markets.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both with multi-chip-module design. It features Matrix Cores with support for a broad range of precisions (INT8, FP8, BF16, FP16, TF32, FP32, FP64) as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
The RDNA3 architecture in the Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
The CDNA2 architecture in the MI200 HPC-GPU, with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric, was unveiled in November 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
The CDNA architecture in the MI100 HPC-GPU with Matrix Cores was unveiled in November 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://www.amd.com/system/files/TechDocs/vega-7nm-shader-instruction-set-architecture.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf AMD GCN3 Instruction Set Architecture]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Islands cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support is offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server markets.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They were the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and the first consumer cards with TensorCores, used for the matrix multiplications that accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system-on-a-chip (SoC) designs. OpenCL support has been available via licensees since the Series5 SGX.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various configurations as a component of their Snapdragon SoCs. OpenCL support has been offered since the Adreno 300 series.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26891GPU2024-01-06T20:19:16Z<p>Smatovic: /* Southern Islands GCN 1st gen */ link update</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology works on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive and home computers used custom graphics chips operating directly on registers/memory without a dedicated frame or texture buffer, like [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. In the 1990s, 3D graphics and 3D modeling became more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training], using GPU resources to efficiently train networks<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] and [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]] specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group] is widely adopted across all kind of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as OpenMP offload directives. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures and the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor from the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit, to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled to a compute unit, with up to hundreds of compute units present on a discrete GPU. The actual SIMD units may have different numbers of cores depending on the architecture (SIMD8, SIMD16, SIMD32), and different computation abilities - floating-point and/or integer with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its concrete classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to speed up neural networks further.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront (AMD) resp. Warp (Nvidia) size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or with libraries and offload-directives also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled to a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with an architecture limit of how many work-items a work-group can hold and how many threads can run in total concurrently on the device.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
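As an illustration of these address spaces, a minimal OpenCL C kernel might look as follows. This is a hedged sketch; the kernel and argument names are made up, and the __local buffer is expected to be sized per work-group by the host via clSetKernelArg:<br />

```c
__kernel void sum(__global const int *in,     /* VRAM, visible to all   */
                  __global int *out,
                  __constant int *weights,    /* read-only memory       */
                  __local int *scratch)       /* shared per work-group  */
{
    int lid = get_local_id(0);                /* __private by default   */
    scratch[lid] = in[get_global_id(0)] * weights[0];
    barrier(CLK_LOCAL_MEM_FENCE);             /* sync the work-group    */
    if (lid == 0) {
        int acc = 0;                          /* private accumulator    */
        for (int i = 0; i < (int)get_local_size(0); i++)
            acc += scratch[i];
        out[get_group_id(0)] = acc;
    }
}
```

The private accumulator lives in registers, scratch is shared within the work-group and synchronized with barrier(), while in and out reside in global memory (VRAM).<br />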
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, 32-bit integer performance can be lower than 32-bit floating-point or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, e.g. quadruple INT8 or octuple INT4 throughput compared to 32-bit operations.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is measured usually in single-precision (32-bit) floating-point FMA (fused-multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general have a lower double-precision (64-bit) floating-point throughput, relative to FP32, than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural-network-based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix multiplications via Winograd transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series, TensorCores were introduced. They offer FP16xFP16+FP32 matrix-multiply-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8, INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, also known as kernel-launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs. The first is the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both with multi-chip module design. It features Matrix Cores with support for a broad range of precisions, such as INT8, FP8, BF16, FP16, TF32, FP32 and FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators. <br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
The RDNA3 architecture in the Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://www.amd.com/system/files/TechDocs/vega-7nm-shader-instruction-set-architecture.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Islands cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://developer.amd.com/wordpress/media/2012/12/AMD_Southern_Islands_Instruction_Set_Architecture.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found in various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support has been offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low power), Xe-HPG (high-performance gaming), Xe-HP (high performance) and Xe-HPC (high-performance computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded GeForce for consumers, Quadro for professionals and Tesla for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They were the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and the first consumer cards with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips offers neither RTX features nor TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since the Series5 SGX, OpenCL support has been available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various versions as a component of its Snapdragon SoCs. Since the Adreno 300 series, OpenCL support has been offered.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26890GPU2024-01-05T11:05:56Z<p>Smatovic: /* Intel XMX Cores */ added Max</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may have more raw computing power than general purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs] but need a specialized and parallelized way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|Best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology will work with GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive and home computers used custom graphics chips to operate directly on registers/memory without a dedicated frame buffer or texture buffer, like [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]] for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are some main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training], using GPU resources to efficiently train networks<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by early GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook] and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]] specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group] is widely adopted across all kind of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends such as OpenCL, HIP and C++ AMP, as well as OpenMP offload directives, and offers with [https://rocmdocs.amd.com/en/latest/ ROCm] its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures and the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor of Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit, to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled to a compute unit, with up to hundreds of compute units present on a discrete GPU. The actual SIMD units may have architecture-dependent numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Different architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its concrete classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to speed up neural networks further.<br />
<br />
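The SIMT execution scheme described above can be illustrated with a small, hedged sketch (plain Python standing in for GPU hardware; all names are illustrative): every lane of a warp shares one instruction stream, so a branch is resolved by executing both paths, each under the mask of lanes that actually took it.<br />
<br />
```python
WARP_SIZE = 32

def simt_branch(data):
    """Toy model of SIMT branch divergence: both branch paths execute,
    each pass touching only the lanes whose mask bit is set."""
    result = [None] * len(data)
    mask = [x % 2 == 0 for x in data]          # lanes taking the "then" path
    for lane, active in enumerate(mask):       # pass 1: "then" path
        if active:
            result[lane] = data[lane] * 2
    for lane, active in enumerate(mask):       # pass 2: "else" path
        if not active:
            result[lane] = data[lane] + 1
    return result

print(simt_branch(list(range(WARP_SIZE)))[:6])  # [0, 2, 4, 4, 8, 6]
```
<br />
This also shows why divergent branches cost throughput: both passes run over the whole warp, so fully divergent code takes roughly the sum of both path lengths.<br />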
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront and Warp size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled to a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
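The work-item/work-group/NDRange mapping described above can be sketched with a minimal sequential CPU emulation (a hedged illustration only; on a real device the work-items run concurrently, and all names here are made up for the example):<br />
<br />
```python
def launch_1d(kernel, global_size, local_size, *args):
    """Emulate a 1-D NDRange launch sequentially: each work-group runs
    local_size work-items, and each work-item sees its own global id."""
    assert global_size % local_size == 0
    for group_id in range(global_size // local_size):
        for local_id in range(local_size):
            gid = group_id * local_size + local_id  # OpenCL get_global_id(0)
            kernel(gid, *args)

# A vector-add "kernel": one work-item handles one element.
def vec_add(gid, a, b, c):
    c[gid] = a[gid] + b[gid]

a = list(range(8)); b = [10] * 8; c = [0] * 8
launch_1d(vec_add, 8, 4, a, b, c)   # NDRange of 8 work-items in 2 work-groups
print(c)                            # [10, 11, 12, 13, 14, 15, 16, 17]
```
<br />
The index computation group_id * local_size + local_id is exactly what get_global_id(0) in OpenCL or blockIdx.x * blockDim.x + threadIdx.x in CUDA yields.<br />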
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
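How these address spaces cooperate can be sketched with a hedged CPU emulation of a per-work-group reduction (illustrative names; a real kernel would also need a work-group barrier between writing and reading the local scratch-pad):<br />
<br />
```python
def group_sum(global_in, local_size):
    """Emulate a per-work-group reduction: loop indices play the role of
    __private variables, local_mem of __local scratch-pad memory shared
    inside one group, global_in/global_out of __global buffers."""
    n_groups = len(global_in) // local_size
    global_out = [0] * n_groups                      # __global result buffer
    for group_id in range(n_groups):
        local_mem = [0] * local_size                 # __local, one per group
        for local_id in range(local_size):           # the group's work-items
            gid = group_id * local_size + local_id   # __private index
            local_mem[local_id] = global_in[gid]
        # (barrier(CLK_LOCAL_MEM_FENCE) would go here on a real device)
        global_out[group_id] = sum(local_mem)
    return global_out

print(group_sum(list(range(8)), 4))  # [6, 22]
```
<br />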
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, 32-bit integer performance can be lower than 32-bit floating-point or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput at lower precision, e.g. quadrupled INT8 or octupled INT4 throughput.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused-multiply-add) throughput.<br />
<br />
* FP64<br />
: In general, consumer GPUs have a lower double-precision (64-bit) floating-point throughput ratio (FP32:FP64) than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Maximum theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Maximum theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
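Both peak figures above follow from the same formula, ops/clock per unit x number of units x clock frequency; a minimal sketch (illustrative function name) reproduces them:<br />
<br />
```python
def peak_gigaops(ops_per_clock, units, clock_mhz):
    """Peak throughput: ops/clock per unit x units x clock (MHz -> GigaOps/sec)."""
    return ops_per_clock * units * clock_mhz / 1000.0

# GTX 580: 32 ADD ops/clock per SM x 16 SMs @ 1544 MHz
print(peak_gigaops(32, 16, 1544))   # 790.528
# HD 7970: 1 ADD op/clock per PE x 2048 PEs @ 925 MHz
print(peak_gigaops(1, 2048, 925))   # 1894.4
```
<br />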
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
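The operation such a unit performs is D = A x B + C on small fixed-size tiles; a hedged scalar sketch on square tiles (hardware does an entire tile per instruction, often at mixed precision) looks as follows:<br />
<br />
```python
def mmac(a, b, c):
    """Matrix-multiply-accumulate D = A x B + C, the core operation of
    tensor/matrix units, computed here naively element by element."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) + c[i][j]
             for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 0], [0, 1]]
print(mmac(A, B, C))  # [[20, 22], [43, 51]]
```
<br />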
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series TensorCores were introduced. They offer FP16xFP16+FP32 matrix-multiplication-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8, INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020, AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist] and [https://www.intel.com/content/www/us/en/products/sku/232876/intel-data-center-gpu-max-1100/specifications.html Intel Data Center GPU Max Series].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, a.k.a. kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels, ranging from 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to hundreds of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
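Why batching helps can be seen with a hedged back-of-the-envelope model (assuming the ~5 microsecond figure quoted above and ignoring transfer costs and parallel speedup; names are illustrative): the fixed launch overhead is shared by every task in the batch.<br />
<br />
```python
def us_per_task(launch_overhead_us, task_us, batch_size):
    """Effective cost per task when batch_size tasks share one kernel launch."""
    return launch_overhead_us / batch_size + task_us

# With a 5 us launch overhead and 1 us of work per task:
print(us_per_task(5.0, 1.0, 1))     # 6.0  -> overhead dominates
print(us_per_task(5.0, 1.0, 100))   # 1.05 -> overhead amortized
```
<br />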
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads, each brand offering different feature sets in drivers, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server use.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023 with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both using a multi-chip module design. It features Matrix Cores supporting a broad range of precisions, such as INT8, FP8, BF16, FP16, TF32, FP32 and FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators. <br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
The CDNA2 architecture in the MI200 HPC GPU, with optimized FP64 throughput (matrix and vector), a multi-chip-module design and Infinity Fabric, was unveiled in November 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://www.amd.com/system/files/TechDocs/vega-7nm-shader-instruction-set-architecture.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Islands cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://pdfslide.net/documents/reference-guide-amd-revision-11-southern-islands-series-instruction-set-architecture.html?page=1 Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found in various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support has been offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumers, Quadro for professionals and Tesla for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring a Transformer Engine for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They were the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing]. They were also the first consumer cards to launch with Tensor Cores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or Tensor Cores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with Tensor Cores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
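The tile operation a Tensor Core performs can be sketched in plain C (a scalar model for illustration only; real Tensor Cores operate on FP16 inputs with wider accumulators and are driven via warp-level intrinsics): each core computes a fused matrix multiply-accumulate, D = A × B + C, on a small 4x4 tile.<br />
```c
#include <assert.h>

/* Scalar sketch of the matrix multiply-accumulate a Tensor Core
   performs in hardware: D = A * B + C on a 4x4 tile.
   Illustration only, not actual GPU code. */
enum { TILE = 4 };

static void mma_tile(const float A[TILE][TILE], const float B[TILE][TILE],
                     const float C[TILE][TILE], float D[TILE][TILE])
{
    for (int i = 0; i < TILE; i++)
        for (int j = 0; j < TILE; j++) {
            float acc = C[i][j];          /* start from the accumulator C */
            for (int k = 0; k < TILE; k++)
                acc += A[i][k] * B[k][j]; /* dot product of row i and column j */
            D[i][j] = acc;
        }
}
```
A convolution can be lowered to many such tile products, which is why this single hardware primitive accelerates CNN inference and training so effectively.<br />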
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since Series5 SGX, OpenCL support has been available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various versions as a component of their Snapdragon SoCs. Since the Adreno 300 series, OpenCL support has been offered.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26889GPU2024-01-05T10:55:10Z<p>Smatovic: /* GPU in Computer Chess */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may have more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs] but need a specialized and parallelized way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|Best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology will work with GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive and home computers used custom graphics chips to operate directly on registers/memory without a dedicated frame buffer or texture buffer, like [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]] for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are some main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
* Neural network training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> or [https://github.com/LeelaChessZero/lczero-training Lc0 TensorFlow Training], using GPU resources to efficiently train networks<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]], specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group], is widely adopted across all kinds of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends such as OpenCL, HIP, and C++ AMP, as well as OpenMP offload directives. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal] is recommended by [[Apple]].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures, and offers the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (the Khronos Group's OpenGL successor)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled to a compute unit, and up to hundreds of compute units are present on a discrete GPU. Depending on the architecture, the actual SIMD units may have different numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC (matrix-multiply-accumulate) units are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the wavefront or warp size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or with libraries and offload-directives also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 8<br />
* Maximum number of resident warps per multiprocessor: 48<br />
* Maximum number of resident threads per multiprocessor: 1536<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item (thread).<br />
* __local - scratch-pad memory shared across the work-items of a work-group (the threads of a block).<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items (threads).<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but depending on architecture, vendor, framework, and operating system, a unified address space accessible to both CPU and GPU may be offered.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: The 32-bit integer performance can, depending on architecture and operation, be lower than the 32-bit floating-point or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general, [https://en.wikipedia.org/wiki/Processor_register registers] and vector [https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, quadrupling the INT8 or octupling the INT4 throughput.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput.<br />
<br />
* FP64<br />
: In general, consumer GPUs offer a lower double-precision (64-bit) floating-point throughput relative to FP32 than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Maximum theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Maximum theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural-network-based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix multiplications via Winograd transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series, TensorCores were introduced. They offer FP16xFP16+FP32 matrix-multiply-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8, INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020, AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores, which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, also known as kernel-launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels, ranging from 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much better suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore a driving force behind the [[Deep Learning|deep learning]] boom. This boom also affected game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and by the open source project [[Leela Zero]], headed by [[Gian-Carlo Pascutto]], for [[Go]], with its [[Leela Chess Zero]] adaptation.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads, each brand offering different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server use.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both multi-chip module designs. It features Matrix Cores supporting a broad range of precisions, such as INT8, FP8, BF16, FP16, TF32, FP32 and FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
The CDNA2 architecture in the MI200 HPC-GPU, with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric, was unveiled in November 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
The CDNA architecture in the MI100 HPC-GPU with Matrix Cores was unveiled in November 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://www.amd.com/system/files/TechDocs/vega-7nm-shader-instruction-set-architecture.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Islands cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://pdfslide.net/documents/reference-guide-amd-revision-11-southern-islands-series-instruction-set-architecture.html?page=1 Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support has been offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server use.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing] features, and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since the Series5 SGX, OpenCL support has been available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. Since the Adreno 300 series, OpenCL support has been offered.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>
Smatovic
https://www.chessprogramming.org/index.php?title=GPU&diff=26888 GPU 2024-01-05T01:33:59Z
<p>Smatovic: /* AMD Matrix Cores */ added CDNA3</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may have more raw computing power than general purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs] but need a specialized and parallelized way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|Best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology will work with GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive and home computers used custom graphics chips to operate directly on registers/memory without a dedicated frame buffer or texture buffer, like [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]] for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are some main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* NNUE training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> using GPU resources to efficiently train networks <br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by early GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]] specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group] is widely adopted across all kind of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as OpenMP offload directives, and offers its own parallel compute platform, [https://rocmdocs.amd.com/en/latest/ ROCm].<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures, and offers the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, with a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, and multiple SIMD units are coupled into a compute unit, with up to hundreds of compute units present on a discrete GPU. Depending on the architecture, the SIMD units may have different numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. Note the difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and the concrete classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN)]<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront (AMD) and Warp (Nvidia) sizes denote the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or with libraries and offload-directives also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) execute the kernel and are grouped into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item (thread).<br />
* __local - scratch-pad memory shared across the work-items of a work-group (threads of a block).<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items (threads).<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi)] <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but some combinations of architecture, framework and operating system offer a unified address space accessible by both CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: The 32-bit integer performance can, depending on architecture and operation, be lower than the 32-bit floating-point or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput at lower precision, e.g. quadrupled INT8 or octupled INT4 throughput compared to 32-bit operations.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused-multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs generally have a much higher FP32:FP64 throughput ratio, i.e. relatively lower double-precision (64-bit) floating-point throughput, than server-brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Maximum theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Maximum theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
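Both peak figures above follow the same arithmetic, which can be sketched as follows (formula only; the device numbers come from the vendor documents cited above):<br />

```c
/* Peak throughput = ops per clock per unit x number of units x clock (MHz),
   divided by 1000 to convert MegaOps/s into GigaOps/s. */
double peak_gigaops(double ops_per_clock, double units, double mhz)
{
    return ops_per_clock * units * mhz / 1000.0;
}
```

peak_gigaops(32, 16, 1544) reproduces the 790.528 GigaOps/sec figure for the GTX 580, and peak_gigaops(1, 2048, 925) the 1894.4 GigaOps/sec for the HD 7970.<br />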
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer-brand GPUs for neural-network-based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server-brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix multiplications via Winograd transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as their MMAC unit.<br />
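The core operation such MMAC units implement is D = A&times;B + C on small matrix tiles. A scalar plain-C sketch of a 4x4 tile (for clarity only; the hardware performs the whole tile in a single instruction, typically with mixed precisions):<br />

```c
/* Matrix-multiply-accumulate D = A*B + C on a 4x4 tile,
   the operation a tensor/MMAC unit performs in hardware. */
enum { TILE = 4 };

void mmac_tile(float A[TILE][TILE], float B[TILE][TILE],
               float C[TILE][TILE], float D[TILE][TILE])
{
    for (int i = 0; i < TILE; i++)
        for (int j = 0; j < TILE; j++) {
            float acc = C[i][j];            /* accumulator input */
            for (int k = 0; k < TILE; k++)
                acc += A[i][k] * B[k][j];   /* multiply-accumulate */
            D[i][j] = acc;
        }
}
```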
<br />
==Nvidia TensorCores==<br />
: TensorCores were introduced with the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series. They are FP16xFP16+FP32 matrix-multiply-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds optimized FP64 throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration. AMD's CDNA 3 architecture adds support for FP8 and sparse matrix data (sparsity).<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, a.k.a. kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice even null-kernels show a measurable latency, from 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to hundreds of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One way to overcome this limitation is to couple tasks into batches that are executed in one kernel launch <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
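A back-of-the-envelope model of this batching idea (illustrative only; the microsecond figures are the forum measurements cited above, not vendor specifications):<br />

```c
/* Effective per-task latency when batch_size tasks share one kernel launch:
   the fixed launch overhead is amortized across the batch. */
double per_task_latency_us(double launch_overhead_us,
                           double task_time_us, int batch_size)
{
    return launch_overhead_us / batch_size + task_time_us;
}
```

With a 5 microsecond launch overhead, batching 100 tasks into one launch cuts the per-task launch cost from 5 to 0.05 microseconds.<br />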
<br />
=Deep Learning=<br />
GPUs are much better suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore a driving force behind the [[Deep Learning|deep learning]] boom. This also affected game playing programs combining CNNs with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and by the open source projects [[Leela Zero]], headed by [[Gian-Carlo Pascutto]], for [[Go]] and its [[Leela Chess Zero]] adaptation.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the former the most important by quantity, the latter by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads, each brand offering different feature sets in drivers, VRAM, and computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumers, Radeon Pro for professionals and Radeon Instinct for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023 with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both with a multi-chip module design. It features Matrix Cores with support for a broad range of precisions (INT8, FP8, BF16, FP16, TF32, FP32, FP64) as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://www.amd.com/system/files/TechDocs/vega-7nm-shader-instruction-set-architecture.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Islands cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://pdfslide.net/documents/reference-guide-amd-revision-11-southern-islands-series-instruction-set-architecture.html?page=1 Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. OpenCL support has been offered since Midgard (2012), which introduced a unified shader model.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
[https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumers, Quadro for professionals and Tesla for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023; it combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. OpenCL support via licensees has been available since the Series5 SGX.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. OpenCL support has been offered since the Adreno 300 series.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: "Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26887GPU2024-01-05T00:59:18Z<p>Smatovic: /* Nvidia */ added Grace Hopper Superchip</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may have more raw computing power than general purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|Best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology will work with GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s, RAM was expensive, and home computers used custom graphics chips that operated directly on registers/memory without a dedicated frame buffer or texture buffer, like the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math emerged, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] chip used in the PlayStation (1994) and Nvidia's combined 2D/3D chips like the [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], as in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007), or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are some main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* NNUE training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> using GPU resources to efficiently train networks <br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] and [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]] specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group] is widely adopted across all kind of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP, and C++ AMP, as well as OpenMP offload directives. It offers its own parallel compute platform, [https://rocmdocs.amd.com/en/latest/ ROCm].<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures, and offers the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as its frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, and multiple SIMD units are coupled to a compute unit, with up to hundreds of compute units present on a discrete GPU. The actual SIMD units may have architecture-dependent numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities: floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its classification as a [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC (matrix-multiply-accumulate) units are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront (AMD) or Warp (Nvidia) size is the number of threads executed together in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly parallel]. A single GPU thread (work-item in OpenCL) contains the kernel to be computed and is coupled to a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, subject to architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 8<br />
* Maximum number of resident warps per multiprocessor: 48<br />
* Maximum number of resident threads per multiprocessor: 1536<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item (thread).<br />
* __local - scratch-pad memory shared across the work-items of a work-group (threads of a block).<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items (threads).<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
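The address-space qualifiers above appear directly in OpenCL C kernel source. A minimal, illustrative kernel sketch (device code only, names are hypothetical):<br />

```c
// OpenCL C kernel sketch: each work-item squares one element,
// staging its result in __local scratch-pad memory.
__kernel void square(__global const float *in,  // VRAM, visible to all work-items
                     __global float *out,
                     __local float *tmp)        // shared within one work-group
{
    int gid = get_global_id(0);    // index within the NDRange (CUDA: grid)
    int lid = get_local_id(0);     // index within the work-group (CUDA: block)
    float x = in[gid];             // automatic variables are __private (registers)
    tmp[lid] = x * x;
    barrier(CLK_LOCAL_MEM_FENCE);  // synchronize the work-group
    out[gid] = tmp[lid];
}
```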
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device; however, depending on architecture, vendor, framework and operating system, a unified address space accessible by both CPU and GPU may be offered.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on the architecture and the operation, 32-bit integer throughput can be lower than 32-bit floating-point or 24-bit integer throughput.<br />
<br />
* INT64<br />
: In general, the [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer-brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
<br />
* INT8<br />
: Some architectures offer higher throughput at lower precision, quadrupling INT8 or octupling INT4 throughput.<br />
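Because consumer Vector-ALUs are 32-bit wide, a 64-bit integer addition is typically emulated as two 32-bit additions with a carry, roughly as in this C sketch (host-side illustration of the idea, not actual GPU ISA):<br />

```c
#include <stdint.h>

/* Emulate a 64-bit add from 32-bit halves, the way a 32-bit ALU would:
   the low words add first, and their carry-out feeds the high-word add. */
uint64_t add64_via_32(uint32_t a_lo, uint32_t a_hi, uint32_t b_lo, uint32_t b_hi)
{
    uint32_t lo    = a_lo + b_lo;
    uint32_t carry = (lo < a_lo);      /* unsigned wrap-around => carry out */
    uint32_t hi    = a_hi + b_hi + carry;
    return ((uint64_t)hi << 32) | lo;
}
```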
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general have lower double-precision (64-bit) floating-point throughput relative to FP32 than server-brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Maximum theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Maximum theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
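The peak figures above follow from operations per clock cycle per unit, the number of units, and the clock rate. A small C sketch of the arithmetic (the helper name is illustrative):<br />

```c
#include <stdint.h>

/* Peak operations per second =
   ops per clock per unit  *  number of units  *  clock rate in Hz. */
uint64_t peak_ops_per_sec(uint64_t ops_per_clock, uint64_t units, uint64_t clock_mhz)
{
    return ops_per_clock * units * clock_mhz * 1000000ULL;
}
```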
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer-brand GPUs for neural-network-based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server-brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix multiplications via Winograd transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine with MMAC units.<br />
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series, TensorCores were introduced. They offer FP16xFP16+FP32 matrix-multiply-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd-gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Wikipedia - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020, AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types such as INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds optimized FP64 throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration.<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, also known as kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels, ranging from about 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to hundreds of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One way to mitigate this limitation is to group tasks into batches that are executed in a single kernel launch <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
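Batching amortizes the fixed launch overhead over many tasks, so the per-task share of the overhead drops linearly with batch size. A small illustrative C calculation (numbers and function name are for illustration only):<br />

```c
/* Per-task share of a fixed kernel-launch overhead when batch_size
   tasks are grouped into a single launch. */
double amortized_overhead_us(double launch_overhead_us, int batch_size)
{
    return launch_overhead_us / batch_size;
}
```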
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in drivers, VRAM, and computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumers, Radeon Pro for professionals and Radeon Instinct for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both using a multi-chip-module design. It features Matrix Cores with support for a broad range of precisions, such as INT8, FP8, BF16, FP16, TF32, FP32 and FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://www.amd.com/system/files/TechDocs/vega-7nm-shader-instruction-set-architecture.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Island cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://pdfslide.net/documents/reference-guide-amd-revision-11-southern-islands-series-instruction-set-architecture.html?page=1 Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. OpenCL support has been offered since Midgard (2012), which introduced a unified shader model.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
[https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumers, Quadro for professionals and Tesla for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Grace Hopper Superchip ===<br />
The Nvidia GH200 Grace Hopper Superchip was unveiled in August 2023 and combines the Nvidia Grace CPU ([[ARM|ARM v9]]) and Nvidia Hopper GPU architectures via NVLink to deliver a CPU+GPU coherent memory model for accelerated AI and HPC applications.<br />
<br />
* [https://resources.nvidia.com/en-us-grace-cpu/grace-hopper-superchip NVIDIA Grace Hopper Superchip Data Sheet]<br />
* [https://resources.nvidia.com/en-us-grace-cpu/nvidia-grace-hopper NVIDIA Grace Hopper Superchip Architecture Whitepaper]<br />
<br />
=== Ada Lovelace Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing] features, and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archiv.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for use in system on a chip (SoC) designs. OpenCL support via licensees has been available since Series5 SGX.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. OpenCL support has been offered since the Adreno 300 series.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE#Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26886GPU2024-01-05T00:19:52Z<p>Smatovic: /* Southern Islands GCN 1st gen */ link update</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel way of programming. [[Leela Chess Zero]] has proven that a [[Best-First|best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology works on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive and home computers used custom graphics chips to operate directly on registers/memory without a dedicated frame buffer or texture buffer, like [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]] for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* NNUE training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> using GPU resources to efficiently train networks <br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]], specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group], is widely adopted across all kinds of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as OpenMP offload directives. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures, and offers the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit, to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, and multiple SIMD units are coupled to a compute unit; up to hundreds of compute units may be present on a discrete GPU. Depending on the architecture, the SIMD units may have different numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities: floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its concrete classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN)]<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront (AMD) or Warp (Nvidia) size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload-directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled to a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
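As a minimal sketch (hypothetical C names, not an official API), the global index of a work-item resp. thread follows from its work-group resp. block id and its local id, mirroring OpenCL's get_global_id() and CUDA's blockIdx.x * blockDim.x + threadIdx.x:<br />

```c
/* Sketch: global work-item index from group id, group size and local id,
 * as computed implicitly by the OpenCL/CUDA runtimes. */
static int global_id(int group_id, int local_size, int local_id)
{
    return group_id * local_size + local_id;
}
```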
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
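The effective number of resident warps resp. wavefronts per multiprocessor follows from the tightest of the limits listed above; a hedged C sketch (names are hypothetical):<br />

```c
static int min2(int a, int b) { return a < b ? a : b; }

/* Resident warps/wavefronts per multiprocessor: capped by the thread
 * limit, the warp limit and the block limit, whichever is tightest. */
static int resident_warps(int warp_size, int max_threads, int max_warps,
                          int max_blocks, int threads_per_block)
{
    int warps_per_block = threads_per_block / warp_size;
    int by_threads = max_threads / warp_size;
    int by_blocks  = max_blocks * warps_per_block;
    return min2(min2(by_threads, max_warps), by_blocks);
}
```

For the GTX 580 figures above, blocks of 256 threads reach the full 64 resident warps, while blocks of 32 threads are capped by the 32-block limit.<br />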
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item (thread).<br />
* __local - scratch-pad memory shared across the work-items of a work-group (threads of a block).<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items (threads).<br />
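The qualifiers can be illustrated with a kernel-like function in C; note this is only a host-side illustration (the address-space qualifiers are defined away as empty macros so it compiles as plain C, whereas on a device they map to registers, scratch-pad memory and VRAM):<br />

```c
/* Illustration only: OpenCL address-space qualifiers mapped to empty
 * macros so this kernel-like function compiles as plain host C. */
#define __kernel
#define __global
#define __private

__kernel void scale(__global float *data, float factor, int gid)
{
    __private float x = data[gid]; /* would live in a register */
    data[gid] = x * factor;        /* written back to global memory */
}
```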
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, the 32-bit integer performance can be lower than the 32-bit FLOP or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, quadrupling the INT8 or octupling the INT4 throughput compared to INT32.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused-multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general offer lower double-precision (64-bit) floating-point operation throughput relative to FP32 than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
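The two throughput calculations above follow the same formula; a small C helper (hypothetical name) makes it explicit, returning MegaOps/sec:<br />

```c
/* Max theoretical throughput in MegaOps/sec: operations per clock per
 * unit x number of units x clock in MHz. */
static long long mega_ops(int ops_per_clock, int units, int mhz)
{
    return (long long)ops_per_clock * units * mhz;
}
```

mega_ops(32, 16, 1544) yields 790528 MegaOps/sec for the GTX 580, mega_ops(1, 2048, 925) yields 1894400 MegaOps/sec for the HD 7970.<br />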
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
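The primitive such a unit computes is D = A x B + C on small fixed-size tiles; a scalar C sketch for a 2x2 tile (real tensor units operate on larger tiles, e.g. 4x4 or 16x16, in hardware):<br />

```c
#define TILE 2

/* Scalar model of a matrix-multiply-accumulate: D = A * B + C. */
static void mmac(const float A[TILE][TILE], const float B[TILE][TILE],
                 const float C[TILE][TILE], float D[TILE][TILE])
{
    for (int i = 0; i < TILE; i++)
        for (int j = 0; j < TILE; j++) {
            float acc = C[i][j];             /* accumulator input */
            for (int k = 0; k < TILE; k++)
                acc += A[i][k] * B[k][j];    /* multiply-accumulate */
            D[i][j] = acc;
        }
}
```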
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series, TensorCores were introduced. They offer FP16xFP16+FP32 matrix-multiplication-accumulate units, used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Wikipedia - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores, which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration.<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, aka kernel-launch-overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to hundreds of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
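The benefit of batching can be quantified with a trivial C sketch (hypothetical name): the fixed launch overhead is amortized over all tasks in the batch.<br />

```c
/* Amortized launch overhead per task when tasks are coupled into one
 * batched kernel invocation; times in nanoseconds. */
static long long overhead_per_task_ns(long long launch_overhead_ns,
                                      int batch_size)
{
    return launch_overhead_ns / batch_size;
}
```

A 5 microsecond launch overhead amortized over a batch of 100 tasks costs only 50 nanoseconds per task.<br />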
<br />
=Deep Learning=<br />
GPUs are much better suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom. This also affected game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]], headed by [[Gian-Carlo Pascutto]], for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads, each brand offering different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server use.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both with multi-chip module design. It features Matrix Cores with support for a broad range of precisions, such as INT8, FP8, BF16, FP16, TF32, FP32 and FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://www.amd.com/system/files/TechDocs/vega-7nm-shader-instruction-set-architecture.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Island cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/programmer-references/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://pdfslide.net/documents/reference-guide-amd-revision-11-southern-islands-series-instruction-set-architecture.html?page=1 Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. OpenCL support has been offered since Midgard (2012), which introduced the unified shader model.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server use.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Ada Lovelace Architecture ===<br />
<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archiv.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. OpenCL support via licensees has been available since the Series5 SGX.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. OpenCL support has been offered since the Adreno 300 series.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel style of programming. [[Leela Chess Zero]] has proven that a [[Best-First|Best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology will work with GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s RAM was expensive and home computers used custom graphics chips that operated directly on registers/memory without a dedicated frame buffer or texture buffer, like the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math emerged, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]] for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] chip used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like the [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* NNUE training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> using GPU resources to efficiently train networks <br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
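Common to the accelerator and training approaches is batching: every host-device transfer and kernel launch carries a fixed latency, so positions are queued on the host and evaluated many at a time with a single device call. A minimal host-side sketch in C of such a batching scheme (the <code>Position</code>/<code>Batcher</code> types and the stubbed <code>evaluate_batch_on_gpu</code> call are illustrative assumptions, not an actual engine's API):<br />

```c
#include <stddef.h>

#define BATCH_SIZE 256

/* Placeholder for an encoded position (e.g. input planes for a network). */
typedef struct { int planes[64]; } Position;

/* Hypothetical device call: a real engine would launch one kernel or one
   network inference for the whole batch here.  Stubbed out on the CPU. */
static void evaluate_batch_on_gpu(const Position *batch, float *eval, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        (void)batch[i];
        eval[i] = 0.0f;
    }
}

typedef struct {
    Position batch[BATCH_SIZE];
    float    eval[BATCH_SIZE];
    size_t   count;
} Batcher;

/* Queue one position; when the batch is full, evaluate it with a single
   device call, amortizing the launch latency over BATCH_SIZE positions. */
static void enqueue(Batcher *b, const Position *p)
{
    b->batch[b->count++] = *p;
    if (b->count == BATCH_SIZE) {
        evaluate_batch_on_gpu(b->batch, b->eval, b->count);
        b->count = 0;
    }
}
```

The batch size is a tuning parameter: larger batches amortize latency better but delay search results, which is exactly the host-device latency trade-off discussed in the forum posts below.<br />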
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by early GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] and [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally by [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]] specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group] is widely adopted across all kind of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP, and C++ AMP, as well as OpenMP offload directives. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal] is recommended by [[Apple]].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures and the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit, to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, and multiple SIMD units are coupled to a compute unit, with up to hundreds of compute units present on a discrete GPU. Depending on the architecture, the SIMD units may have different numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. The architecture white papers of the different vendors leave room for speculation about the concrete underlying hardware implementation and its classification in [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy Flynn's taxonomy]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to speed up neural networks further.<br />
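A consequence of SIMT execution, and the main performance hazard for branchy game-tree code, is divergence: when the lanes of one wave take different branches, the hardware serializes both paths under an execution mask. A small CPU simulation in C (the warp size and function name are illustrative assumptions) makes the cost visible by counting issued instruction slots:<br />

```c
#define WARP_SIZE 32

/* Simulate one warp executing: if (lane < k) x = 1; else x = 2;
   Returns the number of instruction slots issued.  Under divergence the
   warp issues both sides; lanes outside the active mask sit idle. */
static int simulate_branch(int k, int x[WARP_SIZE])
{
    unsigned long long mask = 0;   /* execution mask: bit set = lane active */
    int slots = 0;

    for (int lane = 0; lane < WARP_SIZE; lane++)
        if (lane < k)
            mask |= 1ULL << lane;

    if (mask != 0) {                           /* then-side issued once */
        slots++;
        for (int lane = 0; lane < WARP_SIZE; lane++)
            if (mask & (1ULL << lane)) x[lane] = 1;
    }
    if (mask != (1ULL << WARP_SIZE) - 1) {     /* else-side issued once */
        slots++;
        for (int lane = 0; lane < WARP_SIZE; lane++)
            if (!(mask & (1ULL << lane))) x[lane] = 2;
    }
    return slots;
}
```

A uniform branch (all lanes agree) costs one slot, a divergent one costs two, so heavily branching kernels can run at a fraction of peak throughput.<br />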
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront (AMD) resp. Warp (Nvidia) size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload-directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors, with different frameworks on different operating systems, may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, the 32-bit integer performance can be lower than the 32-bit FLOP or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general, [https://en.wikipedia.org/wiki/Processor_register registers] and vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer-brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, quadrupling the INT8 resp. octupling the INT4 throughput compared to INT32.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused-multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general have a lower double-precision (64-bit) floating-point throughput relative to FP32 (a higher FP32:FP64 ratio) than server-brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretic ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretic ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: With the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series, TensorCores were introduced: FP16xFP16+FP32 matrix-multiplication-accumulate units used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8 and INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Wikipedia - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64-optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration.<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, a.k.a. kernel-launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to hundreds of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]], AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the former being most important by quantity, the latter by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads, each brand offering different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December, 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both multi-chip module designs. It features Matrix Cores with support for a broad set of precisions, as INT8, FP8, BF16, FP16, TF32, FP32, FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://www.amd.com/system/files/TechDocs/vega-7nm-shader-instruction-set-architecture.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Islands cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://amd.wpenginepowered.com/wordpress/media/2013/10/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://amd.wpenginepowered.com/wordpress/media/2013/07/AMD_Southern_Islands_Instruction_Set_Architecture1.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support is offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
The [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Ada Lovelace Architecture ===<br />
<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archiv.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since the Series5 SGX, OpenCL support via licensees is available.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. OpenCL support is offered since the Adreno 300 series.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems, the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26884GPU2024-01-05T00:02:33Z<p>Smatovic: /* CDNA3 */</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel way of programming. [[Leela Chess Zero]] has demonstrated that a [[Best-First|best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) combined with [[Deep Learning|deep learning]] works well on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s, RAM was expensive and home computers used custom graphics chips that operated directly on registers/memory without a dedicated frame or texture buffer, like the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs, emerged. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]] for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable chip [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] used in the PlayStation (1994) and Nvidia's 2D/3D combi chips like the [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* NNUE training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> using GPU resources to efficiently train networks <br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook] and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]] specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group] is widely adopted across all kind of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP, and C++ AMP, as well as OpenMP offload directives. With [https://rocmdocs.amd.com/en/latest/ ROCm] it offers its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 (Mojave), [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures and the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor of the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled into a compute unit, and up to hundreds of compute units are present on a discrete GPU. Depending on the architecture, the actual SIMD units may have different numbers of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. The architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN])<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront (AMD) and Warp (Nvidia) size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or - with libraries and offload directives - also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) execute the kernel to be computed and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
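These address spaces map directly to OpenCL kernel qualifiers. As an illustration, a hypothetical OpenCL C kernel using all four could look as follows (a sketch only - the kernel name and arguments are made up, and it requires an OpenCL device plus host setup to run):<br />

```c
// Hypothetical kernel: scales __global input by a __constant factor,
// reduces per work-group via __local scratch-pad memory.
__kernel void scale_sum(__global const float *in,   // VRAM, visible to all
                        __global float *out,
                        __constant float *factor,   // read-only memory
                        __local float *scratch)     // shared within work-group
{
    int lid = get_local_id(0);   // __private by default: registers
    int gid = get_global_id(0);

    scratch[lid] = in[gid] * factor[0];
    barrier(CLK_LOCAL_MEM_FENCE);   // synchronize the work-group

    if (lid == 0) {                 // first work-item reduces the scratch pad
        float sum = 0.0f;
        for (int i = 0; i < (int)get_local_size(0); i++)
            sum += scratch[i];
        out[get_group_id(0)] = sum;
    }
}
```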
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, 32-bit integer throughput can be lower than 32-bit floating-point or 24-bit integer throughput.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput with lower precision, e.g. quadrupled INT8 or octupled INT4 throughput compared to INT32.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused-multiply-add) throughput.<br />
<br />
* FP64<br />
: In general, consumer GPUs offer lower double-precision (64-bit) floating-point throughput relative to FP32 than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Maximum theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Maximum theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural-network-based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: TensorCores were introduced with the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series: FP16xFP16+FP32 matrix-multiply-accumulate units used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add optimized FP16, INT8 and INT4 computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64-optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration.<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, a.k.a. kernel-launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels ranging from 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to hundreds of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much better suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom. This also affected game-playing programs combining CNNs with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and by the open source projects [[Leela Zero]], headed by [[Gian-Carlo Pascutto]], for [[Go]] and its [[Leela Chess Zero]] adaptation.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs - the former being most important by quantity, the latter by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads. Each brand offers different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server use.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023, with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both in multi-chip-module design. It features Matrix Cores with support for a broad range of precisions (INT8, FP8, BF16, FP16, TF32, FP32, FP64) as well as sparse matrix data (sparsity), supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/instruction-set-architectures/amd-instinct-mi300-cdna3-instruction-set-architecture.pdf AMD Instinct MI300/CDNA3 Instruction Set Architecture]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
The RDNA3 architecture in the Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
The CDNA2 architecture in the MI200 HPC-GPU, with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric, was unveiled in November 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
The CDNA architecture in the MI100 HPC-GPU with Matrix Cores was unveiled in November 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Islands cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://amd.wpenginepowered.com/wordpress/media/2013/10/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://amd.wpenginepowered.com/wordpress/media/2013/07/AMD_Southern_Islands_Instruction_Set_Architecture1.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support is offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
[https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server use.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Ada Lovelace Architecture ===<br />
<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since the Series5 SGX, OpenCL support is available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. Since the Adreno 300 series, OpenCL support is offered.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26883GPU2024-01-05T00:01:38Z<p>Smatovic: /* AMD */ link rot update</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but they require a specialized, parallel programming model. [[Leela Chess Zero]] has proven that a [[Best-First|Best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology works well on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s, RAM was expensive, and home computers used custom graphics chips that operated directly on registers and memory without a dedicated frame or texture buffer, such as the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. The 1990s made 3D graphics and 3D modeling more popular, especially for video games. Cards specifically designed to accelerate 3D math emerged, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] chip used in the PlayStation (1994) and Nvidia's 2D/3D combo chips like [https://en.wikipedia.org/wiki/NV1 NV1] (1995) coined the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], like in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* NNUE training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> using GPU resources to efficiently train networks <br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]], specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group], is widely adopted across all kinds of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as OpenMP offload directives, and offers with [https://rocmdocs.amd.com/en/latest/ ROCm] its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocmdocs.amd.com/en/latest/index.html AMD ROCm™ documentation]<br />
* [https://manualzz.com/doc/o/cggy6/amd-opencl-programming-user-guide-contents AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/amd-isa-documentation/ AMD GPU ISA documentation]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures, and offers the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor of the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled into a compute unit, and up to hundreds of compute units are present on a discrete GPU. The actual SIMD units may have different numbers of cores depending on the architecture (SIMD8, SIMD16, SIMD32) and different computation abilities: floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and the concrete classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to speed up neural networks further.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN)]<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront resp. Warp size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or with libraries and offload-directives also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture-dependent limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical_Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item resp. thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group resp. threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items resp. threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi)] <ref>CUDA C Programming Guide v7.0, Appendix G.COMPUTE CAPABILITIES</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on architecture and operation, the 32-bit integer performance can be lower than the 32-bit floating-point or 24-bit integer performance.<br />
<br />
* INT64<br />
: In general [https://en.wikipedia.org/wiki/Processor_register registers] and Vector-[https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput at lower precision, e.g. quadrupled INT8 or octupled INT4 throughput.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused-multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs offer in general lower double-precision (64-bit) floating-point throughput relative to FP32 (the FP32:FP64 ratio) than server brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretic ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretic ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer brand GPUs for neural network based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix-multiplications via Winograd-transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: TensorCores were introduced with the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series: FP16xFP16+FP32 matrix-multiply-accumulate units used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd gen TensorCores add FP16, INT8, INT4 optimized computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16, FP32. AMD's CDNA 2 architecture adds FP64 optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration.<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, aka kernel-launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs: the first is most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads, each brand offering different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumer, Radeon Pro for professional and Radeon Instinct for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023 with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both in multi-chip module design. It features Matrix Cores with support for a broad range of precisions, as INT8, FP8, BF16, FP16, TF32, FP32, FP64, as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators.<br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Island cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://amd.wpenginepowered.com/wordpress/media/2013/10/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://amd.wpenginepowered.com/wordpress/media/2013/07/AMD_Southern_Islands_Instruction_Set_Architecture1.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support is offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
[https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumer, Quadro for professional and Tesla for server.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Ada Lovelace Architecture ===<br />
<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system on a chip (SoC) designs. Since the Series5 SGX, OpenCL support is available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. Since the Adreno 300 series, OpenCL support is offered.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010 ...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler" by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=GPU&diff=26882GPU2024-01-04T23:40:45Z<p>Smatovic: /* AMD */ added CDNA3</p>
<hr />
<div>'''[[Main Page|Home]] * [[Hardware]] * GPU'''<br />
<br />
[[FILE:NvidiaTesla.jpg|border|right|thumb| [https://en.wikipedia.org/wiki/Nvidia_Tesla Nvidia Tesla] <ref>[https://commons.wikimedia.org/wiki/File:NvidiaTesla.jpg Image] by Mahogny, February 09, 2008, [https://en.wikipedia.org/wiki/Wikimedia_Commons Wikimedia Commons]</ref> ]] <br />
<br />
'''GPU''' (Graphics Processing Unit),<br/><br />
a specialized processor initially intended for fast [https://en.wikipedia.org/wiki/Image_processing image processing]. GPUs may offer more raw computing power than general-purpose [https://en.wikipedia.org/wiki/Central_processing_unit CPUs], but require a specialized, parallel style of programming. [[Leela Chess Zero]] has proven that a [[Best-First|best-first]] [[Monte-Carlo Tree Search|Monte-Carlo Tree Search]] (MCTS) with [[Deep Learning|deep learning]] methodology works well on GPU architectures.<br />
<br />
=History=<br />
In the 1970s and 1980s, RAM was expensive and home computers used custom graphics chips that operated directly on registers/memory without a dedicated frame buffer or texture buffer, like the [https://en.wikipedia.org/wiki/Television_Interface_Adaptor TIA] in the [[Atari 8-bit|Atari VCS]] gaming system, [https://en.wikipedia.org/wiki/CTIA_and_GTIA GTIA]+[https://en.wikipedia.org/wiki/ANTIC ANTIC] in the [[Atari 8-bit|Atari 400/800]] series, or [https://en.wikipedia.org/wiki/Original_Chip_Set#Denise Denise]+[https://en.wikipedia.org/wiki/Original_Chip_Set#Agnus Agnus] in the [[Amiga|Commodore Amiga]] series. In the 1990s, 3D graphics and 3D modeling became more popular, especially for video games. Cards specifically designed to accelerate 3D math emerged, such as [https://en.wikipedia.org/wiki/IMPACT_(computer_graphics) SGI Impact] (1995) in 3D graphics workstations or [https://en.wikipedia.org/wiki/3dfx#Voodoo_Graphics_PCI 3dfx Voodoo] (1996) for playing 3D games on PCs. Some game engines could instead use the [[SIMD and SWAR Techniques|SIMD capabilities]] of CPUs, such as the [[Intel]] [[MMX]] instruction set or [[AMD|AMD's]] [[X86#3DNow!|3DNow!]], for [https://en.wikipedia.org/wiki/Real-time_computer_graphics real-time rendering]. Sony's 3D-capable [https://en.wikipedia.org/wiki/PlayStation_technical_specifications#Graphics_processing_unit_(GPU) GTE] chip in the PlayStation (1994) and Nvidia's 2D/3D combo chips like the [https://en.wikipedia.org/wiki/NV1 NV1] (1995) established the term GPU for 3D graphics hardware acceleration. 
With the advent of the [https://en.wikipedia.org/wiki/Unified_shader_model unified shader architecture], as in Nvidia [https://en.wikipedia.org/wiki/Tesla_(microarchitecture) Tesla] (2006), ATI/AMD [https://en.wikipedia.org/wiki/TeraScale_(microarchitecture) TeraScale] (2007) or Intel [https://en.wikipedia.org/wiki/Intel_GMA#GMA_X3000 GMA X3000] (2006), GPGPU frameworks like [https://en.wikipedia.org/wiki/CUDA CUDA] and [[OpenCL|OpenCL]] emerged and gained in popularity.<br />
<br />
=GPU in Computer Chess= <br />
<br />
There are several main approaches to using a GPU for chess:<br />
<br />
* As an accelerator in [[Leela_Chess_Zero|Lc0]]: run a neural network for position evaluation on GPU<br />
* Offload the search in [[Zeta|Zeta]]: run a parallel game tree search with move generation and position evaluation on GPU<br />
* NNUE training such as [https://github.com/glinscott/nnue-pytorch NNUE trainer in Pytorch]<ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=75724 Pytorch NNUE training] by [[Gary Linscott]], [[CCC]], November 08, 2020</ref> using GPU resources to efficiently train networks <br />
* As a hybrid in [http://www.talkchess.com/forum3/viewtopic.php?t=64983&start=4#p729152 perft_gpu]: expand the game tree to a certain degree on CPU and offload to GPU to compute the sub-tree<br />
<br />
=GPU Chess Engines=<br />
* [[:Category:GPU]]<br />
<br />
=GPGPU= <br />
<br />
Early efforts to leverage a GPU for general-purpose computing required reformulating computational problems in terms of graphics primitives via graphics APIs like [https://en.wikipedia.org/wiki/OpenGL OpenGL] or [https://en.wikipedia.org/wiki/DirectX DirectX], followed by the first GPGPU frameworks such as [https://en.wikipedia.org/wiki/Lib_Sh Sh/RapidMind] or [https://en.wikipedia.org/wiki/BrookGPU Brook], and finally [https://en.wikipedia.org/wiki/CUDA CUDA] and [https://www.chessprogramming.org/OpenCL OpenCL].<br />
<br />
== Khronos OpenCL ==<br />
[[OpenCL|OpenCL]], specified by the [https://en.wikipedia.org/wiki/Khronos_Group Khronos Group], is widely adopted across all kinds of hardware accelerators from different vendors.<br />
<br />
* [https://www.khronos.org/conformance/adopters/conformant-products/opencl List of OpenCL Conformant Products]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf OpenCL 1.2 Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/1.2/docs/man/xhtml/ OpenCL 1.2 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/opencl-2.0.pdf OpenCL 2.0 Specification]<br />
* [https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_C.pdf OpenCL 2.0 C Language Specification]<br />
* [https://www.khronos.org/registry/OpenCL//sdk/2.0/docs/man/xhtml/ OpenCL 2.0 Reference]<br />
<br />
* [https://www.khronos.org/registry/OpenCL/specs/3.0-unified/pdf/ OpenCL 3.0 Specifications]<br />
<br />
== AMD ==<br />
<br />
[[AMD]] supports language frontends like OpenCL, HIP and C++ AMP, as well as OpenMP offload directives, and offers with [https://rocmdocs.amd.com/en/latest/ ROCm] its own parallel compute platform.<br />
<br />
* [https://community.amd.com/t5/opencl/bd-p/opencl-discussions AMD OpenCL Developer Community]<br />
* [https://rocm.github.io/ ROCm Homepage]<br />
* [http://developer.amd.com/wordpress/media/2013/07/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide-rev-2.7.pdf AMD OpenCL Programming Guide]<br />
* [http://developer.amd.com/wordpress/media/2013/12/AMD_OpenCL_Programming_Optimization_Guide2.pdf AMD OpenCL Optimization Guide]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set]<br />
<br />
== Apple ==<br />
Since macOS 10.14 Mojave, [[Apple]] recommends a transition from OpenCL to [https://en.wikipedia.org/wiki/Metal_(API) Metal].<br />
<br />
* [https://developer.apple.com/opencl/ Apple OpenCL Developer] <br />
* [https://developer.apple.com/metal/ Apple Metal Developer]<br />
* [https://developer.apple.com/library/archive/documentation/Miscellaneous/Conceptual/MetalProgrammingGuide/Introduction/Introduction.html Apple Metal Programming Guide]<br />
* [https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf Metal Shading Language Specification]<br />
<br />
== Intel ==<br />
Intel supports OpenCL with implementations like BEIGNET and NEO for different GPU architectures and the [https://en.wikipedia.org/wiki/OneAPI_(compute_acceleration) oneAPI] platform with [https://en.wikipedia.org/wiki/DPC++ DPC++] as frontend language.<br />
<br />
* [https://www.intel.com/content/www/us/en/developer/overview.html#gs.pu62bi Intel Developer Zone]<br />
* [https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top.html Intel oneAPI Programming Guide]<br />
<br />
== Nvidia ==<br />
<br />
[https://en.wikipedia.org/wiki/CUDA CUDA] is the parallel computing platform by [[Nvidia]]. It supports language frontends like C, C++, Fortran, OpenCL and offload directives via [https://en.wikipedia.org/wiki/OpenACC OpenACC] and [https://en.wikipedia.org/wiki/OpenMP OpenMP].<br />
<br />
* [https://developer.nvidia.com/cuda-zone Nvidia CUDA Zone]<br />
* [https://docs.nvidia.com/cuda/parallel-thread-execution/index.html Nvidia PTX ISA]<br />
* [https://docs.nvidia.com/cuda/index.html Nvidia CUDA Toolkit Documentation]<br />
* [https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html Nvidia CUDA C++ Programming Guide]<br />
* [https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html Nvidia CUDA C++ Best Practices Guide]<br />
<br />
== Further == <br />
* [https://en.wikipedia.org/wiki/Vulkan#Planned_features Vulkan] (OpenGL successor by the Khronos Group)<br />
* [https://en.wikipedia.org/wiki/DirectCompute DirectCompute] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/C%2B%2B_AMP C++ AMP] (Microsoft)<br />
* [https://en.wikipedia.org/wiki/OpenACC OpenACC] (offload directives)<br />
* [https://en.wikipedia.org/wiki/OpenMP OpenMP] (offload directives)<br />
<br />
=Hardware Model=<br />
<br />
A common scheme on GPUs with unified shader architecture is to run multiple threads in [https://en.wikipedia.org/wiki/Single_instruction,_multiple_threads SIMT] fashion, and a multitude of SIMT waves on the same [https://en.wikipedia.org/wiki/SIMD SIMD] unit, to hide memory latencies. Multiple processing elements (GPU cores) are members of a SIMD unit, multiple SIMD units are coupled into a compute unit, and up to hundreds of compute units are present on a discrete GPU. The actual SIMD units may have an architecture-dependent number of cores (SIMD8, SIMD16, SIMD32) and different computation abilities - floating-point and/or integer, with specific bit-widths of the FPU/ALU and registers. There is a difference between a vector processor with variable bit-width and SIMD units with fixed bit-width cores. Architecture white papers from different vendors leave room for speculation about the concrete underlying hardware implementation and its classification as [https://en.wikipedia.org/wiki/Flynn%27s_taxonomy hardware architecture]. Scalar units present in the compute unit perform special functions the SIMD units are not capable of, and MMAC units (matrix-multiply-accumulate units) are used to further speed up neural networks.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Vendor Terminology<br />
|-<br />
! AMD Terminology !! Nvidia Terminology<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Stream Core || CUDA Core<br />
|-<br />
| Wavefront || Warp<br />
|}<br />
<br />
===Hardware Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>[https://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf Fermi white paper from Nvidia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_500_series GeForce 500 series on Wikipedia]</ref><br />
<br />
* 512 CUDA cores @1.544GHz<br />
* 16 SMs - Streaming Multiprocessors<br />
* organized in 2x16 CUDA cores per SM<br />
* Warp size of 32 threads<br />
<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN)]<ref>[https://en.wikipedia.org/wiki/Graphics_Core_Next Graphics Core Next on Wikipedia]</ref><ref>[https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Radeon_HD_7000_series Radeon HD 7000 series on Wikipedia]</ref><br />
<br />
* 2048 Stream cores @0.925GHz<br />
* 32 Compute Units<br />
* organized in 4xSIMD16, each SIMT4, per Compute Unit<br />
* Wavefront size of 64 work-items<br />
<br />
===Wavefront and Warp===<br />
Generalized, the Wavefront or Warp size is the number of threads executed in SIMT fashion on a GPU with unified shader architecture.<br />
<br />
=Programming Model=<br />
<br />
A [https://en.wikipedia.org/wiki/Parallel_programming_model parallel programming model] for GPGPU can be [https://en.wikipedia.org/wiki/Data_parallelism data-parallel], [https://en.wikipedia.org/wiki/Task_parallelism task-parallel], a mixture of both, or, with libraries and offload directives, also [https://en.wikipedia.org/wiki/Implicit_parallelism implicitly-parallel]. Single GPU threads (work-items in OpenCL) contain the kernel to be computed and are coupled into a work-group; one or multiple work-groups form the NDRange to be executed on the GPU device. The members of a work-group execute the same kernel, can usually be synchronized, and have access to the same scratch-pad memory, with architecture-dependent limits on how many work-items a work-group can hold and how many threads can run concurrently on the device in total.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Kernel || Kernel<br />
|-<br />
| Compute Unit || Streaming Multiprocessor<br />
|-<br />
| Processing Element || CUDA Core<br />
|-<br />
| Work-Item || Thread<br />
|-<br />
| Work-Group || Block<br />
|-<br />
| NDRange || Grid<br />
|-<br />
|}<br />
<br />
==Thread Examples==<br />
<br />
Nvidia GeForce GTX 580 (Fermi, CC2) <ref>[https://en.wikipedia.org/wiki/CUDA#Technical_Specification CUDA Technical Specification on Wikipedia]</ref><br />
<br />
* Warp size: 32<br />
* Maximum number of threads per block: 1024<br />
* Maximum number of resident blocks per multiprocessor: 32<br />
* Maximum number of resident warps per multiprocessor: 64<br />
* Maximum number of resident threads per multiprocessor: 2048<br />
<br />
<br />
AMD Radeon HD 7970 (GCN) <ref>[https://www.olcf.ornl.gov/wp-content/uploads/2019/10/ORNL_Application_Readiness_Workshop-AMD_GPU_Basics.pdf AMD GPU Hardware Basics]</ref><br />
<br />
* Wavefront size: 64<br />
* Maximum number of work-items per work-group: 1024<br />
* Maximum number of work-groups per compute unit: 40<br />
* Maximum number of Wavefronts per compute unit: 40<br />
* Maximum number of work-items per compute unit: 2560<br />
<br />
=Memory Model=<br />
<br />
OpenCL offers the following memory model for the programmer:<br />
<br />
* __private - usually registers, accessible only by a single work-item or thread.<br />
* __local - scratch-pad memory shared across work-items of a work-group or threads of a block.<br />
* __constant - read-only memory.<br />
* __global - usually VRAM, accessible by all work-items or threads.<br />
<br />
{| class="wikitable" style="margin:auto"<br />
|+ Terminology<br />
|-<br />
! OpenCL Terminology !! CUDA Terminology<br />
|-<br />
| Private Memory || Registers<br />
|-<br />
| Local Memory || Shared Memory<br />
|-<br />
| Constant Memory || Constant Memory<br />
|-<br />
| Global Memory || Global Memory<br />
|}<br />
<br />
===Memory Examples===<br />
<br />
Nvidia GeForce GTX 580 ([https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi]) <ref>CUDA C Programming Guide v7.0, Appendix G, Compute Capabilities</ref><br />
* 128 KiB private memory per compute unit<br />
* 48 KiB (16 KiB) local memory per compute unit (configurable)<br />
* 64 KiB constant memory<br />
* 8 KiB constant cache per compute unit<br />
* 16 KiB (48 KiB) L1 cache per compute unit (configurable)<br />
* 768 KiB L2 cache in total<br />
* 1.5 GiB to 3 GiB global memory<br />
AMD Radeon HD 7970 ([https://en.wikipedia.org/wiki/Graphics_Core_Next GCN]) <ref>AMD Accelerated Parallel Processing OpenCL Programming Guide rev2.7, Appendix D Device Parameters, Table D.1 Parameters for 7xxx Devices</ref><br />
* 256 KiB private memory per compute unit<br />
* 64 KiB local memory per compute unit<br />
* 64 KiB constant memory<br />
* 16 KiB constant cache per four compute units<br />
* 16 KiB L1 cache per compute unit<br />
* 768 KiB L2 cache in total<br />
* 3 GiB to 6 GiB global memory<br />
<br />
===Unified Memory===<br />
<br />
Usually data has to be copied between a CPU host and a discrete GPU device, but different architectures from different vendors with different frameworks on different operating systems may offer a unified and accessible address space between CPU and GPU.<br />
<br />
=Instruction Throughput= <br />
GPUs are used in [https://en.wikipedia.org/wiki/High-performance_computing HPC] environments because of their good [https://en.wikipedia.org/wiki/FLOP FLOP]/Watt ratio. The instruction throughput in general depends on the architecture (like Nvidia's [https://en.wikipedia.org/wiki/Tesla_%28microarchitecture%29 Tesla], [https://en.wikipedia.org/wiki/Fermi_%28microarchitecture%29 Fermi], [https://en.wikipedia.org/wiki/Kepler_%28microarchitecture%29 Kepler], [https://en.wikipedia.org/wiki/Maxwell_%28microarchitecture%29 Maxwell] or AMD's [https://en.wikipedia.org/wiki/TeraScale_%28microarchitecture%29 TeraScale], [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN], [https://en.wikipedia.org/wiki/AMD_RDNA_Architecture RDNA]), the brand (like Nvidia [https://en.wikipedia.org/wiki/GeForce GeForce], [https://en.wikipedia.org/wiki/Nvidia_Quadro Quadro], [https://en.wikipedia.org/wiki/Nvidia_Tesla Tesla] or AMD [https://en.wikipedia.org/wiki/Radeon Radeon], [https://en.wikipedia.org/wiki/Radeon_Pro Radeon Pro], [https://en.wikipedia.org/wiki/Radeon_Instinct Radeon Instinct]) and the specific model.<br />
<br />
==Integer Instruction Throughput==<br />
* INT32<br />
: Depending on the architecture and operation, 32-bit integer throughput can be lower than 32-bit floating-point or 24-bit integer throughput.<br />
<br />
* INT64<br />
: In general, [https://en.wikipedia.org/wiki/Processor_register registers] and vector [https://en.wikipedia.org/wiki/Arithmetic_logic_unit ALUs] of consumer-brand GPUs are 32-bit wide and have to emulate 64-bit integer operations.<br />
* INT8<br />
: Some architectures offer higher throughput at lower precision, e.g. quadrupled INT8 or octupled INT4 throughput compared to INT32.<br />
<br />
==Floating-Point Instruction Throughput==<br />
<br />
* FP32<br />
: Consumer GPU performance is usually measured in single-precision (32-bit) floating-point FMA (fused multiply-add) throughput.<br />
<br />
* FP64<br />
: Consumer GPUs in general have a lower double-precision (64-bit) floating-point throughput ratio (FP64:FP32) than server-brand GPUs.<br />
<br />
* FP16<br />
: Some GPGPU architectures offer half-precision (16-bit) floating-point operation throughput with an FP32:FP16 ratio of 1:2.<br />
<br />
==Throughput Examples==<br />
Nvidia GeForce GTX 580 (Fermi, CC 2.0) - 32-bit integer operations/clock cycle per compute unit <ref>CUDA C Programming Guide v7.0, Chapter 5.4.1. Arithmetic Instructions</ref><br />
<br />
MAD 16<br />
MUL 16<br />
ADD 32<br />
Bit-shift 16<br />
Bitwise XOR 32<br />
<br />
Max theoretical ADD operation throughput: 32 Ops x 16 CUs x 1544 MHz = 790.528 GigaOps/sec<br />
<br />
AMD Radeon HD 7970 (GCN 1.0) - 32-bit integer operations/clock cycle per processing element <ref>AMD_OpenCL_Programming_Optimization_Guide.pdf 3.0beta, Chapter 2.7.1 Instruction Bandwidths</ref><br />
<br />
MAD 1/4<br />
MUL 1/4<br />
ADD 1<br />
Bit-shift 1<br />
Bitwise XOR 1<br />
<br />
Max theoretical ADD operation throughput: 1 Op x 2048 PEs x 925 MHz = 1894.4 GigaOps/sec<br />
<br />
=Tensors=<br />
MMAC (matrix-multiply-accumulate) units are used in consumer-brand GPUs for neural-network-based upsampling of video game resolutions, in professional brands for upsampling of images and videos, and in server-brand GPUs for accelerating convolutional neural networks in general. Convolutions can be implemented as a series of matrix multiplications via Winograd transformations <ref>[https://talkchess.com/forum3/viewtopic.php?f=7&t=66025&p=743355#p743355 Re: To TPU or not to TPU...] by [[Rémi Coulom]], [[CCC]], December 16, 2017</ref>. Mobile SoCs usually have a dedicated neural network engine as their MMAC unit.<br />
<br />
==Nvidia TensorCores==<br />
: TensorCores were introduced with the Nvidia [https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] series: matrix-multiply-accumulate units (FP16xFP16+FP32) used to accelerate neural networks.<ref>[https://on-demand.gputechconf.com/gtc/2017/presentation/s7798-luke-durant-inside-volta.pdf INSIDE VOLTA]</ref> Turing's 2nd-gen TensorCores add optimized FP16, INT8 and INT4 computation.<ref>[https://www.anandtech.com/show/13282/nvidia-turing-architecture-deep-dive/6 AnandTech - Nvidia Turing Deep Dive page 6]</ref> Ampere's 3rd gen adds support for BF16, TF32, FP64 and sparsity acceleration.<ref>[https://en.wikipedia.org/wiki/Ampere_(microarchitecture)#Details Wikipedia - Ampere microarchitecture]</ref> Ada Lovelace's 4th gen adds support for FP8.<ref>[https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Wikipedia - Ada Lovelace microarchitecture]</ref><br />
<br />
==AMD Matrix Cores==<br />
: In 2020 AMD released its server-class [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf CDNA] architecture with Matrix Cores which support MFMA (matrix-fused-multiply-add) operations on various data types like INT8, FP16, BF16 and FP32. AMD's CDNA 2 architecture adds FP64-optimized throughput for matrix operations. AMD's RDNA 3 architecture features dedicated AI tensor operation acceleration.<br />
<br />
==Intel XMX Cores==<br />
: Intel added XMX, Xe Matrix eXtensions, cores to some of the [https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] GPU series, like [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist].<br />
<br />
=Host-Device Latencies= <br />
One reason GPUs are not used as accelerators for chess engines is the host-device latency, a.k.a. kernel launch overhead. Nvidia and AMD have not published official numbers, but in practice there is a measurable latency for null-kernels of 5 microseconds <ref>[https://devtalk.nvidia.com/default/topic/1047965/cuda-programming-and-performance/host-device-latencies-/post/5318041/#5318041 host-device latencies?] by [[Srdja Matovic]], Nvidia CUDA ZONE, Feb 28, 2019</ref> up to 100s of microseconds <ref>[https://community.amd.com/thread/237337#comment-2902071 host-device latencies?] by [[Srdja Matovic]] AMD Developer Community, Feb 28, 2019</ref>. One solution to overcome this limitation is to couple tasks into batches to be executed in one run <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347#p761239 Re: GPU ANN, how to deal with host-device latencies?] by [[Milos Stanisavljevic]], [[CCC]], May 06, 2018</ref>.<br />
<br />
=Deep Learning=<br />
GPUs are much more suited than CPUs to implement and train [[Neural Networks#Convolutional|Convolutional Neural Networks]] (CNN), and were therefore also responsible for the [[Deep Learning|deep learning]] boom, also affecting game playing programs combining CNN with [[Monte-Carlo Tree Search|MCTS]], as pioneered by [[Google]] [[DeepMind|DeepMind's]] [[AlphaGo]] and [[AlphaZero]] entities in [[Go]], [[Shogi]] and [[Chess]] using [https://en.wikipedia.org/wiki/Tensor_processing_unit TPUs], and the open source projects [[Leela Zero]] headed by [[Gian-Carlo Pascutto]] for [[Go]] and its [[Leela Chess Zero]] adaption.<br />
<br />
= Architectures =<br />
The market is split into two categories, integrated and discrete GPUs, the first being the most important by quantity, the second by performance. Discrete GPUs are divided into consumer brands for playing 3D games, professional brands for CAD/CGI programs, and server brands for big-data and number-crunching workloads, each brand offering different feature sets in driver, VRAM, or computation abilities.<br />
<br />
== AMD ==<br />
AMD's line of discrete GPUs is branded as Radeon for consumers, Radeon Pro for professionals and Radeon Instinct for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units on Wikipedia] <br />
<br />
=== CDNA3 === <br />
The CDNA3 HPC architecture was unveiled in December 2023 with the MI300A APU model (CPU+GPU+HBM) and the MI300X GPU model, both with a multi-chip-module design. It features Matrix Cores supporting a broad range of precisions (INT8, FP8, BF16, FP16, TF32, FP32, FP64) as well as sparse matrix data (sparsity), and is supported by AMD's ROCm open software stack for AMD Instinct accelerators. <br />
<br />
* [https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf AMD CDNA3 Whitepaper]<br />
* [https://www.amd.com/en/developer/resources/rocm-hub.html AMD ROCm Developer Hub]<br />
<br />
=== Navi 3x RDNA3 === <br />
RDNA3 architecture in Radeon RX 7000 series was announced on November 3, 2022, featuring dedicated AI tensor operation acceleration.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_7000_series AMD Radeon RX 7000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA3_Shader_ISA_December2022.pdf RDNA3 Instruction Set Architecture]<br />
<br />
=== CDNA2 === <br />
CDNA2 architecture in MI200 HPC-GPU with optimized FP64 throughput (matrix and vector), multi-chip-module design and Infinity Fabric was unveiled in November, 2021.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna2-white-paper.pdf AMD CDNA2 Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA2_Shader_ISA_4February2022.pdf CDNA2 Instruction Set Architecture]<br />
<br />
=== CDNA === <br />
CDNA architecture in MI100 HPC-GPU with Matrix Cores was unveiled in November, 2020.<br />
<br />
* [https://www.amd.com/system/files/documents/amd-cdna-whitepaper.pdf AMD CDNA Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/CDNA1_Shader_ISA_14December2020.pdf CDNA Instruction Set Architecture]<br />
<br />
=== Navi 2x RDNA2 === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture)#RDNA_2 RDNA2] cards were unveiled on October 28, 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_RX_6000_series AMD Radeon RX 6000 on Wikipedia]<br />
* [https://developer.amd.com/wp-content/resources/RDNA2_Shader_ISA_November2020.pdf RDNA 2 Instruction Set Architecture]<br />
<br />
=== Navi RDNA === <br />
[https://en.wikipedia.org/wiki/RDNA_(microarchitecture) RDNA] cards were unveiled on July 7, 2019.<br />
<br />
* [https://www.amd.com/system/files/documents/rdna-whitepaper.pdf RDNA Whitepaper]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf Architecture Slide Deck]<br />
* [https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Shader_ISA_5August2019.pdf RDNA Instruction Set]<br />
<br />
=== Vega GCN 5th gen ===<br />
<br />
[https://en.wikipedia.org/wiki/Radeon_RX_Vega_series Vega] cards were unveiled on August 14, 2017.<br />
<br />
* [https://www.techpowerup.com/gpu-specs/docs/amd-vega-architecture.pdf Architecture Whitepaper]<br />
* [https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf Vega Instruction Set]<br />
<br />
=== Polaris GCN 4th gen === <br />
<br />
[https://en.wikipedia.org/wiki/Graphics_Core_Next#Graphics_Core_Next_4 Polaris] cards were first released in 2016.<br />
<br />
* [https://www.amd.com/system/files/documents/polaris-whitepaper.pdf Architecture Whitepaper]<br />
<br />
=== Southern Islands GCN 1st gen ===<br />
<br />
Southern Island cards introduced the [https://en.wikipedia.org/wiki/Graphics_Core_Next GCN] architecture in 2012.<br />
<br />
* [https://en.wikipedia.org/wiki/Radeon_HD_7000_series AMD Radeon HD 7000 on Wikipedia]<br />
* [https://amd.wpenginepowered.com/wordpress/media/2013/10/si_programming_guide_v2.pdf Southern Islands Programming Guide]<br />
* [https://amd.wpenginepowered.com/wordpress/media/2013/07/AMD_Southern_Islands_Instruction_Set_Architecture1.pdf Southern Islands Instruction Set Architecture]<br />
<br />
== Apple ==<br />
<br />
=== M series ===<br />
<br />
Apple released its M series SoC (system on a chip) with integrated GPU for desktops and notebooks in 2020.<br />
<br />
* [https://en.wikipedia.org/wiki/Apple_silicon#M_series Apple M series on Wikipedia]<br />
<br />
== ARM ==<br />
The ARM Mali GPU variants can be found on various systems on chips (SoCs) from different vendors. Since Midgard (2012), with its unified shader model, OpenCL support is offered.<br />
<br />
* [https://en.wikipedia.org/wiki/Mali_(GPU)#Variants Mali variants on Wikipedia]<br />
<br />
=== Valhall (2019) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Bifrost (2016) ===<br />
<br />
* [https://developer.arm.com/documentation/101574/latest Bifrost and Valhall OpenCL Developer Guide]<br />
<br />
=== Midgard (2012) ===<br />
* [https://developer.arm.com/documentation/100614/latest Midgard OpenCL Developer Guide]<br />
<br />
== Intel ==<br />
<br />
=== Xe ===<br />
<br />
[https://en.wikipedia.org/wiki/Intel_Xe Intel Xe] line of GPUs (released since 2020) is divided into Xe-LP (low-power), Xe-HPG (high-performance-gaming), Xe-HP (high-performance) and Xe-HPC (high-performance-computing).<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units#Gen12 List of Intel Gen12 GPUs on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Intel_Arc#Alchemist Arc Alchemist series on Wikipedia]<br />
<br />
==Nvidia==<br />
Nvidia's line of discrete GPUs is branded as GeForce for consumers, Quadro for professionals and Tesla for servers.<br />
<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units on Wikipedia]<br />
<br />
=== Ada Lovelace Architecture ===<br />
<br />
The [https://en.wikipedia.org/wiki/Ada_Lovelace_(microarchitecture) Ada Lovelace microarchitecture] was announced on September 20, 2022, featuring 4th-generation Tensor Cores with FP8, FP16, BF16, TF32 and sparsity acceleration.<br />
<br />
* [https://images.nvidia.com/aem-dam/Solutions/geforce/ada/nvidia-ada-gpu-architecture.pdf Ada GPU Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ada-tuning-guide/index.html Ada Tuning Guide]<br />
<br />
=== Hopper Architecture ===<br />
The [https://en.wikipedia.org/wiki/Hopper_(microarchitecture) Hopper GPU Datacenter microarchitecture] was announced on March 22, 2022, featuring Transformer Engines for large language models.<br />
<br />
* [https://resources.nvidia.com/en-us-tensor-core Hopper H100 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/hopper-tuning-guide/index.html Hopper Tuning Guide]<br />
<br />
=== Ampere Architecture ===<br />
The [https://en.wikipedia.org/wiki/Ampere_(microarchitecture) Ampere microarchitecture] was announced on May 14, 2020 <ref>[https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/ NVIDIA Ampere Architecture In-Depth | NVIDIA Developer Blog] by [https://people.csail.mit.edu/ronny/ Ronny Krashinsky], [https://cppcast.com/guest/ogiroux/ Olivier Giroux], [https://blogs.nvidia.com/blog/author/stephenjones/ Stephen Jones], [https://blogs.nvidia.com/blog/author/nick-stam/ Nick Stam] and [https://en.wikipedia.org/wiki/Sridhar_Ramaswamy Sridhar Ramaswamy], May 14, 2020</ref>. The Nvidia A100 GPU based on the Ampere architecture delivers a generational leap in accelerated computing in conjunction with CUDA 11 <ref>[https://devblogs.nvidia.com/cuda-11-features-revealed/ CUDA 11 Features Revealed | NVIDIA Developer Blog] by [https://devblogs.nvidia.com/author/pramarao/ Pramod Ramarao], May 14, 2020</ref>.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf Ampere GA100 Whitepaper]<br />
* [https://www.nvidia.com/content/PDF/nvidia-ampere-ga-102-gpu-architecture-whitepaper-v2.pdf Ampere GA102 Whitepaper]<br />
* [https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html Ampere GPU Architecture Tuning Guide]<br />
<br />
=== Turing Architecture ===<br />
[https://en.wikipedia.org/wiki/Turing_(microarchitecture) Turing] cards were first released in 2018. They are the first consumer cards to launch with RTX features for [https://en.wikipedia.org/wiki/Ray_tracing_(graphics) raytracing], and also the first consumer cards to launch with TensorCores, used for matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]]. The Turing GTX line of chips does not offer RTX or TensorCores.<br />
<br />
* [https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf Turing Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/turing-tuning-guide/index.html Turing Tuning Guide]<br />
<br />
=== Volta Architecture === <br />
[https://en.wikipedia.org/wiki/Volta_(microarchitecture) Volta] cards were released in 2017. They were the first cards to launch with TensorCores, supporting matrix multiplications to accelerate [[Neural Networks#Convolutional|convolutional neural networks]].<br />
<br />
* [https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf Volta Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/volta-tuning-guide/index.html Volta Tuning Guide]<br />
<br />
=== Pascal Architecture ===<br />
[https://en.wikipedia.org/wiki/Pascal_(microarchitecture) Pascal] cards were first released in 2016.<br />
<br />
* [https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf Pascal Architecture Whitepaper]<br />
* [https://docs.nvidia.com/cuda/pascal-tuning-guide/index.html Pascal Tuning Guide]<br />
<br />
=== Maxwell Architecture ===<br />
[https://en.wikipedia.org/wiki/Maxwell_(microarchitecture) Maxwell] cards were first released in 2014.<br />
<br />
* [https://web.archive.org/web/20170721113746/http://international.download.nvidia.com/geforce-com/international/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF Maxwell Architecture Whitepaper on archive.org]<br />
* [https://docs.nvidia.com/cuda/maxwell-tuning-guide/index.html Maxwell Tuning Guide]<br />
<br />
== PowerVR ==<br />
PowerVR (Imagination Technologies) licenses IP to third parties (most notably Apple) for system-on-a-chip (SoC) designs. Since the Series5 SGX, OpenCL support is available via licensees.<br />
<br />
=== PowerVR ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#PowerVR_Graphics PowerVR series on Wikipedia]<br />
<br />
=== IMG ===<br />
<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_A-Series_(Albiorix) IMG A series on Wikipedia]<br />
* [https://en.wikipedia.org/wiki/PowerVR#IMG_B-Series IMG B series on Wikipedia]<br />
<br />
== Qualcomm ==<br />
Qualcomm offers Adreno GPUs in various types as a component of their Snapdragon SoCs. OpenCL support has been offered since the Adreno 300 series.<br />
<br />
=== Adreno ===<br />
* [https://en.wikipedia.org/wiki/Adreno#Variants Adreno variants on Wikipedia]<br />
<br />
== Vivante Corporation ==<br />
Vivante licenses IP to third parties for embedded systems; the GC series offers optional OpenCL support.<br />
<br />
=== GC-Series ===<br />
<br />
* [https://en.wikipedia.org/wiki/Vivante_Corporation#Products GC series on Wikipedia]<br />
<br />
=See also= <br />
* [[Deep Learning]]<br />
* [[FPGA]]<br />
* [[Graphics Programming]]<br />
* [[Monte-Carlo Tree Search]]<br />
** [[MCαβ]]<br />
** [[UCT]]<br />
* [[Parallel Search]]<br />
* [[Perft#15|Perft(15)]] <br />
* [[SIMD and SWAR Techniques]]<br />
* [[Thread]]<br />
<br />
=Publications= <br />
<br />
==1986== <br />
* [[Mathematician#Hillis|W. Daniel Hillis]], [[Mathematician#GSteele|Guy L. Steele, Jr.]] ('''1986'''). ''[https://dl.acm.org/citation.cfm?id=7903 Data parallel algorithms]''. [[ACM#Communications|Communications of the ACM]], Vol. 29, No. 12, Special Issue on Parallelism<br />
==1990==<br />
* [[Mathematician#GEBlelloch|Guy E. Blelloch]] ('''1990'''). ''[https://dl.acm.org/citation.cfm?id=91254 Vector Models for Data-Parallel Computing]''. [https://en.wikipedia.org/wiki/MIT_Press MIT Press], [https://www.cs.cmu.edu/~guyb/papers/Ble90.pdf pdf]<br />
==2008 ...==<br />
* [[Vlad Stamate]] ('''2008'''). ''Real Time Photon Mapping Approximation on the GPU''. in [http://shaderx6.com/TOC.html ShaderX6 - Advanced Rendering Techniques] <ref>[https://en.wikipedia.org/wiki/Photon_mapping Photon mapping from Wikipedia]</ref><br />
* [[Ren Wu]], [http://www.cedar.buffalo.edu/~binzhang/ Bin Zhang], [http://www.hpl.hp.com/people/meichun_hsu/ Meichun Hsu] ('''2009'''). ''[http://portal.acm.org/citation.cfm?id=1531668 Clustering billions of data points using GPUs]''. [http://www.computingfrontiers.org/2009/ ACM International Conference on Computing Frontiers]<br />
* [https://github.com/markgovett Mark Govett], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2009'''). ''Using Graphical Processing Units (GPUs) for Next Generation Weather and Climate Prediction Models''. [http://www.cisl.ucar.edu/dir/CAS2K9/ CAS2K9 Workshop]<br />
* [[Hank Dietz]], [https://dblp.uni-trier.de/pers/hd/y/Young:Bobby_Dalton Bobby Dalton Young] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-13374-9_5 MIMD Interpretation on a GPU]''. [https://dblp.uni-trier.de/db/conf/lcpc/lcpc2009.html LCPC 2009], [http://aggregate.ee.engr.uky.edu/EXHIBITS/SC09/mogsimlcpc09final.pdf pdf], [http://aggregate.org/GPUMC/mogsimlcpc09slides.pdf slides.pdf]<br />
* [https://dblp.uni-trier.de/pid/28/7183.html Sander van der Maar], [[Joost Batenburg]], [https://scholar.google.com/citations?user=TtXZhj8AAAAJ&hl=en Jan Sijbers] ('''2009'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-03138-0_33 Experiences with Cell-BE and GPU for Tomography]''. [https://dblp.uni-trier.de/db/conf/samos/samos2009.html#MaarBS09 SAMOS 2009] <ref>[https://en.wikipedia.org/wiki/Cell_(microprocessor) Cell (microprocessor) from Wikipedia]</ref><br />
==2010 ...==<br />
* [https://www.linkedin.com/in/avi-bleiweiss-456a5644 Avi Bleiweiss] ('''2010'''). ''Playing Zero-Sum Games on the GPU''. [https://en.wikipedia.org/wiki/Nvidia NVIDIA Corporation], [http://www.nvidia.com/object/io_1269574709099.html GPU Technology Conference 2010], [http://www.nvidia.com/content/gtc-2010/pdfs/2207_gtc2010.pdf slides as pdf]<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson] ('''2010'''). ''[https://dl.acm.org/citation.cfm?id=1845128 Running the NIM Next-Generation Weather Model on GPUs]''. [https://dblp.uni-trier.de/db/conf/ccgrid/ccgrid2010.html CCGRID 2010]<br />
* John Nickolls, William J. Dally ('''2010'''). ''[https://ieeexplore.ieee.org/document/5446251 The GPU Computing Era]''. [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=40 IEEE Micro]<br />
'''2011'''<br />
* [https://github.com/markgovett Mark Govett], [[Jacques Middlecoff]], [https://www.researchgate.net/profile/Tom_Henderson4 Tom Henderson], [https://cug.org/5-publications/proceedings_attendee_lists/CUG09CD/S09_Proceedings/pages/authors/11-15Wednesday/12A-Rosinski/Rosinski-paper.html Jim Rosinski], [https://www.linkedin.com/in/craig-tierney-9568545 Craig Tierney] ('''2011'''). ''Parallelization of the NIM Dynamical Core for GPUs''. [https://is.enes.org/archive-1/archive/documents/Govett.pdf slides as pdf]<br />
* [[Ľubomír Lackovič]] ('''2011'''). ''[https://hgpu.org/?p=5772 Parallel Game Tree Search Using GPU]''. Institute of Informatics and Software Engineering, [https://en.wikipedia.org/wiki/Faculty_of_Informatics_and_Information_Technologies Faculty of Informatics and Information Technologies], [https://en.wikipedia.org/wiki/Slovak_University_of_Technology_in_Bratislava Slovak University of Technology in Bratislava], [http://acmbulletin.fiit.stuba.sk/vol3num2/lackovic.pdf pdf]<br />
* [[Dan Anthony Feliciano Alcantara]] ('''2011'''). ''Efficient Hash Tables on the GPU''. Ph. D. thesis, [https://en.wikipedia.org/wiki/University_of_California,_Davis University of California, Davis], [http://idav.ucdavis.edu/~dfalcant//downloads/dissertation.pdf pdf] » [[Hash Table]]<br />
* [[Damian Sulewski]] ('''2011'''). ''Large-Scale Parallel State Space Search Utilizing Graphics Processing Units and Solid State Disks''. Ph.D. thesis, [[University of Dortmund]], [https://eldorado.tu-dortmund.de/dspace/bitstream/2003/29418/1/Dissertation.pdf pdf]<br />
* [[Damjan Strnad]], [[Nikola Guid]] ('''2011'''). ''[http://cit.fer.hr/index.php/CIT/article/view/2029 Parallel Alpha-Beta Algorithm on the GPU]''. [http://cit.fer.hr/index.php/CIT CIT. Journal of Computing and Information Technology], Vol. 19, No. 4 » [[Parallel Search]], [[Othello|Reversi]] <br />
* [[Balázs Jako|Balázs Jákó]] ('''2011'''). ''Fast Hydraulic and Thermal Erosion on GPU''. M.Sc. thesis, Supervisor [https://hu.linkedin.com/in/bal%C3%A1zs-t%C3%B3th-1b151329 Balázs Tóth], [http://eg2011.bangor.ac.uk/ Eurographics 2011], [http://old.cescg.org/CESCG-2011/papers/TUBudapest-Jako-Balazs.pdf pdf]<br />
'''2012'''<br />
* [[Liang Li]], [[Hong Liu]], [[Peiyu Liu]], [[Taoying Liu]], [[Wei Li]], [[Hao Wang]] ('''2012'''). ''[https://www.semanticscholar.org/paper/A-Node-based-Parallel-Game-Tree-Algorithm-Using-Li-Liu/be21d7b9b91957b700aab4ce002e6753b826ff54 A Node-based Parallel Game Tree Algorithm Using GPUs]''. CLUSTER 2012 » [[Parallel Search]]<br />
'''2013'''<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://scholar.google.de/citations?view_op=view_citation&hl=en&user=VvkRESgAAAAJ&citation_for_view=VvkRESgAAAAJ:ufrVoPGSRksC A parallel memetic algorithm on GPU to solve the task scheduling problem in heterogeneous environments]''. [http://www.sigevo.org/gecco-2013/program.html GECCO '13]<br />
* [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2013'''). ''[https://ieeexplore.ieee.org/document/6714232 A statistical performance prediction model for OpenCL kernels on NVIDIA GPUs]''. [https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=6708586 CADS 2013]<br />
* [[Diego Rodríguez-Losada]], [[Pablo San Segundo]], [[Miguel Hernando]], [https://dblp.uni-trier.de/pers/hd/p/Puente:Paloma_de_la Paloma de la Puente], [https://dblp.uni-trier.de/pers/hd/v/Valero=Gomez:Alberto Alberto Valero-Gomez] ('''2013'''). ''GPU-Mapping: Robotic Map Building with Graphical Multiprocessors''. [https://dblp.uni-trier.de/db/journals/ram/ram20.html IEEE Robotics & Automation Magazine, Vol. 20, No. 2], [https://www.acin.tuwien.ac.at/fileadmin/acin/v4r/v4r/GPUMap_RAM2013.pdf pdf]<br />
* [https://dblp.org/pid/28/977-2.html David Williams], [[Valeriu Codreanu]], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://dblp.org/pid/54/784.html Baoquan Liu], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [https://dblp.org/pid/136/5430.html Burhan Yasar], [https://scholar.google.com/citations?user=FZVGYiQAAAAJ&hl=en Babak Mahdian], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini], [https://zhaoxiahust.github.io/ Xia Zhao], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink] ('''2013'''). ''[https://link.springer.com/chapter/10.1007/978-3-642-55224-3_42 Evaluation of Autoparallelization Toolkits for Commodity GPUs]''. [https://dblp.org/db/conf/ppam/ppam2013-1.html#WilliamsCYLDYMCZR13 PPAM 2013]<br />
'''2014'''<br />
* [https://dblp.uni-trier.de/pers/hd/d/Dang:Qingqing Qingqing Dang], [https://dblp.uni-trier.de/pers/hd/y/Yan:Shengen Shengen Yan], [[Ren Wu]] ('''2014'''). ''[https://ieeexplore.ieee.org/document/7097862 A fast integral image generation algorithm on GPUs]''. [https://dblp.uni-trier.de/db/conf/icpads/icpads2014.html ICPADS 2014]<br />
* [[S. Ali Mirsoleimani]], [https://dblp.uni-trier.de/pers/hd/k/Karami:Ali Ali Karami], [https://dblp.uni-trier.de/pers/hd/k/Khunjush:Farshad Farshad Khunjush] ('''2014'''). ''[https://link.springer.com/chapter/10.1007/978-3-319-04891-8_12 A Two-Tier Design Space Exploration Algorithm to Construct a GPU Performance Predictor]''. [https://dblp.uni-trier.de/db/conf/arcs/arcs2014.html ARCS 2014], [https://en.wikipedia.org/wiki/Lecture_Notes_in_Computer_Science Lecture Notes in Computer Science], Vol. 8350, [https://en.wikipedia.org/wiki/Springer_Science%2BBusiness_Media Springer]<br />
* [[Steinar H. Gunderson]] ('''2014'''). ''[https://archive.fosdem.org/2014/schedule/event/movit/ Movit: High-speed, high-quality video filters on the GPU]''. [https://en.wikipedia.org/wiki/FOSDEM FOSDEM] [https://archive.fosdem.org/2014/ 2014], [https://movit.sesse.net/movit-fosdem2014.pdf pdf]<br />
* [https://dblp.org/pid/54/784.html Baoquan Liu], [https://scholar.google.com/citations?user=VspO6ZUAAAAJ&hl=en Alexandru Telea], [https://scholar.google.com/citations?user=jCFYHlkAAAAJ&hl=en Jos Roerdink], [https://dblp.org/pid/87/6797.html Gordon Clapworthy], [https://dblp.org/pid/28/977-2.html David Williams], [https://dblp.org/pid/88/5343-1.html Po Yang], [https://www.strath.ac.uk/staff/dongfengprofessor/ Feng Dong], [[Valeriu Codreanu]], [https://scholar.google.com/citations?user=8WO6cVUAAAAJ&hl=en Alessandro Chiarini] ('''2014'''). ''Parallel centerline extraction on the GPU''. [https://www.journals.elsevier.com/computers-and-graphics Computers & Graphics], Vol. 41, [https://strathprints.strath.ac.uk/70614/1/Liu_etal_CG2014_Parallel_centerline_extraction_GPU.pdf pdf]<br />
==2015 ...==<br />
* [[Peter H. Jin]], [[Kurt Keutzer]] ('''2015'''). ''Convolutional Monte Carlo Rollouts in Go''. [http://arxiv.org/abs/1512.03375 arXiv:1512.03375] » [[Deep Learning]], [[Go]], [[Monte-Carlo Tree Search|MCTS]]<br />
* [[Liang Li]], [[Hong Liu]], [[Hao Wang]], [[Taoying Liu]], [[Wei Li]] ('''2015'''). ''[https://ieeexplore.ieee.org/document/6868996 A Parallel Algorithm for Game Tree Search Using GPGPU]''. [[IEEE#TPDS|IEEE Transactions on Parallel and Distributed Systems]], Vol. 26, No. 8 » [[Parallel Search]]<br />
* [[Simon Portegies Zwart]], [https://github.com/jbedorf Jeroen Bédorf] ('''2015'''). ''[https://www.computer.org/csdl/magazine/co/2015/11/mco2015110050/13rRUx0Pqwe Using GPUs to Enable Simulation with Computational Gravitational Dynamics in Astrophysics]''. [[IEEE #Computer|IEEE Computer]], Vol. 48, No. 11<br />
'''2016'''<br />
* <span id="Astro"></span>[https://www.linkedin.com/in/sean-sheen-b99aba89 Sean Sheen] ('''2016'''). ''[https://digitalcommons.calpoly.edu/theses/1567/ Astro - A Low-Cost, Low-Power Cluster for CPU-GPU Hybrid Computing using the Jetson TK1]''. Master's thesis, [https://en.wikipedia.org/wiki/California_Polytechnic_State_University California Polytechnic State University], [https://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=2723&context=theses pdf] <ref>[http://www.nvidia.com/object/jetson-tk1-embedded-dev-kit.html Jetson TK1 Embedded Development Kit | NVIDIA]</ref> <ref>[http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016</ref><br />
* [https://scholar.google.com/citations?user=YyD7mwcAAAAJ&hl=en Jingyue Wu], [https://scholar.google.com/citations?user=EJcIByYAAAAJ&hl=en Artem Belevich], [https://scholar.google.com/citations?user=X5WAGdEAAAAJ&hl=en Eli Bendersky], [https://www.linkedin.com/in/mark-heffernan-873b663/ Mark Heffernan], [https://scholar.google.com/citations?user=Guehv9sAAAAJ&hl=en Chris Leary], [https://scholar.google.com/citations?user=fAmfZAYAAAAJ&hl=en Jacques Pienaar], [http://www.broune.com/ Bjarke Roune], [https://scholar.google.com/citations?user=Der7mNMAAAAJ&hl=en Rob Springer], [https://scholar.google.com/citations?user=zvfOH0wAAAAJ&hl=en Xuetian Weng], [https://scholar.google.com/citations?user=s7VCtl8AAAAJ&hl=en Robert Hundt] ('''2016'''). ''[https://dl.acm.org/citation.cfm?id=2854041 gpucc: an open-source GPGPU compiler]''. [https://cgo.org/cgo2016/ CGO 2016]<br />
* [[David Silver]], [[Shih-Chieh Huang|Aja Huang]], [[Chris J. Maddison]], [[Arthur Guez]], [[Laurent Sifre]], [[George van den Driessche]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Veda Panneershelvam]], [[Marc Lanctot]], [[Sander Dieleman]], [[Dominik Grewe]], [[John Nham]], [[Nal Kalchbrenner]], [[Ilya Sutskever]], [[Timothy Lillicrap]], [[Madeleine Leach]], [[Koray Kavukcuoglu]], [[Thore Graepel]], [[Demis Hassabis]] ('''2016'''). ''[http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html Mastering the game of Go with deep neural networks and tree search]''. [https://en.wikipedia.org/wiki/Nature_%28journal%29 Nature], Vol. 529 » [[AlphaGo]]<br />
* [[Balázs Jako|Balázs Jákó]] ('''2016'''). ''[https://www.semanticscholar.org/paper/Hardware-accelerated-hybrid-rendering-on-PowerVR-J%C3%A1k%C3%B3/d9d7f5784263c5abdcd6c1bf93267e334468b9b2 Hardware accelerated hybrid rendering on PowerVR GPUs]''. <ref>[https://en.wikipedia.org/wiki/PowerVR PowerVR from Wikipedia]</ref> [[IEEE]] [https://ieeexplore.ieee.org/xpl/conhome/7547434/proceeding 20th Jubilee International Conference on Intelligent Engineering Systems]<br />
* [[Diogo R. Ferreira]], [https://dblp.uni-trier.de/pers/hd/s/Santos:Rui_M= Rui M. Santos] ('''2016'''). ''[https://github.com/diogoff/transition-counting-gpu Parallelization of Transition Counting for Process Mining on Multi-core CPUs and GPUs]''. [https://dblp.uni-trier.de/db/conf/bpm/bpmw2016.html BPM 2016]<br />
* [https://dblp.org/pers/hd/s/Sch=uuml=tt:Ole Ole Schütt], [https://developer.nvidia.com/blog/author/peter-messmer/ Peter Messmer], [https://scholar.google.ch/citations?user=ajbBWN0AAAAJ&hl=en Jürg Hutter], [[Joost VandeVondele]] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/10.1002/9781118670712.ch8 GPU Accelerated Sparse Matrix–Matrix Multiplication for Linear Scaling Density Functional Theory]''. [https://www.cp2k.org/_media/gpu_book_chapter_submitted.pdf pdf] <ref>[https://en.wikipedia.org/wiki/Density_functional_theory Density functional theory from Wikipedia]</ref><br />
: Chapter 8 in [https://scholar.google.com/citations?user=AV307ZUAAAAJ&hl=en Ross C. Walker], [https://scholar.google.com/citations?user=PJusscIAAAAJ&hl=en Andreas W. Götz] ('''2016'''). ''[https://onlinelibrary.wiley.com/doi/book/10.1002/9781118670712 Electronic Structure Calculations on Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics]''. [https://en.wikipedia.org/wiki/Wiley_(publisher) John Wiley & Sons]<br />
'''2017'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2017'''). ''Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm''. [https://arxiv.org/abs/1712.01815 arXiv:1712.01815] » [[AlphaZero]]<br />
* [[Tristan Cazenave]] ('''2017'''). ''[http://ieeexplore.ieee.org/document/7875402/ Residual Networks for Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. PP, No. 99, [http://www.lamsade.dauphine.fr/~cazenave/papers/resnet.pdf pdf]<br />
* [https://scholar.google.com/citations?user=zLksndkAAAAJ&hl=en Jayvant Anantpur], [https://dblp.org/pid/09/10702.html Nagendra Gulur Dwarakanath], [https://dblp.org/pid/16/4410.html Shivaram Kalyanakrishnan], [[Shalabh Bhatnagar]], [https://dblp.org/pid/45/3592.html R. Govindarajan] ('''2017'''). ''RLWS: A Reinforcement Learning based GPU Warp Scheduler''. [https://arxiv.org/abs/1712.04303 arXiv:1712.04303]<br />
'''2018'''<br />
* [[David Silver]], [[Thomas Hubert]], [[Julian Schrittwieser]], [[Ioannis Antonoglou]], [[Matthew Lai]], [[Arthur Guez]], [[Marc Lanctot]], [[Laurent Sifre]], [[Dharshan Kumaran]], [[Thore Graepel]], [[Timothy Lillicrap]], [[Karen Simonyan]], [[Demis Hassabis]] ('''2018'''). ''[http://science.sciencemag.org/content/362/6419/1140 A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play]''. [https://en.wikipedia.org/wiki/Science_(journal) Science], Vol. 362, No. 6419<br />
<br />
=Forum Posts= <br />
==2005 ...==<br />
* [http://www.open-aurec.com/wbforum/viewtopic.php?f=4&t=5480 Hardware assist] by [[Nicolai Czempin]], [[Computer Chess Forums|Winboard Forum]], August 27, 2006<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22732 Monte carlo on a NVIDIA GPU ?] by [[Marco Costalba]], [[CCC]], August 01, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32750 Using the GPU] by [[Louis Zulli]], [[CCC]], February 19, 2010<br />
'''2011'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38002 GPGPU and computer chess] by Wim Sjoho, [[CCC]], February 09, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=38478 Possible Board Presentation and Move Generation for GPUs?] by [[Srdja Matovic]], [[CCC]], March 19, 2011<br />
: [http://www.talkchess.com/forum/viewtopic.php?t=38478&start=8 Re: Possible Board Presentation and Move Generation for GPUs] by [[Steffan Westcott]], [[CCC]], March 20, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39459 Zeta plays chess on a gpu] by [[Srdja Matovic]], [[CCC]], June 23, 2011 » [[Zeta]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=39606 GPU Search Methods] by [[Joshua Haglund]], [[CCC]], July 04, 2011<br />
'''2012'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?topic_view=threads&p=442052&t=41853 Possible Search Algorithms for GPUs?] by [[Srdja Matovic]], [[CCC]], January 07, 2012 <ref>[[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2</ref> <ref>[[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=42590 uct on gpu] by [[Daniel Shawul]], [[CCC]], February 24, 2012 » [[UCT]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=43971 Is there such a thing as branchless move generation?] by [[John Hamlen]], [[CCC]], June 07, 2012 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44014 Choosing a GPU platform: AMD and Nvidia] by [[John Hamlen]], [[CCC]], June 10, 2012<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46277 Nvidias K20 with Recursion] by [[Srdja Matovic]], [[CCC]], December 04, 2012 <ref>[http://www.techpowerup.com/173846/Tesla-K20-GPU-Compute-Processor-Specifications-Released.html Tesla K20 GPU Compute Processor Specifications Released | techPowerUp]</ref><br />
'''2013'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=46974 Kogge Stone, Vector Based] by [[Srdja Matovic]], [[CCC]], January 22, 2013 » [[Kogge-Stone Algorithm]] <ref>[https://en.wikipedia.org/wiki/Parallel_Thread_Execution Parallel Thread Execution from Wikipedia]</ref> <ref>NVIDIA Compute PTX: Parallel Thread Execution, ISA Version 1.4, March 31, 2009, [http://www.nvidia.com/content/CUDA-ptx_isa_1.4.pdf pdf]</ref><br />
* [http://www.talkchess.com/forum/viewtopic.php?t=47344 GPU chess engine] by Samuel Siltanen, [[CCC]], February 27, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013 » [[Perft]], [[Kogge-Stone Algorithm]] <ref>[https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub]</ref><br />
==2015 ...==<br />
'''2016'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=60386 GPU chess update, local memory...] by [[Srdja Matovic]], [[CCC]], June 06, 2016<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61761 Jetson GPU architecture] by [[Dann Corbit]], [[CCC]], October 18, 2016 » [[GPU#Astro|Astro]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=61925 Pigeon is now running on the GPU] by [[Stuart Riffle]], [[CCC]], November 02, 2016 » [[Pigeon]]<br />
'''2017'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=63346 Back to the basics, generating moves on gpu in parallel...] by [[Srdja Matovic]], [[CCC]], March 05, 2017 » [[Move Generation]]<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=64983&start=9 Re: Perft(15): comparison of estimates with Ankan's result] by [[Ankan Banerjee]], [[CCC]], August 26, 2017 » [[Perft#15|Perft(15)]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=32317 Chess Engine and GPU] by Fishpov, [[Computer Chess Forums|Rybka Forum]], October 09, 2017<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66025 To TPU or not to TPU...] by [[Srdja Matovic]], [[CCC]], December 16, 2017 » [[Deep Learning]] <ref>[https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]</ref><br />
'''2018'''<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=66280 Announcing lczero] by [[Gary Linscott|Gary]], [[CCC]], January 09, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=67347 GPU ANN, how to deal with host-device latencies?] by [[Srdja Matovic]], [[CCC]], May 06, 2018 » [[Neural Networks]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=67357 GPU contention] by [[Ian Kennedy]], [[CCC]], May 07, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448 How good is the RTX 2080 Ti for Leela?] by Hai, [[CCC]], September 15, 2018 » [[Leela Chess Zero]] <ref>[https://en.wikipedia.org/wiki/GeForce_20_series GeForce 20 series from Wikipedia]</ref><br />
: [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68448&start=2 Re: How good is the RTX 2080 Ti for Leela?] by [[Ankan Banerjee]], [[CCC]], September 16, 2018<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=68973 My non-OC RTX 2070 is very fast with Lc0] by [[Kai Laskos]], [[CCC]], November 19, 2018 » [[Leela Chess Zero]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69400 LC0 using 4 x 2080 Ti GPU's on Chess.com tourney?] by M. Ansari, [[CCC]], December 28, 2018 » [[Leela Chess Zero]]<br />
'''2019'''<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447 Generate EGTB with graphics cards?] by [[Pham Hong Nguyen|Nguyen Pham]], [[CCC]], January 01, 2019 » [[Endgame Tablebases]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=69478 LCZero FAQ is missing one important fact] by [[Jouni Uski]], [[CCC]], January 01, 2019 » [[Leela Chess Zero]]<br />
* [https://groups.google.com/d/msg/lczero/I0lTgR-fFFU/NGC3kJDzAwAJ Michael Larabel benches lc0 on various GPUs] by [[Warren D. Smith]], [[Computer Chess Forums|LCZero Forum]], January 14, 2019 » [[Leela Chess Zero#Lc0|Lc0]] <ref>[https://en.wikipedia.org/wiki/Phoronix_Test_Suite Phoronix Test Suite from Wikipedia]</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=70362 Using LC0 with one or two GPUs - a guide] by [[Srdja Matovic]], [[CCC]], March 30, 2019 » [[Leela Chess Zero#Lc0|Lc0]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=70584 Wouldn't it be nice if C++ GPU] by [[Chris Whittington]], [[CCC]], April 25, 2019 » [[Cpp|C++]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71058 Lazy-evaluation of futures for parallel work-efficient Alpha-Beta search] by Percival Tiglao, [[CCC]], June 06, 2019<br />
* [https://www.game-ai-forum.org/viewtopic.php?f=21&t=694 My home-made CUDA kernel for convolutions] by [[Rémi Coulom]], [[Computer Chess Forums|Game-AI Forum]], November 09, 2019 » [[Deep Learning]]<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=72320 GPU rumors 2020] by [[Srdja Matovic]], [[CCC]], November 13, 2019<br />
==2020 ...==<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=74771 AB search with NN on GPU...] by [[Srdja Matovic]], [[CCC]], August 13, 2020 » [[Neural Networks]] <ref>[https://forums.developer.nvidia.com/t/kernel-launch-latency/62455 kernel launch latency - CUDA / CUDA Programming and Performance - NVIDIA Developer Forums] by LukeCuda, June 18, 2018</ref><br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75073 I stumbled upon this article on the new Nvidia RTX GPUs] by [[Kai Laskos]], [[CCC]], September 10, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=2&t=75639 Will AMD RDNA2 based Radeon RX 6000 series kick butt with Lc0?] by [[Srdja Matovic]], [[CCC]], November 01, 2020<br />
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=76986 Zeta with NNUE on GPU?] by [[Srdja Matovic]], [[CCC]], March 31, 2021 » [[Zeta]], [[NNUE]]<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=77097 GPU rumors 2021] by [[Srdja Matovic]], [[CCC]], April 16, 2021<br />
* [https://www.talkchess.com/forum3/viewtopic.php?f=7&t=79078 Comparison of all known Sliding lookup algorithms <nowiki>[CUDA]</nowiki>] by [[Daniel Infuehr]], [[CCC]], January 08, 2022 » [[Sliding Piece Attacks]]<br />
<br />
=External Links= <br />
* [https://en.wikipedia.org/wiki/Graphics_processing_unit Graphics processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Video_card Video card from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Heterogeneous_System_Architecture Heterogeneous System Architecture from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Tensor_processing_unit Tensor processing unit from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units General-purpose computing on graphics processing units (GPGPU) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units List of AMD graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Intel_graphics_processing_units List of Intel graphics processing units from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units List of Nvidia graphics processing units from Wikipedia]<br />
* [https://developer.nvidia.com/ NVIDIA Developer]<br />
* [https://developer.nvidia.com/nvidia-gpu-programming-guide NVIDIA GPU Programming Guide]<br />
==OpenCL==<br />
* [https://en.wikipedia.org/wiki/OpenCL OpenCL from Wikipedia]<br />
* [https://www.codeproject.com/Articles/110685/Part-1-OpenCL-Portable-Parallelism Part 1: OpenCL™ – Portable Parallelism - CodeProject]<br />
* [https://www.codeproject.com/Articles/122405/Part-2-OpenCL-Memory-Spaces Part 2: OpenCL™ – Memory Spaces - CodeProject]<br />
==CUDA==<br />
* [https://en.wikipedia.org/wiki/CUDA CUDA from Wikipedia]<br />
* [https://developer.nvidia.com/cuda-zone CUDA Zone | NVIDIA Developer]<br />
* [https://en.wikipedia.org/wiki/NVIDIA_CUDA_Compiler Nvidia CUDA Compiler (NVCC) from Wikipedia]<br />
* [https://llvm.org/docs/CompileCudaWithLLVM.html Compiling CUDA with clang] — [https://en.wikipedia.org/wiki/LLVM LLVM] [https://en.wikipedia.org/wiki/Clang Clang] documentation <br />
* [https://github.com/cppcon/cppcon2016 CppCon 2016]: “Bringing Clang and C++ to GPUs: An Open-Source, CUDA-Compatible GPU C++ Compiler” by [https://github.com/jlebar Justin Lebar], [https://en.wikipedia.org/wiki/YouTube YouTube] Video <ref>[http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69447&start=1 Re: Generate EGTB with graphics cards?] by [http://www.indriid.com/ Graham Jones], [[CCC]], January 01, 2019</ref><br />
: : {{#evu:https://www.youtube.com/watch?v=KHa-OSrZPGo|alignment=left|valignment=top}}<br />
==Deep Learning==<br />
* [https://developer.nvidia.com/deep-learning Deep Learning | NVIDIA Developer] » [[Deep Learning]]<br />
* [https://developer.nvidia.com/cudnn NVIDIA cuDNN | NVIDIA Developer]<br />
* [http://parse.ele.tue.nl/education/cluster2 Efficient mapping of the training of Convolutional Neural Networks to a CUDA-based cluster]<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/ Deep Learning in a Nutshell: Core Concepts] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], November 3, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/ Deep Learning in a Nutshell: History and Training] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], December 16, 2015<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-sequence-learning/ Deep Learning in a Nutshell: Sequence Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], March 7, 2016<br />
* [https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/ Deep Learning in a Nutshell: Reinforcement Learning] by [http://timdettmers.com/ Tim Dettmers], [https://devblogs.nvidia.com/parallelforall/ Parallel Forall], September 8, 2016<br />
* [https://blog.dominodatalab.com/gpu-computing-and-deep-learning/ Faster deep learning with GPUs and Theano] <br />
* [https://en.wikipedia.org/wiki/Theano_(software) Theano (software) from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/TensorFlow TensorFlow from Wikipedia]<br />
==Game Programming==<br />
* [http://andy-thomason.github.io/lecture_notes/agp/agp_gpgpu_programming.html Advanced game programming | Session 5 - GPGPU programming] by [[Andy Thomason]]<br />
* [https://zero.sjeng.org/ Leela Zero] by [[Gian-Carlo Pascutto]] » [[Leela Zero]]<br />
: [https://github.com/gcp/leela-zero GitHub - gcp/leela-zero: Go engine with no human-provided knowledge, modeled after the AlphaGo Zero paper]<br />
==Chess Programming==<br />
* [https://chessgpgpu.blogspot.com/ Chess on a GPGPU]<br />
* [http://gpuchess.blogspot.com/ GPU Chess Blog]<br />
* [https://github.com/ankan-ban/perft_gpu ankan-ban/perft_gpu · GitHub] » [[Perft]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=48387 Fast perft on GPU (upto 20 Billion nps w/o hashing)] by [[Ankan Banerjee]], [[CCC]], June 22, 2013</ref><br />
* [https://github.com/LeelaChessZero LCZero · GitHub] » [[Leela Chess Zero]]<br />
* [https://github.com/StuartRiffle/Jaglavak GitHub - StuartRiffle/Jaglavak: Corvid Chess Engine] » [[Jaglavak]]<br />
* [https://zeta-chess.app26.de/ Zeta OpenCL Chess] » [[Zeta]]<br />
<br />
=References= <br />
<references /><br />
'''[[Hardware|Up one Level]]'''<br />
[[Category:Videos]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=Aquarium&diff=26881Aquarium2023-12-24T05:30:37Z<p>Smatovic: </p>
<hr />
<div>'''[[Main Page|Home]] * [[User Interface]] * [[GUI]] * Aquarium'''<br />
<br />
[[FILE:Amaterske akvarium.jpg|border|right|thumb| Aquarium with plants and tropical fish <ref>[https://en.wikipedia.org/wiki/Aquarium Aquarium from Wikipedia]</ref> ]] <br />
<br />
'''Aquarium''',<br/><br />
a sophisticated, commercial [[Windows]] [[GUI]] by [[ChessOK]], developed by [[Victor Zakharov]] and [[Vladimir Makhnychev]] <ref>[http://chessok.com/?page_id=27966 Lomonosov Endgame Tablebases - History Note] - [[ChessOK]]</ref>, supporting [[UCI]] and [[WinBoard]] engines. Aquarium is based on the fluent design introduced by [[Microsoft]] [https://en.wikipedia.org/wiki/Microsoft_Office_2007 Office 2007], featuring a [https://en.wikipedia.org/wiki/Ribbon_%28computing%29 ribbon], a set of [https://en.wikipedia.org/wiki/Toolbar toolbars] placed on [https://en.wikipedia.org/wiki/Tab_%28GUI%29 tabs] <ref>[http://www.computerworld.com/s/article/9003994/Final_Review_The_Lowdown_on_Office_2007 Final Review: The Lowdown on Office 2007] by Richard Ericson, [[Computerworld]], October 11, 2006</ref> <ref>[http://www.exceluser.com/explore/surveys/ribbon/ribbon-survey-results.htm Excel 2007's Ribbon Hurts Productivity, Survey Shows] by [http://www.exceluser.com/contact/kyd.htm Charley Kyd], [http://www.exceluser.com/index.htm Excel User--Reports, analyses, charts, & formulas for business], May, 2009</ref>. Besides the ribbon, the main window is tiled into a navigation window with multiple [https://en.wikipedia.org/wiki/Paned_window panes] for switching modes and documents, and a larger working area with various views of those documents, that is, [[Databases|game databases]], generated game lists and move variation trees resulting from database queries, and a single [[Chess Game|chess game]], optionally with variation trees and [[Game Notation|notations]]. In game playing or analyzing mode, the working area is dominated by a board view associated with [https://en.wikipedia.org/wiki/Dock_%28computing%29 dockable] and [https://en.wikipedia.org/wiki/Stacking_window_manager stackable] notation, multiple-column tree, header, analysis and information windows. 
First released as an [https://en.wikipedia.org/wiki/Aquarium aquarium] for the [[:Category:Fish|fish]] dubbed [[Rybka]], the Aquarium GUI is also bundled with [[Houdini]] <ref>[http://chessok.com/ ChessOK.com: Chess shop from the developers of Rybka 4 Aquarium]</ref>. <br />
<br />
=Screenshot=<br />
[[FILE:DeepRybkaInAqurium7.png|none|border|text-bottom|640px|link=http://chessok.com/shop/index.php?main_page=product_info&cPath=7_1&products_id=440]] <br />
[[Rybka|Deep Rybka 4]] [[Aquarium]] by [[ChessOK]] <ref>[http://chessok.com/shop/index.php?main_page=product_info&cPath=7_1&products_id=440 ChessOK, Chess Shop from the Developers of Rybka 3 Aquarium]</ref> <br />
<br />
=Database=<br />
Aquarium has its own proprietary [[Databases|database]] format, and further supports the format of its stablemate [[Chess Assistant]], [[Portable Game Notation]] and [[Extended Position Description]]. It can read, query and import the [[ChessBase (Database)|ChessBase]] [[ChessBase (Database)#Formats|CBH format]]. Aquarium can also probe endgame tablebases.<br />
<br />
=IDeA=<br />
Aquarium features '''I'''nteractive '''De'''ep '''A'''nalysis, dubbed '''IDeA''': a permanent minimaxed analysis tree which the user can interactively explore and expand, during or after analysis, with various engines.<br />
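The idea of minimaxing a stored analysis tree can be sketched in [[C]] as follows. This is only a minimal illustration of backing up engine scores through a persistent tree (in negamax form), not ChessOK's actual implementation; the <code>Node</code> layout and function name are hypothetical:

```c
#include <stddef.h>

/* Hypothetical node of a stored analysis tree: a position with an
   engine evaluation (from the side to move's point of view) and an
   array of child positions reached by the candidate moves. */
typedef struct Node {
    int          eval;        /* static engine score, side-to-move view */
    size_t       n_children;
    struct Node *children;
} Node;

/* Back up scores through the tree: a leaf keeps its engine
   evaluation, an inner node takes the best negated child score. */
int minimax_tree(const Node *n)
{
    if (n->n_children == 0)
        return n->eval;
    int best = -minimax_tree(&n->children[0]);
    for (size_t i = 1; i < n->n_children; i++) {
        int score = -minimax_tree(&n->children[i]);
        if (score > best)
            best = score;
    }
    return best;
}
```

Whenever a new position is analyzed and appended as a leaf, such a backup pass propagates its score to the root, which is what lets the user steer further expansion toward the critical lines.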
<br />
=See also= <br />
* [[Arena]]<br />
* [[ChessGUI]]<br />
* [[Chess King]]<br />
* [[ChessPartner|ChessPartner GUI]]<br />
* [[Engine Testing]]<br />
* [[Fritz#FritzGUI|Fritz GUI]]<br />
* [[jose]]<br />
* [[Protocols]]<br />
* [[Shredder|Shredder GUI]]<br />
* [[UCI]]<br />
* [[WinBoard]]<br />
<br />
=Forum Posts=<br />
==2008 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=20531 Rybka 3 with new GUI: are they serious?!] by [[Jouni Uski]], [[CCC]], April 05, 2008<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22158 Rybka GUI (Aquarium)'s screenshots II] by Ulysses P., [[CCC]], July 05, 2008<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22834 Aquarium and Engine Matches] by [[Ted Summers]], [[CCC]], August 07, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32952 Aquarium (other GUIs too?) and WB support => I am shocked] by [[Miguel A. Ballicora]], [[CCC]], February 27, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=34728 UCI Implementation of Aquarium is broken (FEN positions)] by [[Miguel A. Ballicora]], [[CCC]], June 05, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=37029 is it worth going from aquarium 3 to 4] by Joseph, [[CCC]], December 10, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=40673 Houdini 2 Aquarium released] by [[Robert Houdart]], [[CCC]], October 08, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44989 Aquarium?] by Carl Langan, [[CCC]], September 03, 2012<br />
* [http://www.open-chess.org/viewtopic.php?f=5&t=2093 Aquarium IDEA, repetitions, and minimax over cycles] by kevinfat, [[Computer Chess Forums|OpenChess Forum]], September 17, 2012 » [[Repetitions]], [[Graph History Interaction]] <ref>[http://aquariumchess.com/tiki/tiki-index.php?page=IDeA IDeA : ChessOK Aquarium Tiki]</ref><br />
* [http://www.open-chess.org/viewtopic.php?f=7&t=2234 Opening Book (for Aquarium)] by andytl755, [[Computer Chess Forums|OpenChess Forum]], January 21, 2013 » [[Opening Book]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=28101 Aquarium 2014] by Shaun, [[Computer Chess Forums|Rybka Forum]], December 09, 2013<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=62321 Changes in Aquarium 2017?] by Matthew Friend, [[CCC]], November 29, 2016<br />
==2020 ...==<br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=83004&p=956374#p956374 Re: ChessAssistant is also available on runway 24] by Werewolf, [[CCC]], Saturday Dec 23, 2023<br />
<br />
=External Links=<br />
==Chess GUI==<br />
* [https://shop.chessok.com/index.php?main_page=product_info&cPath=7_43&products_id=967 Aquarium 2023] - ChessOK.com<br />
* [https://shop.chessok.com/index.php?main_page=index&cPath=7_56 Houdini 6 Aquarium 2019] - ChessOK.com » [[Houdini]]<br />
* [http://aquariumchess.com/tiki/tiki-index.php ChessOK Aquarium Tiki]<br />
* Aquarium 2016 by [[Carl Bicknell]], [https://en.wikipedia.org/wiki/YouTube YouTube] Video<br />
: {{#evu:https://www.youtube.com/watch?v=Z9-0ryeOTII|alignment=left|valignment=top}}<br />
<br />
==IDeA==<br />
* [http://aquariumchess.com/tiki/tiki-index.php?page=IDeA IDeA : ChessOK Aquarium Tiki]<br />
* IDeA [https://en.wikipedia.org/wiki/YouTube YouTube] video series by [[Carl Bicknell]]<br />
# [https://youtu.be/MKzCMSlvQ-I IDeA Video 1 Introduction]<br />
# [https://youtu.be/VkFZ1inv7Ks Video 2: IDeA setup]<br />
# [https://youtu.be/jBZR1P8-c9E Video 3: Seeding an IDeA Project Manually]<br />
# [https://youtu.be/7zuGSIN5A4I Video 4: Seeding an IDeA Project using a database]<br />
# [https://youtu.be/sCihn2YmWKM Video 5: Seeding an IDeA Project using an Engine]<br />
# [https://youtu.be/JALGmMkUIXE Video 6: Which is the best IDeA Engine?]<br />
# [https://youtu.be/bWZ4LwO0DkU Appendix 1: Parallel Search and IDeA]<br />
# [https://youtu.be/q5Hmt-alnRE Appendix 2: Hyper Threading and IDeA]<br />
<br />
==Misc==<br />
* [https://en.wikipedia.org/wiki/Aquarium Aquarium from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Aquarium_%28disambiguation%29 Aquarium (disambiguation) from Wikipedia]<br />
<br />
=References= <br />
<references /><br />
<br />
'''[[GUI|Up one Level]]'''<br />
</div>Smatovichttps://www.chessprogramming.org/index.php?title=Chess_Assistant&diff=26880Chess Assistant2023-12-23T14:56:13Z<p>Smatovic: </p>
<hr />
<div>'''[[Main Page|Home]] * [[User Interface]] * [[GUI]] * Chess Assistant'''<br/><br />
'''[[Main Page|Home]] * [[Software]] * [[Databases]] * Chess Assistant'''<br />
<br />
[[FILE:CA13_2.jpg|border|right|thumb|link=http://chessok.com/?page_id=27628| Chess Assistant Screen <ref>[http://chessok.com/?page_id=27628 Chess Assistant 17 with Houdini 5: ChessOK]</ref> ]] <br />
<br />
'''Chess Assistant''',<br/><br />
a chess [[GUI]] and [[Databases|database]] developed by a team around [[Victor Zakharov]] since 1988 <ref>[http://chessok.com/?page_id=262 About - ChessOK.com]</ref>, commercially distributed via [[ChessOK]], a brand name of Convekta Ltd., with early versions running under [[MS-DOS]] and subsequent versions under [[Windows]]. The sophisticated GUI allows the user to search and edit games in a database, probe endgame tablebases, and prepare games for web publishing, and also serves as a [https://en.wikipedia.org/wiki/Front_and_back_ends front end] for playing chess online. <br />
<br />
Chess Assistant 24 is bundled with [[Rybka|Rybka 4]], [[Stockfish|Stockfish 16]], a database of millions of games, and a trial subscription to Chess King Learn courses <ref>[https://shop.chessok.com/index.php?main_page=product_info&cPath=7_54&products_id=979 Chess Assistant 24 - ChessOK.com]</ref>.<br />
<br />
=Database=<br />
The CA database is an organized collection of up to millions of [[Chess Game|chess games]], either in the form of (compressed) [[Portable Game Notation|PGN]] as an interchange format, or in the proprietary CA format, managing classifiers, tree structures and datasets for [https://en.wikipedia.org/wiki/Data_mining data mining] and faster [[Chess Query Language|CQL]] access. <br />
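For illustration, a short game in the PGN interchange format consists of the seven required tag pairs followed by the movetext; the game data below is a generic example, not taken from a CA database:

```
[Event "Casual Game"]
[Site "?"]
[Date "2023.12.23"]
[Round "-"]
[White "Player, A"]
[Black "Player, B"]
[Result "1-0"]

1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O Be7 1-0
```

The proprietary CA format stores the same game information in precomputed tree structures, which is what makes classification and [[Chess Query Language|CQL]]-style queries faster than scanning flat PGN text.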
<br />
=CA Engines=<br />
Chess Assistant is compatible with most modern chess engines. It supports a variety of [[Protocols|protocols]], such as the [[Chess Engine Communication Protocol]] and [[UCI]]. Various Chess Assistant versions have been bundled with commercial and free engines over time:<br />
<br />
* [[Chess Tiger]]<br />
* [[Crafty]]<br />
* [[Dragon (Chess Assistant)|Dragon (CA)]]<br />
* [[Ruffian]]<br />
* [[Rybka]]<br />
* [[Houdini]]<br />
* [[Stockfish]]<br />
<br />
=See also=<br />
* [[Aquarium]]<br />
* [[ChessBase (Database)]]<br />
* [[Chess King]]<br />
* [[Chess Query Language]]<br />
* [[jose]]<br />
* [[Lomonosov Tablebases]]<br />
* [[NICBase]]<br />
* [[Portable Game Notation]]<br />
* [[SCID]]<br />
: [[ChessDB]] <br />
: [[ChessX]]<br />
: [[Scid vs. PC]]<br />
* [[TascBase]]<br />
<br />
=Reviews=<br />
* [http://ca.chessok.com/AuthorBios/BobPawlak.html Articles] by [[Robert Pawlak]]<br />
* [http://www.jovanpetronic.com/download/chessreviews/Convekta%20Chess%20Assistant%209%20Professional%20package%20review%20IM.FST.%20Jovan%20Petronic.pdf Convekta-Chess Assistant 9 Professional package review] (pdf) by [http://www.jovanpetronic.com/ Jovan Petronic], 2006<br />
<br />
=Forum Posts=<br />
==1990 ...==<br />
* [https://groups.google.com/d/msg/rec.games.chess/OC2DrsN7wkA/b60hK_ErcoAJ CHESS ASSISTANT vs ChessBase, NicBase?] by CCHB, [[Computer Chess Forums|rgc]], September 27, 1993 » [[ChessBase (Database)]], [[NICBase]]<br />
* [https://groups.google.com/d/msg/rec.games.chess/Z72gdE4292Q/hAaa0d_PgisJ ChessBase, ChessAssistant and NicBase] by Richard Reich, [[Computer Chess Forums|rgc]], December 03, 1994<br />
* [https://www.stmintz.com/ccc/index.php?id=14913 Chess Assistant 3.0*] by [[Albert Silver]], [[CCC]], February 06, 1998<br />
* [https://www.stmintz.com/ccc/index.php?id=19068 illegal copies of Chess Assistant program] by [[Sergey Abramov]], [[CCC]], May 22, 1998<br />
* [https://www.stmintz.com/ccc/index.php?id=83186 Chess Assistant 5.0 (Chess Tiger 12.0 is included) in January, 2000] by [[Victor Zakharov]], [[CCC]], December 18, 1999 » [[Chess Tiger]]<br />
==2000 ...==<br />
* [https://www.stmintz.com/ccc/index.php?id=161551 Chess Assistant 6 / Tiger14 / Tablebases FAQ] by [[Victor Zakharov]], [[CCC]], April 03, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=181150 Chessbase 8 or Chess Assistant 6?] by John Dahlem, [[CCC]], July 25, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=184666 Chess Assistant 6.1 is available] by [[Victor Zakharov]], [[CCC]], August 21, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=192532 *FREE* Chess Assistant Light is out!] by [[Albert Silver]], [[CCC]], October 09, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=203012 Shredder 6 works perfectly in Chess Assistant 6] by [[Albert Silver]], [[CCC]], December 21, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=264805 Chess Assistant 7 is awesome] by George Sobala, [[CCC]], November 13, 2002<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=28453 ChessAssistant under Linux with wine] by [[Kurt Utzinger]], [[CCC]], June 16, 2009 <br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=83004 ChessAssistant is also available on runway 24] by [[Frank Quisinsky]], [[CCC]], December 12, 2023<br />
<br />
=External Links=<br />
* [https://shop.chessok.com/index.php?main_page=product_info&cPath=7_54&products_id=979 Chess Assistant 24 - ChessOK.com]<br />
* [http://chessok.com/?page_id=19894 Chess Assistant - ChessOK.com]<br />
* [http://chessok.com/rolik/ca/content.html Chess Assistant - Video Tutorials]<br />
* [https://en.wikipedia.org/wiki/Chess_Assistant Chess Assistant from Wikipedia]<br />
<br />
=References= <br />
<references /><br />
'''[[GUI|Up one Level]]'''<br />
[[Category:Commercial]]<br />
[[Category:Database]]<br />
[[Category:ChessOK]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=Chess_Assistant&diff=26879Chess Assistant2023-12-23T14:55:20Z<p>Smatovic: </p>
<hr />
<div>'''[[Main Page|Home]] * [[User Interface]] * [[GUI]] * Chess Assistant'''<br/><br />
'''[[Main Page|Home]] * [[Software]] * [[Databases]] * Chess Assistant'''<br />
<br />
[[FILE:CA13_2.jpg|border|right|thumb|link=http://chessok.com/?page_id=27628| Chess Assistant Screen <ref>[http://chessok.com/?page_id=27628 Chess Assistant 17 with Houdini 5: ChessOK]</ref> ]] <br />
<br />
'''Chess Assistant''',<br/><br />
a chess [[GUI]] and [[Databases|database]] developed by a team around [[Victor Zakharov]] since 1988 <ref>[http://chessok.com/?page_id=262 About - ChessOK.com]</ref>, commercially distributed via [[ChessOK]], a brand name of Convekta Ltd., with early versions running under [[MS-DOS]] and subsequent versions under [[Windows]]. The sophisticated GUI allows the user to search and edit games in a database, prepare games for web publishing, probe endgame tablebases, and also serves as a [https://en.wikipedia.org/wiki/Front_and_back_ends front end] for playing chess online. <br />
<br />
Chess Assistant 24 is bundled with [[Rybka|Rybka 4]], [[Stockfish|Stockfish 16]], a database of millions of games, and a trial subscription to Chess King Learn courses <ref>[https://shop.chessok.com/index.php?main_page=product_info&cPath=7_54&products_id=979 Chess Assistant 24 - ChessOK.com]</ref>.<br />
<br />
=Database=<br />
The CA database is an organized collection of up to millions of [[Chess Game|chess games]], either in the form of (compressed) [[Portable Game Notation|PGN]] as an interchange format, or in the proprietary CA format, managing classifiers, tree structures and datasets for [https://en.wikipedia.org/wiki/Data_mining data mining] and faster [[Chess Query Language|CQL]] access. <br />
<br />
=CA Engines=<br />
Chess Assistant is compatible with most modern chess engines. It supports a variety of [[Protocols|protocols]], such as the [[Chess Engine Communication Protocol]] and [[UCI]]. Various Chess Assistant versions have been bundled with commercial and free engines over time:<br />
<br />
* [[Chess Tiger]]<br />
* [[Crafty]]<br />
* [[Dragon (Chess Assistant)|Dragon (CA)]]<br />
* [[Ruffian]]<br />
* [[Rybka]]<br />
* [[Houdini]]<br />
* [[Stockfish]]<br />
<br />
=See also=<br />
* [[Aquarium]]<br />
* [[ChessBase (Database)]]<br />
* [[Chess King]]<br />
* [[Chess Query Language]]<br />
* [[jose]]<br />
* [[Lomonosov Tablebases]]<br />
* [[NICBase]]<br />
* [[Portable Game Notation]]<br />
* [[SCID]]<br />
: [[ChessDB]] <br />
: [[ChessX]]<br />
: [[Scid vs. PC]]<br />
* [[TascBase]]<br />
<br />
=Reviews=<br />
* [http://ca.chessok.com/AuthorBios/BobPawlak.html Articles] by [[Robert Pawlak]]<br />
* [http://www.jovanpetronic.com/download/chessreviews/Convekta%20Chess%20Assistant%209%20Professional%20package%20review%20IM.FST.%20Jovan%20Petronic.pdf Convekta-Chess Assistant 9 Professional package review] (pdf) by [http://www.jovanpetronic.com/ Jovan Petronic], 2006<br />
<br />
=Forum Posts=<br />
==1990 ...==<br />
* [https://groups.google.com/d/msg/rec.games.chess/OC2DrsN7wkA/b60hK_ErcoAJ CHESS ASSISTANT vs ChessBase, NicBase?] by CCHB, [[Computer Chess Forums|rgc]], September 27, 1993 » [[ChessBase (Database)]], [[NICBase]]<br />
* [https://groups.google.com/d/msg/rec.games.chess/Z72gdE4292Q/hAaa0d_PgisJ ChessBase, ChessAssistant and NicBase] by Richard Reich, [[Computer Chess Forums|rgc]], December 03, 1994<br />
* [https://www.stmintz.com/ccc/index.php?id=14913 Chess Assistant 3.0*] by [[Albert Silver]], [[CCC]], February 06, 1998<br />
* [https://www.stmintz.com/ccc/index.php?id=19068 illegal copies of Chess Assistant program] by [[Sergey Abramov]], [[CCC]], May 22, 1998<br />
* [https://www.stmintz.com/ccc/index.php?id=83186 Chess Assistant 5.0 (Chess Tiger 12.0 is included) in January, 2000] by [[Victor Zakharov]], [[CCC]], December 18, 1999 » [[Chess Tiger]]<br />
==2000 ...==<br />
* [https://www.stmintz.com/ccc/index.php?id=161551 Chess Assistant 6 / Tiger14 / Tablebases FAQ] by [[Victor Zakharov]], [[CCC]], April 03, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=181150 Chessbase 8 or Chess Assistant 6?] by John Dahlem, [[CCC]], July 25, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=184666 Chess Assistant 6.1 is available] by [[Victor Zakharov]], [[CCC]], August 21, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=192532 *FREE* Chess Assistant Light is out!] by [[Albert Silver]], [[CCC]], October 09, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=203012 Shredder 6 works perfectly in Chess Assistant 6] by [[Albert Silver]], [[CCC]], December 21, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=264805 Chess Assistant 7 is awesome] by George Sobala, [[CCC]], November 13, 2002<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=28453 ChessAssistant under Linux with wine] by [[Kurt Utzinger]], [[CCC]], June 16, 2009 <br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=83004 ChessAssistant is also available on runway 24] by [[Frank Quisinsky]], [[CCC]], December 12, 2023<br />
<br />
=External Links=<br />
* [https://shop.chessok.com/index.php?main_page=product_info&cPath=7_54&products_id=979 Chess Assistant 24 - ChessOK.com]<br />
* [http://chessok.com/?page_id=19894 Chess Assistant - ChessOK.com]<br />
* [http://chessok.com/rolik/ca/content.html Chess Assistant - Video Tutorials]<br />
* [https://en.wikipedia.org/wiki/Chess_Assistant Chess Assistant from Wikipedia]<br />
<br />
=References= <br />
<references /><br />
'''[[GUI|Up one Level]]'''<br />
[[Category:Commercial]]<br />
[[Category:Database]]<br />
[[Category:ChessOK]]</div>Smatovichttps://www.chessprogramming.org/index.php?title=Aquarium&diff=26878Aquarium2023-12-23T14:53:32Z<p>Smatovic: /* Chess GUI */</p>
<hr />
<div>'''[[Main Page|Home]] * [[User Interface]] * [[GUI]] * Aquarium'''<br />
<br />
[[FILE:Amaterske akvarium.jpg|border|right|thumb| Aquarium with plants and tropical fish <ref>[https://en.wikipedia.org/wiki/Aquarium Aquarium from Wikipedia]</ref> ]] <br />
<br />
'''Aquarium''',<br/><br />
a sophisticated, commercial [[Windows]] [[GUI]] by [[ChessOK]], developed by [[Victor Zakharov]] and [[Vladimir Makhnychev]] <ref>[http://chessok.com/?page_id=27966 Lomonosov Endgame Tablebases - History Note] - [[ChessOK]]</ref>, supporting [[UCI]] and [[WinBoard]] engines. Aquarium is based on the fluent design introduced by [[Microsoft]] [https://en.wikipedia.org/wiki/Microsoft_Office_2007 Office 2007], featuring a [https://en.wikipedia.org/wiki/Ribbon_%28computing%29 ribbon], a set of [https://en.wikipedia.org/wiki/Toolbar toolbars] placed on [https://en.wikipedia.org/wiki/Tab_%28GUI%29 tabs] <ref>[http://www.computerworld.com/s/article/9003994/Final_Review_The_Lowdown_on_Office_2007 Final Review: The Lowdown on Office 2007] by Richard Ericson, [[Computerworld]], October 11, 2006</ref> <ref>[http://www.exceluser.com/explore/surveys/ribbon/ribbon-survey-results.htm Excel 2007's Ribbon Hurts Productivity, Survey Shows] by [http://www.exceluser.com/contact/kyd.htm Charley Kyd], [http://www.exceluser.com/index.htm Excel User--Reports, analyses, charts, & formulas for business], May, 2009</ref>. Besides the ribbon, the main window is tiled into a navigation window with multiple [https://en.wikipedia.org/wiki/Paned_window panes] for switching modes and documents, and a larger working area with various views of those documents, that is, [[Databases|game databases]], generated game lists and move variation trees resulting from database queries, and a single [[Chess Game|chess game]], optionally with variation trees and [[Game Notation|notations]]. In game playing or analyzing mode, the working area is dominated by a board view associated with [https://en.wikipedia.org/wiki/Dock_%28computing%29 dockable] and [https://en.wikipedia.org/wiki/Stacking_window_manager stackable] notation, multiple-column tree, header, analysis and information windows. 
First released as an [https://en.wikipedia.org/wiki/Aquarium aquarium] for the [[:Category:Fish|fish]] dubbed [[Rybka]], the Aquarium GUI is also bundled with [[Houdini]] <ref>[http://chessok.com/ ChessOK.com: Chess shop from the developers of Rybka 4 Aquarium]</ref>. <br />
<br />
=Screenshot=<br />
[[FILE:DeepRybkaInAqurium7.png|none|border|text-bottom|640px|link=http://chessok.com/shop/index.php?main_page=product_info&cPath=7_1&products_id=440]] <br />
[[Rybka|Deep Rybka 4]] [[Aquarium]] by [[ChessOK]] <ref>[http://chessok.com/shop/index.php?main_page=product_info&cPath=7_1&products_id=440 ChessOK, Chess Shop from the Developers of Rybka 3 Aquarium]</ref> <br />
<br />
=Database=<br />
Aquarium has its own proprietary [[Databases|database]] format, and further supports the format of its stablemate [[Chess Assistant]], [[Portable Game Notation]] and [[Extended Position Description]]. It can read, query and import the [[ChessBase (Database)|ChessBase]] [[ChessBase (Database)#Formats|CBH format]]. Aquarium can also probe endgame tablebases.<br />
<br />
=IDeA=<br />
Aquarium features '''I'''nteractive '''De'''ep '''A'''nalysis, dubbed '''IDeA''': a permanent minimaxed analysis tree which the user can interactively explore and expand, during or after analysis, with various engines.<br />
<br />
=See also= <br />
* [[Arena]]<br />
* [[ChessGUI]]<br />
* [[Chess King]]<br />
* [[ChessPartner|ChessPartner GUI]]<br />
* [[Engine Testing]]<br />
* [[Fritz#FritzGUI|Fritz GUI]]<br />
* [[jose]]<br />
* [[Protocols]]<br />
* [[Shredder|Shredder GUI]]<br />
* [[UCI]]<br />
* [[WinBoard]]<br />
<br />
=Forum Posts=<br />
==2008 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=20531 Rybka 3 with new GUI: are they serious?!] by [[Jouni Uski]], [[CCC]], April 05, 2008<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22158 Rybka GUI (Aquarium)'s screenshots II] by Ulysses P., [[CCC]], July 05, 2008<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22834 Aquarium and Engine Matches] by [[Ted Summers]], [[CCC]], August 07, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32952 Aquarium (other GUIs too?) and WB support => I am shocked] by [[Miguel A. Ballicora]], [[CCC]], February 27, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=34728 UCI Implementation of Aquarium is broken (FEN positions)] by [[Miguel A. Ballicora]], [[CCC]], June 05, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=37029 is it worth going from aquarium 3 to 4] by Joseph, [[CCC]], December 10, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=40673 Houdini 2 Aquarium released] by [[Robert Houdart]], [[CCC]], October 08, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44989 Aquarium?] by Carl Langan, [[CCC]], September 03, 2012<br />
* [http://www.open-chess.org/viewtopic.php?f=5&t=2093 Aquarium IDEA, repetitions, and minimax over cycles] by kevinfat, [[Computer Chess Forums|OpenChess Forum]], September 17, 2012 » [[Repetitions]], [[Graph History Interaction]] <ref>[http://aquariumchess.com/tiki/tiki-index.php?page=IDeA IDeA : ChessOK Aquarium Tiki]</ref><br />
* [http://www.open-chess.org/viewtopic.php?f=7&t=2234 Opening Book (for Aquarium)] by andytl755, [[Computer Chess Forums|OpenChess Forum]], January 21, 2013 » [[Opening Book]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=28101 Aquarium 2014] by Shaun, [[Computer Chess Forums|Rybka Forum]], December 09, 2013<br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=62321 Changes in Aquarium 2017?] by Matthew Friend, [[CCC]], November 29, 2016<br />
<br />
=External Links=<br />
==Chess GUI==<br />
* [https://shop.chessok.com/index.php?main_page=product_info&cPath=7_43&products_id=967 Aquarium 2023 - ChessOK.com]<br />
* [http://chessok.com/shop/index.php?main_page=index&cPath=7_56 Houdini Aquarium], [[ChessOK]], December 01, 2016 » [[Houdini]]<br />
* [http://aquariumchess.com/tiki/tiki-index.php ChessOK Aquarium Tiki]<br />
* Aquarium 2016 by [[Carl Bicknell]], [https://en.wikipedia.org/wiki/YouTube YouTube] Video<br />
: {{#evu:https://www.youtube.com/watch?v=Z9-0ryeOTII|alignment=left|valignment=top}}<br />
<br />
==IDeA==<br />
* [http://aquariumchess.com/tiki/tiki-index.php?page=IDeA IDeA : ChessOK Aquarium Tiki]<br />
* IDeA [https://en.wikipedia.org/wiki/YouTube YouTube] video series by [[Carl Bicknell]]<br />
# [https://youtu.be/MKzCMSlvQ-I IDeA Video 1 Introduction]<br />
# [https://youtu.be/VkFZ1inv7Ks Video 2: IDeA setup]<br />
# [https://youtu.be/jBZR1P8-c9E Video 3: Seeding an IDeA Project Manually]<br />
# [https://youtu.be/7zuGSIN5A4I Video 4: Seeding an IDeA Project using a database]<br />
# [https://youtu.be/sCihn2YmWKM Video 5: Seeding an IDeA Project using an Engine]<br />
# [https://youtu.be/JALGmMkUIXE Video 6: Which is the best IDeA Engine?]<br />
# [https://youtu.be/bWZ4LwO0DkU Appendix 1: Parallel Search and IDeA]<br />
# [https://youtu.be/q5Hmt-alnRE Appendix 2: Hyper Threading and IDeA]<br />
<br />
==Misc==<br />
* [https://en.wikipedia.org/wiki/Aquarium Aquarium from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Aquarium_%28disambiguation%29 Aquarium (disambiguation) from Wikipedia]<br />
<br />
=References= <br />
<references /><br />
<br />
'''[[GUI|Up one Level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=Aquarium&diff=26877Aquarium2023-12-23T14:50:29Z<p>Smatovic: /* Database */</p>
<hr />
<div>'''[[Main Page|Home]] * [[User Interface]] * [[GUI]] * Aquarium'''<br />
<br />
[[FILE:Amaterske akvarium.jpg|border|right|thumb| Aquarium with plants and tropical fish <ref>[https://en.wikipedia.org/wiki/Aquarium Aquarium from Wikipedia]</ref> ]] <br />
<br />
'''Aquarium''',<br/><br />
a sophisticated, commercial [[Windows]] [[GUI]] by [[ChessOK]], developed by [[Victor Zakharov]] and [[Vladimir Makhnychev]] <ref>[http://chessok.com/?page_id=27966 Lomonosov Endgame Tablebases - History Note] - [[ChessOK]]</ref>, supporting [[UCI]] and [[WinBoard]] engines. Aquarium is based on the fluent design introduced by [[Microsoft]] [https://en.wikipedia.org/wiki/Microsoft_Office_2007 Office 2007], featuring a [https://en.wikipedia.org/wiki/Ribbon_%28computing%29 ribbon], a set of [https://en.wikipedia.org/wiki/Toolbar toolbars] placed on [https://en.wikipedia.org/wiki/Tab_%28GUI%29 tabs] <ref>[http://www.computerworld.com/s/article/9003994/Final_Review_The_Lowdown_on_Office_2007 Final Review: The Lowdown on Office 2007] by Richard Ericson, [[Computerworld]], October 11, 2006</ref> <ref>[http://www.exceluser.com/explore/surveys/ribbon/ribbon-survey-results.htm Excel 2007's Ribbon Hurts Productivity, Survey Shows] by [http://www.exceluser.com/contact/kyd.htm Charley Kyd], [http://www.exceluser.com/index.htm Excel User--Reports, analyses, charts, & formulas for business], May, 2009</ref>. Besides the ribbon, the main window is tiled into a navigation window with multiple [https://en.wikipedia.org/wiki/Paned_window panes] for switching modes and documents, and a larger working area with various views of those documents, that is, [[Databases|game databases]], generated game lists and move variation trees resulting from database queries, and a single [[Chess Game|chess game]], optionally with variation trees and [[Game Notation|notations]]. In game playing or analyzing mode, the working area is dominated by a board view associated with [https://en.wikipedia.org/wiki/Dock_%28computing%29 dockable] and [https://en.wikipedia.org/wiki/Stacking_window_manager stackable] notation, multiple-column tree, header, analysis and information windows. 
First released as an [https://en.wikipedia.org/wiki/Aquarium aquarium] for the [[:Category:Fish|fish]] dubbed [[Rybka]], the Aquarium GUI is also bundled with [[Houdini]] <ref>[http://chessok.com/ ChessOK.com: Chess shop from the developers of Rybka 4 Aquarium]</ref>. <br />
<br />
=Screenshot=<br />
[[FILE:DeepRybkaInAqurium7.png|none|border|text-bottom|640px|link=http://chessok.com/shop/index.php?main_page=product_info&cPath=7_1&products_id=440]] <br />
[[Rybka|Deep Rybka 4]] [[Aquarium]] by [[ChessOK]] <ref>[http://chessok.com/shop/index.php?main_page=product_info&cPath=7_1&products_id=440 ChessOK, Chess Shop from the Developers of Rybka 3 Aquarium]</ref> <br />
<br />
=Database=<br />
Aquarium has its own proprietary [[Databases|database]] format, and further supports the format of its stablemate [[Chess Assistant]], [[Portable Game Notation]] and [[Extended Position Description]]. It can read, query and import the [[ChessBase (Database)|ChessBase]] [[ChessBase (Database)#Formats|CBH format]]. Aquarium can also probe endgame tablebases.<br />
<br />
=IDeA=<br />
Aquarium features '''I'''nteractive '''De'''ep '''A'''nalysis, dubbed '''IDeA''': a permanent minimaxed analysis tree which the user can interactively explore and expand, during or after analysis, with various engines.<br />
<br />
=See also= <br />
* [[Arena]]<br />
* [[ChessGUI]]<br />
* [[Chess King]]<br />
* [[ChessPartner|ChessPartner GUI]]<br />
* [[Engine Testing]]<br />
* [[Fritz#FritzGUI|Fritz GUI]]<br />
* [[jose]]<br />
* [[Protocols]]<br />
* [[Shredder|Shredder GUI]]<br />
* [[UCI]]<br />
* [[WinBoard]]<br />
<br />
=Forum Posts=<br />
==2008 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=20531 Rybka 3 with new GUI: are they serious?!] by [[Jouni Uski]], [[CCC]], April 05, 2008<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22158 Rybka GUI (Aquarium)'s screenshots II] by Ulysses P., [[CCC]], July 05, 2008<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=22834 Aquarium and Engine Matches] by [[Ted Summers]], [[CCC]], August 07, 2008<br />
==2010 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=32952 Aquarium (other GUIs too?) and WB support => I am shocked] by [[Miguel A. Ballicora]], [[CCC]], February 27, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=34728 UCI Implementation of Aquarium is broken (FEN positions)] by [[Miguel A. Ballicora]], [[CCC]], June 05, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=37029 is it worth going from aquarium 3 to 4] by Joseph, [[CCC]], December 10, 2010<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=40673 Houdini 2 Aquarium released] by [[Robert Houdart]], [[CCC]], October 08, 2011<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=44989 Aquarium?] by Carl Langan, [[CCC]], September 03, 2012<br />
* [http://www.open-chess.org/viewtopic.php?f=5&t=2093 Aquarium IDEA, repetitions, and minimax over cycles] by kevinfat, [[Computer Chess Forums|OpenChess Forum]], September 17, 2012 » [[Repetitions]], [[Graph History Interaction]] <ref>[http://aquariumchess.com/tiki/tiki-index.php?page=IDeA IDeA : ChessOK Aquarium Tiki]</ref><br />
* [http://www.open-chess.org/viewtopic.php?f=7&t=2234 Opening Book (for Aquarium)] by andytl755, [[Computer Chess Forums|OpenChess Forum]], January 21, 2013 » [[Opening Book]]<br />
* [http://rybkaforum.net/cgi-bin/rybkaforum/topic_show.pl?tid=28101 Aquarium 2014] by Shaun, [[Computer Chess Forums|Rybka Forum]], December 09, 2013<br />
==2015 ...==<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=62321 Changes in Aquarium 2017?] by Matthew Friend, [[CCC]], November 29, 2016<br />
<br />
=External Links=<br />
==Chess GUI==<br />
* [http://chessok.com/shop/index.php?main_page=index&cPath=7_43 Aquarium 2018 : ChessOK, Chess Shop from the Developers of Houdini 5 Aquarium]<br />
* [http://chessok.com/shop/index.php?main_page=index&cPath=7_56 Houdini Aquarium], [[ChessOK]], December 01, 2016 » [[Houdini]]<br />
* [http://aquariumchess.com/tiki/tiki-index.php ChessOK Aquarium Tiki]<br />
* Aquarium 2016 by [[Carl Bicknell]], [https://en.wikipedia.org/wiki/YouTube YouTube] Video<br />
: {{#evu:https://www.youtube.com/watch?v=Z9-0ryeOTII|alignment=left|valignment=top}}<br />
==IDeA==<br />
* [http://aquariumchess.com/tiki/tiki-index.php?page=IDeA IDeA : ChessOK Aquarium Tiki]<br />
* IDeA [https://en.wikipedia.org/wiki/YouTube YouTube] Video Series by [[Carl Bicknell]]<br />
# [https://youtu.be/MKzCMSlvQ-I IDeA Video 1 Introduction]<br />
# [https://youtu.be/VkFZ1inv7Ks Video 2: IDeA setup]<br />
# [https://youtu.be/jBZR1P8-c9E Video 3: Seeding an IDeA Project Manually]<br />
# [https://youtu.be/7zuGSIN5A4I Video 4: Seeding an IDeA Project using a database]<br />
# [https://youtu.be/sCihn2YmWKM Video 5: Seeding an IDeA Project using an Engine]<br />
# [https://youtu.be/JALGmMkUIXE Video 6: Which is the best IDeA Engine?]<br />
# [https://youtu.be/bWZ4LwO0DkU Appendix 1: Parallel Search and IDeA]<br />
# [https://youtu.be/q5Hmt-alnRE Appendix 2: Hyper Threading and IDeA]<br />
<br />
==Misc==<br />
* [https://en.wikipedia.org/wiki/Aquarium Aquarium from Wikipedia]<br />
* [https://en.wikipedia.org/wiki/Aquarium_%28disambiguation%29 Aquarium (disambiguation) from Wikipedia]<br />
<br />
=References= <br />
<references /><br />
<br />
'''[[GUI|Up one Level]]'''</div>Smatovichttps://www.chessprogramming.org/index.php?title=Chess_Assistant&diff=26876Chess Assistant2023-12-23T14:49:15Z<p>Smatovic: </p>
<hr />
<div>'''[[Main Page|Home]] * [[User Interface]] * [[GUI]] * Chess Assistant'''<br/><br />
'''[[Main Page|Home]] * [[Software]] * [[Databases]] * Chess Assistant'''<br />
<br />
[[FILE:CA13_2.jpg|border|right|thumb|link=http://chessok.com/?page_id=27628| Chess Assistant Screen <ref>[http://chessok.com/?page_id=27628 Chess Assistant 17 with Houdini 5: ChessOK]</ref> ]] <br />
<br />
'''Chess Assistant''',<br/><br />
a chess [[GUI]] and [[Databases|database]] developed by a team around [[Victor Zakharov]] since 1988 <ref>[http://chessok.com/?page_id=262 About - ChessOK.com]</ref>, commercially distributed via [[ChessOK]], a brand name of Convekta Ltd. Early versions ran under [[MS-DOS]], subsequent versions under [[Windows]]. The sophisticated GUI allows searching and editing games in a database, editing games for web publishing, and querying endgame tablebases, and also serves as a [https://en.wikipedia.org/wiki/Front_and_back_ends front end] for playing chess online. <br />
<br />
Chess Assistant 24 is bundled with [[Rybka|Rybka 4]], [[Stockfish|Stockfish 16]], a database of millions of games, and a trial subscription to Chess King Learn courses <ref>[https://shop.chessok.com/index.php?main_page=product_info&cPath=7_54&products_id=979 Chess Assistant 24 - ChessOK.com]</ref>.<br />
<br />
=Database=<br />
The CA database is an organized collection of up to millions of [[Chess Game|chess games]], stored either in the form of (compressed) [[Portable Game Notation|PGN]] as an interchange format, or in a proprietary CA format, which manages classifiers, tree structures and datasets for [https://en.wikipedia.org/wiki/Data_mining data mining] and faster [[Chess Query Language|CQL]] access. <br />
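A PGN game header is a sequence of tag pairs such as <code>[Event "..."]</code>. As a rough illustration of the interchange format mentioned above (not Chess Assistant's actual parser, and with illustrative names), a minimal [[C]] routine to split one tag-pair line into name and value might look like:<br />

```c
#include <string.h>

/* Parse one PGN tag-pair line of the form: [Name "Value"]
   Returns 1 on success, 0 on malformed input. Illustrative only;
   the caller supplies sufficiently large name/value buffers. */
int ParseTagPair(const char *line, char *name, char *value)
{
    const char *p = strchr(line, '[');
    if (!p) return 0;
    p++;                                /* skip '['              */
    const char *q = strchr(p, ' ');
    if (!q) return 0;
    memcpy(name, p, q - p);             /* tag name before space */
    name[q - p] = '\0';
    const char *v1 = strchr(q, '"');
    if (!v1) return 0;
    v1++;                               /* skip opening quote    */
    const char *v2 = strchr(v1, '"');
    if (!v2) return 0;
    memcpy(value, v1, v2 - v1);         /* text between quotes   */
    value[v2 - v1] = '\0';
    return 1;
}
```

A real PGN reader additionally has to handle escaped quotes, the movetext section, and comments, but the tag-pair structure above is the part shared by all interchange-format databases.<br />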
<br />
=CA Engines=<br />
Chess Assistant is compatible with most modern chess engines. It supports a variety of [[Protocols|protocols]] such as the [[Chess Engine Communication Protocol]] and [[UCI]]. Over time, various Chess Assistant versions have been bundled with commercial and free engines:<br />
<br />
* [[Chess Tiger]]<br />
* [[Crafty]]<br />
* [[Dragon (Chess Assistant)|Dragon (CA)]]<br />
* [[Ruffian]]<br />
* [[Rybka]]<br />
* [[Houdini]]<br />
* [[Stockfish]]<br />
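The engine protocols mentioned above are plain-text, line-based exchanges. As a toy sketch of the [[UCI]] handshake shape only (a hypothetical command handler, not any real engine's code): the GUI sends <code>uci</code> and waits for <code>uciok</code>, and <code>isready</code> must be answered with <code>readyok</code>.<br />

```c
#include <stdio.h>
#include <string.h>

/* Map one incoming UCI command to the reply an engine must give.
   Returns 0 when the engine should exit, 1 otherwise. Toy sketch. */
int HandleUciCommand(const char *cmd, char *reply, size_t n)
{
    if (strcmp(cmd, "uci") == 0)
        snprintf(reply, n, "id name Toy\nuciok");   /* identify, then ack */
    else if (strcmp(cmd, "isready") == 0)
        snprintf(reply, n, "readyok");              /* synchronization    */
    else if (strcmp(cmd, "quit") == 0)
        return 0;                                   /* signal loop exit   */
    else
        reply[0] = '\0';                            /* ignore unknowns    */
    return 1;
}
```

A full engine additionally handles <code>position</code>, <code>go</code> and <code>stop</code>, but this request/reply pattern is what lets a GUI such as Chess Assistant drive arbitrary engines.<br />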
<br />
=See also=<br />
* [[Aquarium]]<br />
* [[ChessBase (Database)]]<br />
* [[Chess King]]<br />
* [[Chess Query Language]]<br />
* [[jose]]<br />
* [[Lomonosov Tablebases]]<br />
* [[NICBase]]<br />
* [[Portable Game Notation]]<br />
* [[SCID]]<br />
: [[ChessDB]] <br />
: [[ChessX]]<br />
: [[Scid vs. PC]]<br />
* [[TascBase]]<br />
<br />
=Reviews=<br />
* [http://ca.chessok.com/AuthorBios/BobPawlak.html Articles] by [[Robert Pawlak]]<br />
* [http://www.jovanpetronic.com/download/chessreviews/Convekta%20Chess%20Assistant%209%20Professional%20package%20review%20IM.FST.%20Jovan%20Petronic.pdf Convekta-Chess Assistant 9 Professional package review] (pdf) by [http://www.jovanpetronic.com/ Jovan Petronic], 2006<br />
<br />
=Forum Posts=<br />
==1990 ...==<br />
* [https://groups.google.com/d/msg/rec.games.chess/OC2DrsN7wkA/b60hK_ErcoAJ CHESS ASSISTANT vs ChessBase, NicBase?] by CCHB, [[Computer Chess Forums|rgc]], September 27, 1993 » [[ChessBase (Database)]], [[NICBase]]<br />
* [https://groups.google.com/d/msg/rec.games.chess/Z72gdE4292Q/hAaa0d_PgisJ ChessBase, ChessAssistant and NicBase] by Richard Reich, [[Computer Chess Forums|rgc]], December 03, 1994<br />
* [https://www.stmintz.com/ccc/index.php?id=14913 Chess Assistant 3.0*] by [[Albert Silver]], [[CCC]], February 06, 1998<br />
* [https://www.stmintz.com/ccc/index.php?id=19068 illegal copies of Chess Assistant program] by [[Sergey Abramov]], [[CCC]], May 22, 1998<br />
* [https://www.stmintz.com/ccc/index.php?id=83186 Chess Assistant 5.0 (Chess Tiger 12.0 is included) in January, 2000] by [[Victor Zakharov]], [[CCC]], December 18, 1999 » [[Chess Tiger]]<br />
==2000 ...==<br />
* [https://www.stmintz.com/ccc/index.php?id=161551 Chess Assistant 6 / Tiger14 / Tablebases FAQ] by [[Victor Zakharov]], [[CCC]], April 03, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=181150 Chessbase 8 or Chess Assistant 6?] by John Dahlem, [[CCC]], July 25, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=184666 Chess Assistant 6.1 is available] by [[Victor Zakharov]], [[CCC]], August 21, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=192532 *FREE* Chess Assistant Light is out!] by [[Albert Silver]], [[CCC]], October 09, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=203012 Shredder 6 works perfectly in Chess Assistant 6] by [[Albert Silver]], [[CCC]], December 21, 2001<br />
* [https://www.stmintz.com/ccc/index.php?id=264805 Chess Assistant 7 is awesome] by George Sobala, [[CCC]], November 13, 2002<br />
* [http://www.talkchess.com/forum/viewtopic.php?t=28453 ChessAssistant under Linux with wine] by [[Kurt Utzinger]], [[CCC]], June 16, 2009 <br />
* [https://talkchess.com/forum3/viewtopic.php?f=2&t=83004 ChessAssistant is also available on runway 24] by [[Frank Quisinsky]], [[CCC]], December 12, 2023<br />
<br />
=External Links=<br />
* [https://shop.chessok.com/index.php?main_page=product_info&cPath=7_54&products_id=979 Chess Assistant 24 - ChessOK.com]<br />
* [http://chessok.com/?page_id=19894 Chess Assistant - ChessOK.com]<br />
* [http://chessok.com/rolik/ca/content.html Chess Assistant - Video Tutorials]<br />
* [https://en.wikipedia.org/wiki/Chess_Assistant Chess Assistant from Wikipedia]<br />
<br />
=References= <br />
<references /><br />
'''[[GUI|Up one Level]]'''<br />
[[Category:Commercial]]<br />
[[Category:Database]]<br />
[[Category:ChessOK]]</div>Smatovic