Difference between revisions of "Parallel Search"

From Chessprogramming wiki
Jump to: navigation, search
(16 intermediate revisions by the same user not shown)
Line 14: Line 14:
 
''see Main page: [[Shared Hash Table]]''
 
''see Main page: [[Shared Hash Table]]''
  
This technique is a very simple approach to [[SMP]]. The implementation requires little more than starting additional processors. Processors are simply fed the root position at the beginning of the search, and each searches the same tree with the only communication being the [[transposition table]]. The gains come from the effect of nondeterminism. Each processor will finish the various subtrees in varying amounts of time, and as the search continues, these effects grow making the search trees diverge. The speedup is then based on how many nodes the main processor is able to skip from transposition table entries. Many programs used this if a "quick and dirty" approach to SMP is needed. It had the reputation of little speedup on a mere 2 processors, and to scale quite badly after this.
+
This technique is a very simple approach to [[SMP]]. The implementation requires little more than starting additional processors. Processors are simply fed the root position at the beginning of the search, and each searches the same tree with the only communication being the [[Transposition Table|transposition table]]. The gains come from the effect of nondeterminism. Each processor will finish the various subtrees in varying amounts of time, and as the search continues, these effects grow making the search trees diverge. The speedup is then based on how many nodes the main processor is able to skip from transposition table entries. Many programs used this if a "quick and dirty" approach to SMP is needed. It had the reputation of little speedup on a mere 2 processors, and to scale quite badly after this.
 
<span id="Lazy"></span>
 
<span id="Lazy"></span>
 
==Lazy SMP==  
 
==Lazy SMP==  
Line 266: Line 266:
 
'''1983'''
 
'''1983'''
 
* [[Raphael Finkel]], [[John Philip Fishburn]] ('''1983'''). ''Improved Speedup Bounds for Parallel Alpha-Beta Search''. [[IEEE#TPAMI|IEEE Transactions on Pattern Analysis and Machine Intelligence]], Vol. 5, No. 1, pp. 89 - 92
 
* [[Raphael Finkel]], [[John Philip Fishburn]] ('''1983'''). ''Improved Speedup Bounds for Parallel Alpha-Beta Search''. [[IEEE#TPAMI|IEEE Transactions on Pattern Analysis and Machine Intelligence]], Vol. 5, No. 1, pp. 89 - 92
* [[Clyde Kruskal]] ('''1983'''). ''[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1676138 Searching, Merging, and Sorting in Parallel Computation]''. [[IEEE#TOC|IEEE Transactions on Computers]]
+
* [[Clyde Kruskal]] ('''1983'''). ''[https://ieeexplore.ieee.org/document/1676138?arnumber=1676138 Searching, Merging, and Sorting in Parallel Computation]''. [[IEEE#TOC|IEEE Transactions on Computers]], Vol. C-32, No 10
 
* [[Gary Lindstrom]] ('''1983'''). ''The Key Node Method: A Highly-Parallel Alpha-Beta Algorithm''. Technical Report UUCCS 83-101, [https://en.wikipedia.org/wiki/University_of_Utah University of Utah], [http://content.lib.utah.edu/utils/getfile/collection/uspace/id/3706/filename/image pdf]
 
* [[Gary Lindstrom]] ('''1983'''). ''The Key Node Method: A Highly-Parallel Alpha-Beta Algorithm''. Technical Report UUCCS 83-101, [https://en.wikipedia.org/wiki/University_of_Utah University of Utah], [http://content.lib.utah.edu/utils/getfile/collection/uspace/id/3706/filename/image pdf]
 
'''1984'''
 
'''1984'''
* [[Lionel Moser]] ('''1984'''). ''An Experiment in Distributed Game Tree Searching'', M.Sc. thesis, [[University of Waterloo]] <ref>[[Jonathan Schaeffer]] ('''1985'''). ''[[Lionel Moser]]: An Experiment in Distributed Game Tree Searching.'' [[ICGA Journal#8_2|ICCA Journal, Vol. 8, No. 2]] (Review)</ref> <ref>[http://www.cs.uwaterloo.ca/~alopez-o/divulge/chimp.html An Introduction to Computer Chess] by [http://www.cs.uwaterloo.ca/~alopez-o/ Alejandro López-Ortiz], 1993</ref>
+
* [[Lionel Moser]] ('''1984'''). ''An Experiment in Distributed Game Tree Searching'', M.Sc. thesis, [[University of Waterloo]] <ref>[[Jonathan Schaeffer]] ('''1985'''). ''[[Lionel Moser]]: An Experiment in Distributed Game Tree Searching.'' [[ICGA Journal#8_2|ICCA Journal, Vol. 8, No. 2]] (Review)</ref> <ref>[https://cs.uwaterloo.ca/~alopez-o/divulge/chimp.html An Introduction to Computer Chess] by [[Alejandro López-Ortiz]], 1993</ref>
 
==1985 ...==  
 
==1985 ...==  
* [[Tony Marsland]], [[Fred Popowich]] ('''1985'''). ''Parallel Game-Tree Search.'' [[IEEE#TPAMI|IEEE Transactions on Pattern Analysis and Machine Intelligence]], Vol. 7, No. 4, pp. 442-452. [http://webdocs.cs.ualberta.ca/~tony/OldPapers/parallel.pdf 1984 pdf] (Draft)
+
* [[Tony Marsland]], [[Fred Popowich]] ('''1985'''). ''Parallel Game-Tree Search.'' [[IEEE#TPAMI|IEEE Transactions on Pattern Analysis and Machine Intelligence]], Vol. 7, No. 4,[http://webdocs.cs.ualberta.ca/~tony/OldPapers/parallel.pdf 1984 pdf] (Draft)
 
* [[Baruch Awerbuch]] ('''1985'''). ''[http://www.sciencedirect.com/science/article/pii/0020019085900833 A New Distributed Depth-First Search Algorithm]''. [https://en.wikipedia.org/wiki/Information_Processing_Letters Information Processing Letters], Vol. 20, No. 3
 
* [[Baruch Awerbuch]] ('''1985'''). ''[http://www.sciencedirect.com/science/article/pii/0020019085900833 A New Distributed Depth-First Search Algorithm]''. [https://en.wikipedia.org/wiki/Information_Processing_Letters Information Processing Letters], Vol. 20, No. 3
 
* [[Monroe Newborn]] ('''1985'''). ''A Parallel Search Chess Program''. Proceedings of the ACM Annual Conference, pp. 272-277. Denver, Co.
 
* [[Monroe Newborn]] ('''1985'''). ''A Parallel Search Chess Program''. Proceedings of the ACM Annual Conference, pp. 272-277. Denver, Co.
* [[Clyde Kruskal]], [https://en.wikipedia.org/wiki/Alan_Weiss_%28mathematician%29 Alan Weiss] ('''1985'''). ''Allocating Independent Subtasks on Parallel Processors''. [https://en.wikipedia.org/wiki/IEEE_Transactions_on_Software_Engineering IEEE Transactions on Software Engineering], Vol. 11
+
* [[Clyde Kruskal]], [[Mathematician#AWeiss|Alan Weiss]] ('''1985'''). ''Allocating Independent Subtasks on Parallel Processors''. [[IEEE#SE|IEEE Transactions on Software Engineering]], Vol. 11, No. 10
 
* [http://www.informatik.uni-trier.de/~ley/pers/hd/l/Leifker:Daniel_B=.html Daniel B. Leifker], [[Laveen Kanal|Laveen N. Kanal]] ('''1985'''). ''[http://dl.acm.org/citation.cfm?id=1623687 A Hybrid SSS*/Alpha-Beta Algorithm for Parallel Search of Game Trees]''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai85.html#LeifkerK85 IJCAI'85] » [[SSS* and Dual*]]
 
* [http://www.informatik.uni-trier.de/~ley/pers/hd/l/Leifker:Daniel_B=.html Daniel B. Leifker], [[Laveen Kanal|Laveen N. Kanal]] ('''1985'''). ''[http://dl.acm.org/citation.cfm?id=1623687 A Hybrid SSS*/Alpha-Beta Algorithm for Parallel Search of Game Trees]''. [http://www.informatik.uni-trier.de/~ley/db/conf/ijcai/ijcai85.html#LeifkerK85 IJCAI'85] » [[SSS* and Dual*]]
 
'''1986'''
 
'''1986'''
Line 303: Line 303:
 
==1990 ...==  
 
==1990 ...==  
 
* [[Rainer Feldmann]], [[Peter Mysliwietz]], [[Burkhard Monien]] ('''1991''') ''A Fully Distributed Chess Program''. [[Advances in Computer Chess 6]], [http://www.top-5000.nl/ps/A%20fully%20distribuited%20chess%20program.pdf pdf]
 
* [[Rainer Feldmann]], [[Peter Mysliwietz]], [[Burkhard Monien]] ('''1991''') ''A Fully Distributed Chess Program''. [[Advances in Computer Chess 6]], [http://www.top-5000.nl/ps/A%20fully%20distribuited%20chess%20program.pdf pdf]
 +
* [[Clyde Kruskal]], [[Mathematician#LRudolph|Larry Rudolph]], [[Mathematician#MSnir|Marc Snir]] ('''1990'''). ''[https://www.sciencedirect.com/science/article/pii/030439759090192K A Complexity Theory of Efficient Parallel Algorithms]''. [https://en.wikipedia.org/wiki/Theoretical_Computer_Science_%28journal%29 Theoretical Computer Science], Vol. 71
 +
* [[Clyde Kruskal]], [[Mathematician#LRudolph|Larry Rudolph]], [[Mathematician#MSnir|Marc Snir]] ('''1990'''). ''[https://link.springer.com/article/10.1007/BF01840376 Efficient Parallel Algorithms for Graph Problems]''. [https://en.wikipedia.org/wiki/Algorithmica Algorithmica], Vol. 5, No. 1
 
'''1992'''
 
'''1992'''
 
* [[Jaleh Rezaie]], [[Raphael Finkel]] ('''1992'''). ''[https://www.researchgate.net/publication/2813087_A_comparison_of_some_parallel_game-tree_search_algorithms_Revised_version A comparison of some parallel game-tree search algorithms]''. [https://en.wikipedia.org/wiki/University_of_Kentucky University of Kentucky]
 
* [[Jaleh Rezaie]], [[Raphael Finkel]] ('''1992'''). ''[https://www.researchgate.net/publication/2813087_A_comparison_of_some_parallel_game-tree_search_algorithms_Revised_version A comparison of some parallel game-tree search algorithms]''. [https://en.wikipedia.org/wiki/University_of_Kentucky University of Kentucky]
Line 322: Line 324:
 
==1995 ...==  
 
==1995 ...==  
 
* [[Bradley Kuszmaul|Bradley C. Kuszmaul]] ('''1995'''). ''The StarTech Massively Parallel Chess Program''. [http://supertech.csail.mit.edu/papers/startech.pdf pdf]
 
* [[Bradley Kuszmaul|Bradley C. Kuszmaul]] ('''1995'''). ''The StarTech Massively Parallel Chess Program''. [http://supertech.csail.mit.edu/papers/startech.pdf pdf]
* [[Henri Bal]] and [[Victor Allis]] ('''1995'''). ''Parallel Retrograde Analysis on a Distributed System''. Supercomputing ’95, San Diego, CA.
+
* [[Henri Bal]], [[Victor Allis]] ('''1995'''). ''Parallel Retrograde Analysis on a Distributed System''. Supercomputing ’95, San Diego, CA.
 
* [[Jean-Christophe Weill]] ('''1995'''). ''Programmes d'Échecs de Championnat: Architecture Logicielle Synthèse de Fonctions d'Évaluations, Parallélisme de Recherche''. Ph.D. Thesis. Université Paris 8, Saint-Denis, [http://www.recherche.enac.fr/%7Eweill/publications/phdJCW.ps.gz zipped ps] (French)
 
* [[Jean-Christophe Weill]] ('''1995'''). ''Programmes d'Échecs de Championnat: Architecture Logicielle Synthèse de Fonctions d'Évaluations, Parallélisme de Recherche''. Ph.D. Thesis. Université Paris 8, Saint-Denis, [http://www.recherche.enac.fr/%7Eweill/publications/phdJCW.ps.gz zipped ps] (French)
 
* [[Holger Hopp]], [[Peter Sanders]] ('''1995'''). ''Parallel Game Tree Search on SIMD Machines''. [http://www.informatik.uni-trier.de/~ley/db/conf/irregular/irregular95.html#HoppS95 IRREGULAR 1995], from [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.16.6343 CiteSeerX]
 
* [[Holger Hopp]], [[Peter Sanders]] ('''1995'''). ''Parallel Game Tree Search on SIMD Machines''. [http://www.informatik.uni-trier.de/~ley/db/conf/irregular/irregular95.html#HoppS95 IRREGULAR 1995], from [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.16.6343 CiteSeerX]
* [[Tony Marsland]] and [[Yaoqing Gao]] ('''1995'''). ''Speculative Parallelism Improves Search?'' Technical Report 95-05, Department of Computing Science, [[University of Alberta]], [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.54.2187 CiteSeerX]
+
* [[Tony Marsland]], [[Yaoqing Gao]] ('''1995'''). ''Speculative Parallelism Improves Search?'' Technical Report 95-05, Department of Computing Science, [[University of Alberta]], [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.54.2187 CiteSeerX]
 
'''1996'''
 
'''1996'''
 
* [[Jean-Christophe Weill]] ('''1996'''). ''The ABDADA Distributed Minimax Search Agorithm''. Proceedings of the 1996 ACM Computer Science Conference, pp. 131-138. ACM, New York, N.Y, reprinted [[ICGA Journal#19_1|ICCA Journal, Vol. 19, No. 1]], [http://www.recherche.enac.fr/%7Eweill/publications/acm.ps.gz zipped postscript]
 
* [[Jean-Christophe Weill]] ('''1996'''). ''The ABDADA Distributed Minimax Search Agorithm''. Proceedings of the 1996 ACM Computer Science Conference, pp. 131-138. ACM, New York, N.Y, reprinted [[ICGA Journal#19_1|ICCA Journal, Vol. 19, No. 1]], [http://www.recherche.enac.fr/%7Eweill/publications/acm.ps.gz zipped postscript]
Line 338: Line 340:
 
* [[Andrew Tridgell]] ('''1997'''). ''KnightCap — a parallel chess program on the AP1000+''. [ftp://us6.samba.org/pub/tridge/knightcap_pcw97.ps.gz zipped ps] » [[KnightCap]]
 
* [[Andrew Tridgell]] ('''1997'''). ''KnightCap — a parallel chess program on the AP1000+''. [ftp://us6.samba.org/pub/tridge/knightcap_pcw97.ps.gz zipped ps] » [[KnightCap]]
 
* [[Mark Brockington]], [[Jonathan Schaeffer]] ('''1997'''). ''APHID Game-Tree Search''. [[Advances in Computer Chess 8]]
 
* [[Mark Brockington]], [[Jonathan Schaeffer]] ('''1997'''). ''APHID Game-Tree Search''. [[Advances in Computer Chess 8]]
* [[David Sturgill]] and [[Alberto Maria Segre]] ('''1997'''). ''[http://www.springerlink.com/content/j11186905500t384/ Nagging: A Distributed, Adversarial Search-Pruning Technique Applied to First-Order Inference]''. [https://en.wikipedia.org/wiki/Journal_of_Automated_Reasoning Journal of Automated Reasoning], Vol. 19, No. 3 <ref>[https://en.wikipedia.org/wiki/Nagging Nagging from Wikipedia]</ref>
+
* [[David Sturgill]], [[Alberto Maria Segre]] ('''1997'''). ''[http://www.springerlink.com/content/j11186905500t384/ Nagging: A Distributed, Adversarial Search-Pruning Technique Applied to First-Order Inference]''. [https://en.wikipedia.org/wiki/Journal_of_Automated_Reasoning Journal of Automated Reasoning], Vol. 19, No. 3 <ref>[https://en.wikipedia.org/wiki/Nagging Nagging from Wikipedia]</ref>
 
'''1998'''
 
'''1998'''
 
* [[Mark Brockington]] ('''1998'''). ''Asynchronous Parallel Game-Tree Search''. Ph.D. Thesis, [[University of Alberta]], [http://games.cs.ualberta.ca/articles/mgb_thesis.ps.gz zipped postscript]
 
* [[Mark Brockington]] ('''1998'''). ''Asynchronous Parallel Game-Tree Search''. Ph.D. Thesis, [[University of Alberta]], [http://games.cs.ualberta.ca/articles/mgb_thesis.ps.gz zipped postscript]
Line 352: Line 354:
 
'''2001'''
 
'''2001'''
 
* [[John Romein]] ('''2001'''). ''Multigame - An Environment for Distributed Game-Tree Search''. Ph.D. thesis, [https://en.wikipedia.org/wiki/Vrije_Universiteit Vrije Universiteit], supervisor [[Henri Bal]], [http://dare.ubvu.vu.nl/bitstream/1871/11305/1/5429.pdf pdf]
 
* [[John Romein]] ('''2001'''). ''Multigame - An Environment for Distributed Game-Tree Search''. Ph.D. thesis, [https://en.wikipedia.org/wiki/Vrije_Universiteit Vrije Universiteit], supervisor [[Henri Bal]], [http://dare.ubvu.vu.nl/bitstream/1871/11305/1/5429.pdf pdf]
* [[Yaron Shoham]], [[Sivan Toledo]] ('''2001'''). ''Parallel randomized best-first minimax search''. School of Computer Science, [https://en.wikipedia.org/wiki/Tel_Aviv_University Tel-Aviv University], [http://www.tau.ac.il/%7Estoledo/Pubs/rbf-ai-revised.pdf pdf]
 
 
* [[Valavan Manohararajah]] ('''2001'''). ''Parallel Alpha-Beta Search on Shared Memory Multiprocessors''. Masters Thesis, [http://www.top-5000.nl/ps/Parallel%20Alpha-Beta%20Search%20on%20Shared%20Memory%20Multiprocessors.pdf pdf]
 
* [[Valavan Manohararajah]] ('''2001'''). ''Parallel Alpha-Beta Search on Shared Memory Multiprocessors''. Masters Thesis, [http://www.top-5000.nl/ps/Parallel%20Alpha-Beta%20Search%20on%20Shared%20Memory%20Multiprocessors.pdf pdf]
 
* [http://www.zib.de/schintke/ Florian Schintke], [http://pc2.uni-paderborn.de/people/jens-simon/ Jens Simon], [[Alexander Reinefeld]] ('''2001'''). ''A Cache Simulator for Shared Memory Systems''. International Conference on Computational Science ICCS 2001, San Francisco, CA, Springer LNCS 2074, vol. 2, pp. 569-578. [http://www.zib.de/reinefeld/Publications/ldasim-lncs.ps.gz zipped ps]
 
* [http://www.zib.de/schintke/ Florian Schintke], [http://pc2.uni-paderborn.de/people/jens-simon/ Jens Simon], [[Alexander Reinefeld]] ('''2001'''). ''A Cache Simulator for Shared Memory Systems''. International Conference on Computational Science ICCS 2001, San Francisco, CA, Springer LNCS 2074, vol. 2, pp. 569-578. [http://www.zib.de/reinefeld/Publications/ldasim-lncs.ps.gz zipped ps]
 
'''2002'''
 
'''2002'''
 +
* [[Yaron Shoham]], [[Sivan Toledo]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S0004370202001959 Parallel Randomized Best-First Minimax Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_(journal) Artificial Intelligence], Vol. 137, Nos. 1-2
 
* [[Akihiro Kishimoto]], [[Jonathan Schaeffer]]. ('''2002'''). ''Distributed Game-Tree Search Using Transposition Table Driven Work Scheduling'', In Proc. of 31st International Conference on Parallel Processing (ICPP'02), pages 323-330, IEEE Computer Society Press. [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.132.8604&rep=rep1&type=pdf pdf] via [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.132.8604 CiteSeerX]
 
* [[Akihiro Kishimoto]], [[Jonathan Schaeffer]]. ('''2002'''). ''Distributed Game-Tree Search Using Transposition Table Driven Work Scheduling'', In Proc. of 31st International Conference on Parallel Processing (ICPP'02), pages 323-330, IEEE Computer Society Press. [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.132.8604&rep=rep1&type=pdf pdf] via [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.132.8604 CiteSeerX]
 
* [[Akihiro Kishimoto]], [[Jonathan Schaeffer]]. ('''2002'''). ''Transposition Table Driven Work Scheduling in Distributed Game-Tree Search'' (Best Paper Prize), In Proc. of Fifteenth Canadian Conference on Artificial Intelligence (AI'2002), volume 2338 of Lecture Notes in Artificial Intelligence (LNAI), pages 56-68, [http://www.springerlink.com/content/47b3crn04egmmx8l/ Springer]
 
* [[Akihiro Kishimoto]], [[Jonathan Schaeffer]]. ('''2002'''). ''Transposition Table Driven Work Scheduling in Distributed Game-Tree Search'' (Best Paper Prize), In Proc. of Fifteenth Canadian Conference on Artificial Intelligence (AI'2002), volume 2338 of Lecture Notes in Artificial Intelligence (LNAI), pages 56-68, [http://www.springerlink.com/content/47b3crn04egmmx8l/ Springer]
 
* [[John Romein]], [[Henri Bal]], [[Jonathan Schaeffer]], [[Aske Plaat]] ('''2002'''). ''A Performance Analysis of Transposition-Table-Driven Scheduling in Distributed Search''. IEEE Transactions on Parallel and Distributed Systems, Vol. 13, No. 5, pp. 447–459. [http://www.cs.vu.nl/~bal/Papers/tds.pdf pdf] » [[Transposition Table]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47700 Transposition driven scheduling] by [[Daniel Shawul]], [[CCC]], April 04, 2013</ref>
 
* [[John Romein]], [[Henri Bal]], [[Jonathan Schaeffer]], [[Aske Plaat]] ('''2002'''). ''A Performance Analysis of Transposition-Table-Driven Scheduling in Distributed Search''. IEEE Transactions on Parallel and Distributed Systems, Vol. 13, No. 5, pp. 447–459. [http://www.cs.vu.nl/~bal/Papers/tds.pdf pdf] » [[Transposition Table]] <ref>[http://www.talkchess.com/forum/viewtopic.php?t=47700 Transposition driven scheduling] by [[Daniel Shawul]], [[CCC]], April 04, 2013</ref>
* [[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence] 140, [http://jmvidal.cse.sc.edu/library/segre02a.pdf pdf], [http://compepi.cs.uiowa.edu/uploads/Profiles/Segre/nag.pdf pdf]
+
* [[Alberto Maria Segre]], [[Sean Forman]], [[Giovanni Resta]], [[Andrew Wildenberg]] ('''2002'''). ''[https://www.sciencedirect.com/science/article/pii/S000437020200228X Nagging: A Scalable Fault-Tolerant Paradigm for Distributed Search]''. [https://en.wikipedia.org/wiki/Artificial_Intelligence_%28journal%29 Artificial Intelligence], Vol. 140, Nos. 1-2
 
'''2003'''
 
'''2003'''
 
* [[Brian Greskamp]] ('''2003'''). ''Parallelizing a Simple Chess Program''. [http://iacoma.cs.uiuc.edu/~greskamp/pdfs/412.pdf pdf]
 
* [[Brian Greskamp]] ('''2003'''). ''Parallelizing a Simple Chess Program''. [http://iacoma.cs.uiuc.edu/~greskamp/pdfs/412.pdf pdf]
Line 378: Line 380:
 
* [[Kai Himstedt]], [[Ulf Lorenz]], [http://www.informatik.uni-hamburg.de/TIS/index.php Dietmar P. F. Möller] ('''2008'''). ''A Twofold Distributed Game-Tree Search Approach Using Interconnected Clusters''. Euro-Par 2008: 587-598, [http://www.springerlink.com/content/2471845u5w6j1211/ abstract] from [http://www.springerlink.com/home/main.mpx springerlink]
 
* [[Kai Himstedt]], [[Ulf Lorenz]], [http://www.informatik.uni-hamburg.de/TIS/index.php Dietmar P. F. Möller] ('''2008'''). ''A Twofold Distributed Game-Tree Search Approach Using Interconnected Clusters''. Euro-Par 2008: 587-598, [http://www.springerlink.com/content/2471845u5w6j1211/ abstract] from [http://www.springerlink.com/home/main.mpx springerlink]
 
* [[Tristan Cazenave]], [[Nicolas Jouandeau]] ('''2008'''). ''A Parallel Monte-Carlo Tree Search Algorithm''. [http://www.lamsade.dauphine.fr/~cazenave/papers/parallelMCTS.pdf pdf]
 
* [[Tristan Cazenave]], [[Nicolas Jouandeau]] ('''2008'''). ''A Parallel Monte-Carlo Tree Search Algorithm''. [http://www.lamsade.dauphine.fr/~cazenave/papers/parallelMCTS.pdf pdf]
* [[James Swafford]] ('''2008'''). ''A Survey of Parallel Search Algorithms over Alpha-Beta Search Trees using Symmetric Multiprocessor Machines''. Masters Project, [https://en.wikipedia.org/wiki/East_Carolina_University East Carolina University], advisor [http://www.cs.ecu.edu/rws/ Ronnie Smith]
+
* [[James Swafford]] ('''2008'''). ''[https://fr.slideshare.net/JamesSwafford2/jfsmasters1 A Survey of Parallel Search Algorithms over Alpha-Beta Search Trees using Symmetric Multiprocessor Machines]''. Masters Project, [https://en.wikipedia.org/wiki/East_Carolina_University East Carolina University], advisor [http://www.cs.ecu.edu/rws/ Ronnie Smith]
 
'''2009'''
 
'''2009'''
 
* [[Markus Enzenberger]], [[Martin Müller]] ('''2009'''). ''A lock-free multithreaded Monte-Carlo tree search algorithm'', [[Advances in Computer Games 12]], [http://webdocs.cs.ualberta.ca/~mmueller/ps/enzenberger-mueller-acg12.pdf pdf]
 
* [[Markus Enzenberger]], [[Martin Müller]] ('''2009'''). ''A lock-free multithreaded Monte-Carlo tree search algorithm'', [[Advances in Computer Games 12]], [http://webdocs.cs.ualberta.ca/~mmueller/ps/enzenberger-mueller-acg12.pdf pdf]
Line 414: Line 416:
 
* [[Shu Yokoyama]], [[Tomoyuki Kaneko]], [[Tetsuro Tanaka]] ('''2015'''). ''Parameter-Free Tree Style Pipeline in Asynchronous Parallel Game-Tree Search''. [[Advances in Computer Games 14]]
 
* [[Shu Yokoyama]], [[Tomoyuki Kaneko]], [[Tetsuro Tanaka]] ('''2015'''). ''Parameter-Free Tree Style Pipeline in Asynchronous Parallel Game-Tree Search''. [[Advances in Computer Games 14]]
 
* [[S. Ali Mirsoleimani]], [[Aske Plaat]], [[Jaap van den Herik]], [[Jos Vermaseren]] ('''2015'''). ''Scaling Monte Carlo Tree Search on Intel Xeon Phi''. [http://arxiv.org/abs/1507.04383 CoRR abs/1507.04383] » [[Hex]], [[Monte-Carlo Tree Search|MCTS]], [[x86-64]]
 
* [[S. Ali Mirsoleimani]], [[Aske Plaat]], [[Jaap van den Herik]], [[Jos Vermaseren]] ('''2015'''). ''Scaling Monte Carlo Tree Search on Intel Xeon Phi''. [http://arxiv.org/abs/1507.04383 CoRR abs/1507.04383] » [[Hex]], [[Monte-Carlo Tree Search|MCTS]], [[x86-64]]
* [[S. Ali Mirsoleimani]], [[Aske Plaat]], [[Jaap van den Herik]], [[Jos Vermaseren]] ('''2015'''). ''Parallel Monte Carlo Tree Search from Multi-core to Many-core Processors''. [https://whova.com/portal/ieeet_201508/ TrustCom/BigDataSE/|ISPA 2015], [https://askeplaat.files.wordpress.com/2013/01/ispa2015.pdf pdf]  
+
* [[S. Ali Mirsoleimani]], [[Aske Plaat]], [[Jaap van den Herik]], [[Jos Vermaseren]] ('''2015'''). ''Parallel Monte Carlo Tree Search from Multi-core to Many-core Processors''. [https://whova.com/portal/ieeet_201508/ TrustCom/BigDataSE/ISPA 2015], [https://askeplaat.files.wordpress.com/2013/01/ispa2015.pdf pdf]  
 
* [[Ting-Han Wei]], [[Chao-Chin Liang]], [[I-Chen Wu]], [[Lung-Pin Chen]] ('''2015'''). ''Software Development Framework for Job-Level Algorithms''. [[ICGA Journal#38_3|ICGA Journal, Vol. 38, No. 3]]
 
* [[Ting-Han Wei]], [[Chao-Chin Liang]], [[I-Chen Wu]], [[Lung-Pin Chen]] ('''2015'''). ''Software Development Framework for Job-Level Algorithms''. [[ICGA Journal#38_3|ICGA Journal, Vol. 38, No. 3]]
 
* [[Akira Ura]], [[Yoshimasa Tsuruoka]], [[Takashi Chikayama]] ('''2015'''). ''[https://www.jstage.jst.go.jp/article/ipsjjip/23/1/23_9/_article Dynamic Prediction of Minimal Trees in Large-Scale Parallel Game Tree Search]''. [https://www.jstage.jst.go.jp/browse/ipsjjip/ Journal of Information Processing], Vol. 23, No. 1
 
* [[Akira Ura]], [[Yoshimasa Tsuruoka]], [[Takashi Chikayama]] ('''2015'''). ''[https://www.jstage.jst.go.jp/article/ipsjjip/23/1/23_9/_article Dynamic Prediction of Minimal Trees in Large-Scale Parallel Game Tree Search]''. [https://www.jstage.jst.go.jp/browse/ipsjjip/ Journal of Information Processing], Vol. 23, No. 1
Line 420: Line 422:
 
* [[Jr-Chang Chen]], [[I-Chen Wu]], [[Wen-Jie Tseng]], [[Bo-Han Lin]], [[Chia-Hui Chang]] ('''2015'''). ''[https://ir.nctu.edu.tw/handle/11536/124541 Job-Level Alpha-Beta Search]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 7, No. 1
 
* [[Jr-Chang Chen]], [[I-Chen Wu]], [[Wen-Jie Tseng]], [[Bo-Han Lin]], [[Chia-Hui Chang]] ('''2015'''). ''[https://ir.nctu.edu.tw/handle/11536/124541 Job-Level Alpha-Beta Search]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 7, No. 1
 
* [[Lars Schaefers]], [[Marco Platzner]] ('''2015'''). ''[http://ieeexplore.ieee.org/document/6876158/ Distributed Monte Carlo Tree Search: A Novel Technique and its Application to Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 7, No. 4 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=66125&start=18 Re: Minmax backup operator for MCTS] by [[Brahim Hamadicharef]], [[CCC]], December 30, 2017</ref>
 
* [[Lars Schaefers]], [[Marco Platzner]] ('''2015'''). ''[http://ieeexplore.ieee.org/document/6876158/ Distributed Monte Carlo Tree Search: A Novel Technique and its Application to Computer Go]''. [[IEEE#TOCIAIGAMES|IEEE Transactions on Computational Intelligence and AI in Games]], Vol. 7, No. 4 <ref>[http://www.talkchess.com/forum/viewtopic.php?t=66125&start=18 Re: Minmax backup operator for MCTS] by [[Brahim Hamadicharef]], [[CCC]], December 30, 2017</ref>
 
+
'''2016'''
 +
* [[S. Ali Mirsoleimani]], [[Aske Plaat]], [[Jaap van den Herik]], [[Jos Vermaseren]] ('''2016'''). ''A New Method for Parallel Monte Carlo Tree Search''. [https://arxiv.org/abs/1605.04447 arXiv:1605.04447] » [[Monte-Carlo Tree Search|MCTS]]
 +
* [[S. Ali Mirsoleimani]], [[Aske Plaat]], [[Jaap van den Herik]], [[Jos Vermaseren]] ('''2016'''). ''An Efficient Computation Pattern for Parallel MCTS''. [http://www.ictopen.nl/content/Previous+editions ICT.OPEN 2016], [http://liacs.leidenuniv.nl/~plaata1/papers/ictopen2016.pdf pdf]
 +
'''2017'''
 +
* [[S. Ali Mirsoleimani]], [[Aske Plaat]], [[Jaap van den Herik]], [[Jos Vermaseren]] ('''2017'''). ''Structured Parallel Programming for Monte Carlo Tree Search''. [https://arxiv.org/abs/1704.00325 arXiv:1704.00325]
 +
* [[S. Ali Mirsoleimani]], [[Aske Plaat]], [[Jaap van den Herik]], [[Jos Vermaseren]] ('''2017'''). ''An Analysis of Virtual Loss in Parallel MCTS''. [https://dblp.uni-trier.de/db/conf/icaart/icaart2017-2.html ICAART 2017], Vol. 2, [http://liacs.leidenuniv.nl/~plaata1/papers/paper_ICAART17.pdf pdf]
 +
'''2018'''
 +
* [[S. Ali Mirsoleimani]], [[Jaap van den Herik]], [[Aske Plaat]], [[Jos Vermaseren]] ('''2018'''). ''Pipeline Pattern for Parallel MCTS''. [https://dblp.uni-trier.de/db/conf/icaart/icaart2018-2.html ICAART 2018], [http://liacs.leidenuniv.nl/~plaata1/papers/paper_ICAART18_pos.pdf pdf]
 +
* [[S. Ali Mirsoleimani]], [[Jaap van den Herik]], [[Aske Plaat]], [[Jos Vermaseren]] ('''2018'''). ''A Lock-free Algorithm for Parallel MCTS''. [https://dblp.uni-trier.de/db/conf/icaart/icaart2018-2.html ICAART 2018], [http://liacs.leidenuniv.nl/~plaata1/papers/paper_ICAART18.pdf pdf]
  
 
=Forum Posts=  
 
=Forum Posts=  
Line 504: Line 514:
 
* [http://www.talkchess.com/forum/viewtopic.php?t=48524 Measure of SMP scalability] by [[Edsel Apostol]], [[CCC]], July 03, 2013 » [[Hannibal]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=48524 Measure of SMP scalability] by [[Edsel Apostol]], [[CCC]], July 03, 2013 » [[Hannibal]]
 
: [http://www.talkchess.com/forum/viewtopic.php?t=48591 Measure of SMP scalability (sub-thread)] by [[Ernest Bonnem]], [[CCC]], July 08, 2013
 
: [http://www.talkchess.com/forum/viewtopic.php?t=48591 Measure of SMP scalability (sub-thread)] by [[Ernest Bonnem]], [[CCC]], July 08, 2013
* [http://www.talkchess.com/forum/viewtopic.php?t=48536 Lazy SMP and Work Sharing] by [[Daniel Homan]], [[CCC]], July 03, 2013 » [[EXchess#LazySMP|Lazy SMP]] in [[EXChess]]
+
* [http://www.talkchess.com/forum/viewtopic.php?t=48536 Lazy SMP and Work Sharing] by [[Daniel Homan]], [[CCC]], July 03, 2013 » [[EXchess#LazySMP|Lazy SMP]] in [[EXchess]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=48612 use sleeping threads] by [[Don Dailey]], [[CCC]], July 10, 2013 » [[Stockfish]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=48612 use sleeping threads] by [[Don Dailey]], [[CCC]], July 10, 2013 » [[Stockfish]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=48752 Recursive DTS-like search algorithm] by [[Peter Österlund]], [[CCC]], July 24, 2013 » [[Texel]], [[Recursion]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=48752 Recursive DTS-like search algorithm] by [[Peter Österlund]], [[CCC]], July 24, 2013 » [[Texel]], [[Recursion]]
Line 563: Line 573:
 
* [http://www.talkchess.com/forum/viewtopic.php?t=58946 Parallel Search to Fixed Depth] by [[Brian Richardson]], [[CCC]], January 17, 2016
 
* [http://www.talkchess.com/forum/viewtopic.php?t=58946 Parallel Search to Fixed Depth] by [[Brian Richardson]], [[CCC]], January 17, 2016
 
* [http://www.talkchess.com/forum/viewtopic.php?t=58950 Threads test incl. Komodo 9.3] by [[Andreas Strangmüller]], [[CCC]], January 17, 2016 » [[Komodo]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=58950 Threads test incl. Komodo 9.3] by [[Andreas Strangmüller]], [[CCC]], January 17, 2016 » [[Komodo]]
* [http://www.talkchess.com/forum/viewtopic.php?t=59389 Lazy SMP - how it works] by Kalyankumar Ramaseshan, [[CCC]], February 29, 2016 » [[Lazy SMP]]
+
* [http://www.talkchess.com/forum/viewtopic.php?t=59389 Lazy SMP - how it works] by [[Kalyankumar Ramaseshan]], [[CCC]], February 29, 2016 » [[Lazy SMP]]
 
* [http://talkchess.com/forum/viewtopic.php?t=59745 Crafty SMP measurement] by [[Robert Hyatt]], [[CCC]], April 04, 2016 » [[Crafty]]
 
* [http://talkchess.com/forum/viewtopic.php?t=59745 Crafty SMP measurement] by [[Robert Hyatt]], [[CCC]], April 04, 2016 » [[Crafty]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=60271 parallel search speed measurement] by [[Robert Hyatt]], [[CCC]],  May 24, 2016
 
* [http://www.talkchess.com/forum/viewtopic.php?t=60271 parallel search speed measurement] by [[Robert Hyatt]], [[CCC]],  May 24, 2016
Line 587: Line 597:
 
* [http://www.talkchess.com/forum/viewtopic.php?t=66044 Parallel search/LazySMP question] by [[Jon Dart]], [[CCC]], December 17, 2017 » [[Lazy SMP]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=66044 Parallel search/LazySMP question] by [[Jon Dart]], [[CCC]], December 17, 2017 » [[Lazy SMP]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=66099 Time Managment translating to SMP] by [[Andrew Grant]], [[CCC]], December 23, 2017  » [[Time Management]]
 
* [http://www.talkchess.com/forum/viewtopic.php?t=66099 Time Managment translating to SMP] by [[Andrew Grant]], [[CCC]], December 23, 2017  » [[Time Management]]
 +
'''2018'''
 +
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=68154 Lazy SMP and 44 cores] by [[Sander Maassen vd Brink]], [[CCC]], August 07, 2018 » [[Lazy SMP]]
 +
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=68278 Lazy SMP ideas]  by [[Edsel Apostol]], [[CCC]], August 22, 2018
 +
'''2019'''
 +
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=69481 Simplest way to implement quick and dirty lazy smp] by [[Tom King]], [[CCC]], January 04, 2019
 +
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71115 OpenMP for chess programming] by BeyondCritics, [[CCC]], June 27, 2019
 +
* [http://www.talkchess.com/forum3/viewtopic.php?f=7&t=71739 Multithreaded noob question] by [[Michael Sherwin]], [[CCC]], September 06, 2019
  
 
=External Links=  
 
=External Links=  

Revision as of 21:39, 28 December 2020

Home * Search * Parallel Search

Parallel scalability [1]

Parallel Search,
also known as Multithreaded Search or SMP Search, is a way to increase search speed by using additional processors. This topic that has been gaining popularity recently with multiprocessor computers becoming widely available. Utilizing these additional processors is an interesting domain of research, as traversing a search tree is inherently serial. Several approaches have been devised, with the most popular today being Young Brothers Wait Concept and Shared Hash Table aka Lazy SMP.

This page gives a brief summary of the different types. SMP algorithms are classified by their scalability (trend in search speed as the number of processors becomes large) and their speedup (change in time to complete a search). Typically, programmers use scaling to mean change in nodes per second (NPS) rates, and speedup to mean change in time to depth. Scaling and scalability are thus two different concepts.

Distributed Search

A subtype of parallel algorithms, distributed algorithms are algorithms designed to work in cluster computing and distributed computing environments, where additional concerns beyond the scope of "classical" parallel algorithms need to be addressed.

Shared Hash Table

see Main page: Shared Hash Table

This technique is a very simple approach to SMP. The implementation requires little more than starting additional processors. Processors are simply fed the root position at the beginning of the search, and each searches the same tree with the only communication being the transposition table. The gains come from the effect of nondeterminism. Each processor will finish the various subtrees in varying amounts of time, and as the search continues, these effects grow making the search trees diverge. The speedup is then based on how many nodes the main processor is able to skip from transposition table entries. Many programs used this if a "quick and dirty" approach to SMP is needed. It had the reputation of little speedup on a mere 2 processors, and to scale quite badly after this.

Lazy SMP

see Main page: Lazy SMP

Recent improvements by Daniel Homan [2] , Martin Sedlak [3] and others on Lazy SMP indicate that the algorithm scales quite well up to 8 cores and beyond [4] .

ABDADA

see Main page: ABDADA

ABDADA, Alpha-Bêta Distribué avec Droit d'Aînesse (Distributed Alpha-Beta Search with Eldest Son Right) is a loosely synchronized, distributed search algorithm by Jean-Christophe Weill [5] . It is based on the Shared Hash Table, and adds the number of processors searching this node inside the hash-table entry for better utilization - considering the Young Brothers Wait Concept.

Parallel Alpha-Beta

These algorithms divide the Alpha-Beta tree, giving different subtrees to different processors. Because alpha-beta is a serial algorithm, this approach is much more complex. However, these techniques also provide for the greatest gains from additional processors.

Principal Variation Splitting (PVS)

In Principal Variation Splitting (PVS), each node is expressed by a thread. A thread will spawn one child thread for each legal move. But data dependency specified by the algorithm exists among these threads: After getting a tighter bound from the thread corresponding to the PV node, the remaining threads are ready to run.

PVSplit.JPG

PV Splitting [6]

YBWC and Jamboree

The idea in Feldmann's Young Brothers Wait Concept (YBWC) [7] [8] as well in Kuszmaul's Jamboree Search [9] [10] [11] , is to search the first sibling node first before spawning the remaining siblings in parallel. This is based on the observations that the first move is either going to produce a cutoff (in which case processing sibling nodes is wasted effort) or return much better bounds. If the first move does not produce a cut-off, then the remaining moves are searched in parallel. This process is recursive.

Since the number of processors is not infinite the process of "spawning" work normally consists in putting it on some kind of "work to be done stack" where processors are free to grab work in FIFO fashion when there is no work to do. In YBW you would not "spawn" or place work on the stack until the first sibling is searched.

In their 1983 paper Improved Speedup Bounds for Parallel Alpha-Beta Search [12] , Raphael Finkel and John Philip Fishburn already gave the theoretical confirmation to the common sense wisdom that parallel resources should first be thrown into searching the first child. Assuming the tree is already in an approximation to best-first order, this establishes a good alpha value that can then be used to parallel search the later children. The algorithm in the 1982 Artificial Intelligence paper [13] , which Fishburn called the "dumb algorithm" in his 1981 thesis presentation [14] gives p^0.5 speedup with p processors, while the 1983 PAMI algorithm (the "smart algorithm") gives p^0.8 speedup for lookahead trees with the branching factor of chess.

Dynamic Tree Splitting (DTS)

Main page: Dynamic Tree Splitting

This algorithm, invented by the Cray Blitz team (including Robert Hyatt [15] ), is the most complex. Though this gives the best known scalability for any SMP algorithm, there are very few programs using it because of its difficulty of implementation.

Other Approaches

Many different approaches have been tried that do not directly split the search tree. These algorithms have not enjoyed popular success due to the fact that they are not scalable. Typical examples include one processor that evaluates positions fed to it by a searching processor, or a tactical search that confirms or refutes a positional search.

Taxonomy

Overview and taxonomy of parallel algorithms based on alpha-beta, given by Mark Brockington, ICCA Journal, Vol. 19: No. 3 in 1996 [16]

First
Described
Algorithm Author(s) Processor Hierarchy /
Control Distribution
Parallelism Possible
At These Nodes
Synchronisation Done
At These Nodes
1978 Parallel Aspiration Search Gérard M. Baudet Static/Centralized Root (αβ - Window) Root
1979 Mandatory Work First Selim Akl et al. Static/Centralized Type-1+3+

Leftmost child of 3

Bad Type-2
1980 Tree Splitting Raphael Finkel,
John Philip Fishburn
Static/Centralized Top k-ply Root
1981 PVS Tony Marsland,
Murray Campbell
Static/Centralized Type-1 Type-1
1983 Key Node Gary Lindstrom Static/Centralized Type-1+3+
Leftmost child of 3
Bad Type-2
1986 UIDPABS [17] Monroe Newborn Static/Centralized Root None
1987 DPVS Jonathan Schaeffer Dynamic/Centralized Type-1+3+2 Type 1+3+Bad 2
EPVS Robert Hyatt,
Bruce W. Suter,
Harry Nelson
Dynamic/Centralized Type-1+3 Type-1+3
Waycool Ed Felten,
Steve Otto
Dynamic/Distributed All, except Type-2
with no TT-Entry
Nodes with TT
& no cutoff
YBWC Rainer Feldmann Dynamic/Distributed Type-1+3+Bad 2 Type-1+Bad 2
1988 Dynamic Tree Splitting Robert Hyatt Dynamic/Distributed Type-1+3+Bad 2 Type-1+Bad 2
Bound-and-Branch Chris Ferguson,
Richard Korf
Dynamic/Distributed Type-1+3+Bad 2 Type-1+Bad 2
1990 Delayed Branch
Tree Expansion
Feng-hsiung Hsu Static/Centralized Type-1+3 Bad Type-2
1993 Frontier Splitting Paul Lu Dynamic/Distributed All Root
αβ* Vincent David Dynamic/Distributed Type-1+3 Type-1+3+Bad 2
1994 CABP [18] Van-Dat Cung Static/Centralized Type-1+3 Bad Type-2
Jamboree Bradley Kuszmaul Dynamic/Distributed Type-1+3+Bad 2 Type-1+Bad 2
1995 ABDADA Jean-Christophe Weill Dynamic/Distributed Type-1+3+Bad 2 Type-1+Bad 2
Dynamic Multiple PV-Split Tony Marsland,
Yaoqing Gao
Dynamic/Distributed Nodes within PV set Nodes within PV set
1996 APHID Mark Brockington,
Jonathan Schaeffer
Static/Centralized Top k-ply None

Other Considerations

Memory Design

Semaphores

During an parallel search, certain areas of memory must be protected to make sure processors do not write simultaneously and corrupt the data. Some type of semaphore system must be used. Semaphores access a piece of shared memory, typically an integer. When a processor wants to access protected memory, it reads the integer. If it is zero, meaning no other process is accessing the memory, then the processor attempts to make the integer nonzero. This whole process must be done atomically, meaning that the read, compare, and write are all done at the same time. Without this atomicity another processor could read the integer at the same time and both would see that they can freely access the memory.

In chess programs that use parallel alpha-beta, usually spinlocks are used. Because the semaphores are held for such short periods of time, processors want to waste as little time as possible after the semaphore is released before acquiring access. To do this, if the semaphore is held when a processor reaches it, the processor continuously reads the semaphore. This technique can waste a lot of processing time in applications with high contention, meaning that many processes try to access the same semaphore simultaneously. In chess programs, however, we want as much processing power as possible.

Spinlocks are sometimes implemented in assembly language because the operating system does not have an API for them.

Threads vs. Processes

There are two ways of utilizing the extra processing power of multiple CPUs, threads and processes. The difference between them is that threads share all memory in the program, but there are multiple threads of execution. In processes, all memory is local to each processor except memory that is explicitly shared. This means that in a threaded program, functions must pass around an extra argument specifying which thread is active, thus which board structure to use. When using processes, a single global board can be used that will be duplicated for each process.

Threads are more common, because they are easier to debug as well as implement, provided the program does not already have lots of global variables. Processes are favored by some because the need to explicitly share memory makes subtle bugs easier to avoid. Also, in processes, the extra argument to most functions is not needed.

Some programs that use threads:

Some programs that use processes:

Didactic Programs

See also

Publications

1950 ...

1970 ...

1980 ...

1981

  • John Philip Fishburn (1981). Analysis of Speedup in Distributed Algorithms. Ph.D. Thesis, pdf
  • Selim Akl, R.J. Doran (1981). A comparison of parallel implementations of the alpha-beta and Scout tree search algorithms using the game of checkers, Department of Computing and Information Science, Queen's University, Kingston, Ontario.
  • Tony Marsland, Murray Campbell (1981). Parallel Search of Strongly Ordered Game Trees. Technical Report TR 81-9, Department of Computing Science , University of Alberta, pdf

1982

1983

1984

1985 ...

1986

1987

1988

1989

1990 ...

1992

1993

1994

1995 ...

1996

1997

1998

1999

2000 ...

2001

2002

2003

2004

2005 ...

  • Jan Renze Steenhuisen (2005). Transposition-Driven Scheduling in Parallel Two-Player State-Space Search. Masters Thesis, pdf

2006

2007

2008

2009

2010 ...

2011

2012

2013

2014

2015 ...

2016

2017

2018

Forum Posts

1995 ...

2000 ...

2005 ...

2008

2009

Results from UCT parallelization by Gian-Carlo Pascutto, CCC, March 11, 2009

2010 ...

2011

2012

2013

Measure of SMP scalability (sub-thread) by Ernest Bonnem, CCC, July 08, 2013

2014

2015 ...

Explanation for non-expert? by Louis Zulli, CCC, February 16, 2015 » Stockfish
Best Stockfish NPS scaling yet by Louis Zulli, CCC, March 02, 2015

2016

Re: Baffling multithreading scaling behavior by Robert Hyatt, CCC, September 07, 2016

2017

Approximate ABDADA by Peter Österlund, CCC, August 23, 2017 » ABDADA

2018

2019

External Links

Parallel Search

Parallel Computing

Scalability

Shared Memory

Shared Memory:

False sharing from Wikipedia

Cache

Cache:

MSI protocol from Wikipedia
MESI protocol from Wikipedia
MOESI protocol from Wikipedia
assembly - The prefetch instruction - Stack Overflow
Data Prefetch Support - GNU Project - Free Software Foundation (FSF)
Software prefetching considered harmful by Linus Torvalds, LWN.net, May 19, 2011

Concurrency and Synchronization

Synchronization:

Cooperating sequential processes (EWD 123)
A challenge to memory designers? (EWD 497)

Misc

References

  1. Super Linearity and the Bigger Machine from Dr. Dobb's
  2. Lazy SMP, part 2 by Daniel Homan, CCC, January 12, 2013
  3. Lazy SMP in Cheng by Martin Sedlak, CCC, February 02, 2015
  4. Re: A new chess engine : m8 (comming not so soon) by Peter Österlund, CCC, February 01, 2015
  5. Jean-Christophe Weill (1996). The ABDADA Distributed Minimax Search Agorithm. Proceedings of the 1996 ACM Computer Science Conference, pp. 131-138. ACM, New York, N.Y, reprinted ICCA Journal, Vol. 19, No. 1, zipped postscript
  6. Yaoqing Gao, Tony Marsland (1996). Multithreaded Pruned Tree Search in Distributed Systems. Journal of Computing and Information, 2(1), 482-492, pdf
  7. Rainer Feldmann, Peter Mysliwietz, Burkhard Monien (1991). A Fully Distributed Chess Program. Advances in Computer Chess 6, pdf
  8. Rainer Feldmann (1993). Game Tree Search on Massively Parallel Systems. Ph.D. Thesis, pdf
  9. Christopher F. Joerg, Bradley C. Kuszmaul (1994). Massively Parallel Chess, pdf
  10. Bradley C. Kuszmaul (1994). Synchronized MIMD Computing. Ph. D. Thesis, Massachusetts Institute of Technology, pdf
  11. Bradley C. Kuszmaul (1995). The StarTech Massively Parallel Chess Program. pdf
  12. Raphael Finkel, John Philip Fishburn (1983). Improved Speedup Bounds for Parallel Alpha-Beta Search. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 5, No. 1, pp. 89 - 92
  13. Raphael Finkel, John Philip Fishburn (1982). Parallelism in Alpha-Beta Search. Artificial Intelligence, Vol. 19, No. 1
  14. John Philip Fishburn (1981). Analysis of Speedup in Distributed Algorithms. Ph.D. Thesis, pdf
  15. Robert Hyatt (1994). The DTS high-performance parallel tree search algorithm
  16. Mark Brockington (1996). A Taxonomy of Parallel Game-Tree Search Algorithms. ICCA Journal, Vol. 19: No. 3
  17. UIDPABS = Unsynchronized Iteratively Deepening Parallel Alpha-Beta Search
  18. CABP = Concurrent Alpha-Beta Pruning
  19. It should also be noted that Crafty uses threads on Windows, and used processes on Unix
  20. threads vs processes again by Robert Hyatt, CCC, February 27, 2006
  21. Jonathan Schaeffer (1985). Lionel Moser: An Experiment in Distributed Game Tree Searching. ICCA Journal, Vol. 8, No. 2 (Review)
  22. An Introduction to Computer Chess by Alejandro López-Ortiz, 1993
  23. Nagging from Wikipedia
  24. Re: scorpio can run on 8192 cores by Daniel Shawul, CCC, August 29, 2015
  25. Transposition-driven scheduling - Wikipedia
  26. Transposition driven scheduling by Daniel Shawul, CCC, April 04, 2013
  27. Dovetailing (computer science) from Wikipedia
  28. Georg Hager's Blog | Random thoughts on High Performance Computing
  29. Re: Minmax backup operator for MCTS by Brahim Hamadicharef, CCC, December 30, 2017
  30. Message Passing Interface from Wikipedia
  31. John Romein, Henri Bal, Jonathan Schaeffer, Aske Plaat (2002). A Performance Analysis of Transposition-Table-Driven Scheduling in Distributed Search. IEEE Transactions on Parallel and Distributed Systems, Vol. 13, No. 5, pp. 447–459. pdf
  32. Message Passing Interface from Wikipedia
  33. std::async - cppreference.com
  34. Parallel Search by [[Tom Kerrigan]
  35. "How To" guide to parallel-izing an engine by Tom Kerrigan, CCC, August 27, 2017

Up one level