Difference between revisions of "Oracle"

From Chessprogramming wiki
Jump to: navigation, search
Line 16: Line 16:
 
  One thing that we used to do with our chess program, [[Duchess]], was to have a special routine that ran before the [[Search|main search]] that identified strong and weak points in each side's respective position - weak squares, [[Passed Pawn|passed pawns]], weak [[King Safety|king protection]], etc, and tried to grow trees of moves to protect our own weak squares or attack the opponent's weak squares. These would be just sequences of moves by our own or our opponent's  pieces and pawns to reach the targeted squares. A table was build for the value of reaching each of these goals for each type of piece and for reaching the middle points along the paths (the value diminishing the further they were from the goal). Then during the search the program would get bonus points for reaching some of the intermediate points in the plan.
 
  One thing that we used to do with our chess program, [[Duchess]], was to have a special routine that ran before the [[Search|main search]] that identified strong and weak points in each side's respective position - weak squares, [[Passed Pawn|passed pawns]], weak [[King Safety|king protection]], etc, and tried to grow trees of moves to protect our own weak squares or attack the opponent's weak squares. These would be just sequences of moves by our own or our opponent's  pieces and pawns to reach the targeted squares. A table was build for the value of reaching each of these goals for each type of piece and for reaching the middle points along the paths (the value diminishing the further they were from the goal). Then during the search the program would get bonus points for reaching some of the intermediate points in the plan.
  
  This seemed most useful in the [[Endgame|endgame]] when actually reaching the culmination of a play might be beyond the [[Depth|search depth]], and where [[Tactics|tactics]] did not dominate as much as they do in the [[Middlegame|middle game]]; for example, Duchess was quite capable of finding long King maneuvers that might take the King far away from simpleminded heuristics such as "[[King centralization|centralize the King in the endgame]]" and that were too deep to be found by a direct search. It wasn't perfect; it could not take into account the changes in strategy that might be dictated by a radically different structure encountered deep in the search, but it seemed to be better than nothing.  
+
  This seemed most useful in the [[Endgame|endgame]] when actually reaching the culmination of a play might be beyond the [[Depth|search depth]], and where [[Tactics|tactics]] did not dominate as much as they do in the [[Middlegame|middle game]]; for example, Duchess was quite capable of finding long King maneuvers that might take the King far away from simpleminded heuristics such as "[[King Centralization|centralize the King in the endgame]]" and that were too deep to be found by a direct search. It wasn't perfect; it could not take into account the changes in strategy that might be dictated by a radically different structure encountered deep in the search, but it seemed to be better than nothing.  
  
 
==David Kittinger==
 
==David Kittinger==

Revision as of 15:15, 26 October 2018

Home * Knowledge * Oracle

John William Waterhouse - Consulting the Oracle [1]

Oracle,
an instance with the fame of perfect knowledge, that is evaluation without noise or errors, a form of divination, and a perfect supervisor in learning. For instance endgame tablebases act like an oracle, providing the optimal moves.

Pre-processing

A so called oracle approach is a knowledge instance applied at the root or the interior of the tree with some depth left to the horizon, which guides search and leaf evaluation for specific features or pattern, relevant in particular chess positions, and not in others. This applies to first order evaluation terms by initializing Piece-Square Tables, also dubbed Pre Scan Heuristics [2] as propagated by David Kittinger, as well as second order terms if related pattern match as described by Hans Berliner in Some Innovations Introduced by HiTech [3].

While Pre-processing seems somewhat outdated with todays search depths in conjunction with transposition table anomalies and resulting search instabilities, Oracle approaches may be used to control time management, to order moves, or, as proposed by Ronald de Man in scoring root moves by slightly shifting the alpha-beta windows [4].

Quotes

Bruce Wright

Bruce Wright on pre-processing in Duchess [5]

One thing that we used to do with our chess program, Duchess, was to have a special routine that ran before the main search that identified strong and weak points in each side's respective position - weak squares, passed pawns, weak king protection, etc, and tried to grow trees of moves to protect our own weak squares or attack the opponent's weak squares. These would be just sequences of moves by our own or our opponent's  pieces and pawns to reach the targeted squares. A table was build for the value of reaching each of these goals for each type of piece and for reaching the middle points along the paths (the value diminishing the further they were from the goal). Then during the search the program would get bonus points for reaching some of the intermediate points in the plan.
This seemed most useful in the endgame when actually reaching the culmination of a play might be beyond the search depth, and where tactics did not dominate as much as they do in the middle game; for example, Duchess was quite capable of finding long King maneuvers that might take the King far away from simpleminded heuristics such as "centralize the King in the endgame" and that were too deep to be found by a direct search. It wasn't perfect; it could not take into account the changes in strategy that might be dictated by a radically different structure encountered deep in the search, but it seemed to be better than nothing. 

David Kittinger

David Kittinger and Scott McDonald in 1984 on Pre Scan Heuristics of the Super Constellation [6]

A second departure from other commercial programs has been the simplification of the evaluation function as applied to the end nodes of the tree search. The programs instead rely heavily on specific chess knowledge which is concentrated into a special preprocessor which interfaces to the tree search primarily through the scores associated with specific ply-one moves. This ides of a ply-one move preprocessor was originally implemented in the program Tech by James Gillogly in the late 1960s. Although Tech only achieved a high 1400 rating running on a large computer, the strategy has certain appeal. First, chess tree searching has become very efficient, and second, the interaction problems associated with putting ever increasing amounts of chess knowledge in the tree become formidable. It has become apparent to that this rather simple approach might contain the structure of a master level microcomputer program. 

Peter Gillgasch

Peter Gillgasch on DarkThought's Oracle, February 1996 [7]

After toying with horrible slow code that did endpoint evaluations for the pieces I got the idea (as many did before, but I swear I didn't know that and hence I was quite proud of it) to analyze the root position and to modify the piece/square table relative to pawn structure and king locations. The piece/square table scores were also used for move ordering to cut down the tree size (this helped a lot since it is basically using the evaluation to order the moves at close to no cost).
Of course the program played much better than before but it had serious trouble when the pawn structure was highly flexible and if there was a way to lock its bishops with own pawns it was guaranteed to like exactly this line ;)
Later the three of us modified the oracle code (to use Berliner's term) and we where quite successful with it - DarkThought started to beat commercial PC programs. But it was quite hard to make further progress, so a lot of knowledge was moved to the endpoint evaluation up to the point that there was basically no oracle code left. This improved playing strength but it was always my impression that we suffered in the "tactical" aspect a bit since the search was no longer fast as hell and I was never convinced that going this way was the best decision. 

Chrilly Donninger

Quote by Chrilly Donninger from CHE: A Graphical Language for Expressing Chess Knowledge, 1996 [8]

The main design criterion for successor Nimzo-3 was combining the positional play of Nimzo-2 with the tactical strength of a program like Fritz. Nimzo-2 was developed on an 486/50 MHz PC, which calculated about 3,000 nodes per second. Thereof 60 to 70 percent was spent/wasted in the leaf evaluation routines. Hence, a major improvement in speed and thus in tactical strength could only be obtained by performing most of the evaluation either in the root or the interior of the search tree. So, Nimzo-3 became a Genius/Fritz-like program with a complex root evaluation, called Oracle, similar to Berliner's Oracle, and with a very simple, mainly first order evaluation at the leaves [9]. Nimzo-3 spends about 10 to 20% on leaf evaluation, its node rate has increased by 400% up to 12,000 nodes/second on the same hardware. 

Don Dailey

Don Dailey on Pre-processing, May 1998 [10] :

The term pre-processing is normally used in the context of an evaluation function. It's a process where you make most of your evaluation decisions BEFORE the search begins. Typically you determine where each piece should go and essentially build a 64 square table for each piece type of each color on the chess board. You might decide that the c file is open and so you give rooks a big bonus if they can get to the c file. Once you make this decision it doesn't change during the search. So if the c file suddenly gets blocked during the search, the rooks won't know this and still try to get on the c file.

Pre-processing has many advantages and disadvantages. Here is a list of them:

Pro's

  • Very fast, evaluation almost free.
  • Trivial to implement.
  • Can put huge amounts of knowledge in chess program with essentially no slowdown. This is because all the real work is done in a fraction of a second before the search begins.
  • Especially good with pawn patterns. (but not pawn structure)

Con's

  • The evaluation is very dependent on the starting position. Deeper searches suffer more than shallow one. Ironically, you can do deeper searches if you do lots of pre-processing.
  • Very difficult to put in dynamic endings where relationships change quickly.
  • Concepts like mobility are covered very inexactly. Pawn structure terms cannot be calculated with any reliability.
  • Makes it problematic whether to use hash table aging because it affects the hash tables in strange ways. The problem is that 2 identical positions can have different scores if they are descendants of different starting positions.

There are probably other pro's and con's I didn't think of. Many really good programs use this method, but almost always there is a mixture of pre-processing with more dynamic stuff like pawn structure which might be hashed, or calculated quickly with incremental updating schemes. There is quite a bit of diversity and opinion on this subject. Some swear by it and others shun it completely. I don't use it any more but my older programs did. My feeling is that with ever increasing depth, it becomes more and more troublesome. Imagine a program doing 60 ply searches. The end nodes of the tree will not bear very much resemblance to the initial position where the decisions were made. But since we don't usually look 60 ply deep pre-processing is still a workable method.

But I believe you can still write very strong programs that do a substantial amount of pre-processing. Fritz is one example. Genius also does substantial pre-processing. I think probably more programs use this than not. My old program RexChess was almost all pre-process with a little dynamic pawn structure mixed in! Pre-processing does not necessarily have to be pure piece/sq tables. You can make other evaluation decisions in advance too and prepare them for fast execution during the search.

Ulrich Türke

Reply by Ulrich Türke [11] :

I too think that pre-processing is still done in most of the current chess programs, more or less. I am considering the possibility of a moderated pre-processing, i. e. apply the time-consuming pre-processing evaluation parts not at the root but to nodes at depth - say - the half of the aimed search depth. Thus, one would still have the main advantage, saving a lot of calculation time, because the number of pre-processed nodes would still be a microscopic fraction of the leaf nodes. But on the other hand, the pre-processed knowledge would be applied to nodes being far closer to the leaf nodes (in your examples "only" 30 ply instead of 60). Opinions? 

Amir Ban

Note by Amir Ban [12] :

It's a fair but simplistic description of Junior. 

Don Dailey

Don Dailey again [13] :

I've been considering this type of approach for a long time. I think it definitely has merit. It would suffer from hash table anomalies but there are ways to deal with this, the primary one to simply ignore them.
There are a few issues to be dealt with but my feeling is that you might do well with this scheme. Scoring might be a little fuzzy at times but that might be ok. Probably the scheme would be to pre-process at every level up to MAX_DEPTH-n, where n might be at least 3 or 4 ply. Keep in mind that pre-processing is incredibly expensive even compared to fully dynamic processing, so I suspect you cannot get very close to the leaf nodes. Of course 3 or 4 ply away might be considered pretty close if you were doing 14 ply searches.
The way I got this idea (which I'm not taking credit for) actually began when I was programming the ending Bishop and Knight vs King with a pre-processor program of mine a few years ago. This is a very good exercise for someone wanting to understand some of the difficulties of pre-processing. I tested with a 3 ply search, and tuned the thing up to do quite well. But it turned out to not work so well at other levels! It turns out that no matter what table you use, they will tend to be optimum for a specific depth only. It was almost funny because I kept figuring out which positions in the search were messing it up and changed the tables accordingly. These anticipations were wrong though at different levels! Finally though, I extracted the most general principles possible and created a table that was almost static (not based as much on the placing of the pieces) and it did ok, but not as good as the depth specific ones. Some pre-processors will suggest a move and for this ending that would work well. The idea is to give a bonus to the whole search tree under some root move. For example I might give a small bonus for playing any move that's a pin. This makes the program slightly in favor of playing a pin. I call these "ply 1 incentives" but there is probably better terminology in use. This is also another example of pre-processing. See note at end of my post.
But your idea would solve this and a host of other problems. You could make very good guesses about configurations if you knew the possible changes to a position were limited. This would enable you to be more aggressive about the knowledge you included.
Here is another idea I had which is more dynamic in nature. It occurred to me that you could do a fairly decent evaluation based on pawn position only, the idea being a separate set of tables for each pawn structure. These could be built by the pawn structure hash table mechanism most programs have. Most pieces can be judged well by the pawn structure around them.
I did not pursue this idea because I fear it may still be too slow. If my hash table hit rate was say 95 percent, then 1/20 of the positions would require a pre-processing step. While this does not seem like much, you have to consider how expensive building the table is. It's pretty expensive, especially if you want sophistication. But on top of that you must recompute the positional score each time the pawn structure changes, EVEN if the table itself is ready to go. This should be fairly fast but not trivial in terms of execution speed. You have already killed some of the speed you were hoping to gain. Is the idea workable? Maybe, but I'm skeptical. Also, the pawn structure is not the only relevant factor so other evaluation would not doubt be needed. For instance you cannot look at the pawn structure and know for sure that you are in an endgame. You could make assumptions though based on the exact position you had when you built the tables.
Anyway the idea is interesting and I sometimes toy with the idea of trying it out, but it seems like a lot of work for an experiment that may not pay off. It could well be that someone dedicated to making it work might find imaginative solutions to some of the problems. But your idea may be more sound. It may very well be that some are using it. I've never heard of a specific example of it though. 
I think an old program called Tech used the method of "ply 1 move incentives." There was no evaluation at all (the way I remember it) except for material value and a list of moves was prepared in order of goodness from a pre-processor at the root. The program simply played the first move it came to in the move list that was at least as good tactical as any other. I'll bet Bob Hyatt knows about this one. I had a nice discussion once with the programmer when he was giving a talk on hash table collisions in chess programs. 

Vincent Diepeveen

Vincent Diepeveen on Pre-processing, October 2000 [14]

It's pretending you know in advance what is the best move and just let the program figure out the tactics. It's fast and of course my first line is a rude outline of what it is, but it's definitely no compare to a leaf evaluation.

So the 2 things against each other:

  • Prescan heuristics/ piece square tables:
fill a table BEFORE you search and apply there the knowledge, so for example for each piece at each square and in your makemove only add the score from the tables and do nothing in your leafs (don't see why one would need a qsearch in fact for such programs a 'guess' function should be plenty enough). No evaluation needed. Even passers you can easily do incremental, no problem.
  • Leaf eval: you can sure use the makemove to build up data structures, but the evaluation is NOT dependent upon the moves made, but only upon the position and you go scan the board position in each leaf for chess knowledge.

So the big difference is for example when exchanging queen: Chess Tiger used (don't know whether it's still a pre-processor) to be happy before queen exchange, then after queen exchange it's suddenly very unhappy. Old Fritz versions except the latest 6a, 6b etcetera were also pre-processors.

Of course many programmers figured out that using a pre-processor only is very hard to play chess, so usually a few big scores they do in eval like passed pawns. Many old programs simply had no king safety at all.

Nimzo is a classical case of pre-processing too. if you checkout its book you'll see it hardly plays openings with very tough pawn structures without having all tough lines in book, as otherwise the pre-processor makes big mistakes as it doesn't see the position change in the root of course.

Now where is the big advantage of pre-processors apart that you can do it faster, why have they been so pretty successful in the past? I think very important reason is that it's also very easy to debug a pre-processor when comparing that with a leaf evaluation.

See also

Publications

Forum Posts

Re: computer chess "oracle" ideas... by Ronald de Man, rgcc, April 3, 1997
Re: computer chess "oracle" ideas... by Ronald de Man, rgcc, April 7, 1997
Re: What is "pre-processing" by Don Dailey, CCC, May 07, 1998

External Links

ORACLE (computer) from Wikipedia

References

  1. Consulting the Oracle by John William Waterhouse, 1884, showing eight priestesses in a temple of prophecy, Oracle from Wikipedia
  2. PSH from Schachcomputer.info Wiki (German)
  3. Hans Berliner (1987). Some Innovations Introduced by Hitech. ICCA Journal, Vol. 10, No. 3
  4. Re: computer chess "oracle" ideas... by Ronald de Man, rgcc, April 7, 1997
  5. Bruce C. Wright (1996). Routines prior to Main Search. Computer Chess Reports, Vol. 5, Nos. 3+4, p. 102
  6. David Kittinger and Scott McDonald (1984). Report from the U.S. Open. Computer Chess Digest Annual 1984 pp. 15-33
  7. Drawn games (Was Re: Transposition table) by Peter Gillgasch, rgcc, February 06, 1996
  8. Chrilly Donninger (1996). CHE: A Graphical Language for Expressing Chess Knowledge. ICCA Journal, Vol. 19, No. 4
  9. This approach seems to have been invented by Kaare Danielsen for his program CXG Advanced Star Chess, Quote from Donninger's CHE paper
  10. Re: What is "pre-processing" by Don Dailey, CCC, May 07, 1998
  11. Re: What is "pre-processing" by Ulrich Türke, CCC, May 08, 1998
  12. Re: What is "pre-processing" by Amir Ban, CCC, May 08, 1998
  13. Re: What is "pre-processing" by Don Dailey, CCC, May 08, 1998
  14. Re: What is Prescan Heuristics ?? by Vincent Diepeveen, CCC, October 13, 2000

Up one Level