Evaluation Overlap


by Mark Watkins

For historical perspective, consider the following, from page 62 of Hayes and Levy, World Computer Chess, Stockholm 1974.

=Numerical Evaluation of Positions=
In the section devoted to notes on the competing programs, the respective evaluation functions for CHAOS and The Ostrich are discussed in some detail. CHAOS uses nineteen features and The Ostrich only thirteen, but the latter plans to increase this number in later incarnations. They have approximately eleven features in common: 'approximately', because the concepts are divided differently. The overlap here is not surprising since all scoring functions aim to embody 'reliable' chess heuristics: the weightings may differ, but the features themselves are similar. One of the eleven concepts shared by CHAOS and The Ostrich is material: this is the dominant factor not only in these two programs but also in the majority of chess playing programs. Exceptions are Freedom, which stresses mobility, and Papa, which concentrates on 'entropy' (basically a mobility measure). Other shared concepts include mobility, control of the centre, castling, king safety and assorted terms concerned with pawn structure: the latter include bonus points for occupancy of the centre, advancement, passed pawns, a bonus for doubling opponents' pawns with a penalty for doubling one's own, and a penalty for blocking development of one's own pieces. The delicate matter of handling pawn structures has not been mastered satisfactorily by any current program [...] For instance, all seem to place disproportionate faith in the concept of doubling the opponent's pawns (see records of games 1 and 4). CHAOS also considers the number of threatened pieces, pins and discovered checks, king end-game position, and capturing and mobility potential.

The Ostrich has an interesting term, concerned with tempi, which aims to penalize time-wasting moves such as taking two moves to reach a square that could be reached in one, or repeating a move (see game 24, where the Swiss program Tell causes a draw by repetition). [...]

CHAOS
4. Evaluation function. [...] Nineteen factors are taken into account, and their weights modified according to the stage reached in the game.
 * Threatened pieces, the sum of the value of all pieces which have enemy pieces bearing down on them, but not necessarily en prise.
 * Capturing potential, an 'adjusted' sum of the value of capturing potentials on all squares on the board.
 * Mobility, defined as the number of legal moves.
 * Centre control, which covers both occupancy and attack on centre squares and reduces in importance as the game progresses.
 * Pins and discovered checks.
 * Material, which contributes the greatest amount to the evaluation. Values are the conventional ones with no values assigned for the king.
 * Queen development, a penalty for developing early.
 * Double threats and captures. Double threats carry the value of the second most valuable piece able to be captured by the moving side, on the grounds that if two pieces can be attacked, the lower valued one will be captured. Captures represent the total value of pieces more strongly attacked than defended. This is also an 'adjusted' value which attempts to approximate the net value to the side moving, after an exchange has taken place.
 * Attacked pieces, the total number of pieces attacked more strongly than defended.
 * Rook usage, which rewards castling, doubling of rooks, occupation of open files, and rooks behind passed pawns.
 * Mobility potential, a measurement of the legal moves as well as the 'not quite legal' moves (e.g. moves which guard, or would have been legal if out of check, or which bear through and along the line of attack of another piece, etc.). This carries less weight than mobility proper.
 * Pawn usage, which includes pawn advancement, totally or partially unblocked pawns, connectedness, doubled pawns, etc.
 * King endgame position. There are rewards for forcing the enemy king to the edge, stopping an opposing unblocked pawn, staying within the 'square' of the opponent's passed pawn, and king opposition, etc.
 * Development. The early development of minor pieces is encouraged.
 * Queen pins, which counts pieces pinned against the queen as well as discovered attacks on the queen.
 * Attack on king. There is a reward for attacking close to the opponent's king.
 * Best capture, the value of the highest valued piece left en prise after a move.
 * King safety. There are incentives for attacking close to one's own king.

The evaluation function has been based as far as possible on general principles, avoiding special cases. For testing purposes the weights of these factors were set differently for each side while the program played against itself. This yielded valuable information about best settings.

The Ostrich
The static evaluation function. This consists of thirteen subroutines each corresponding to a basic chess heuristic:
 * Material. The subroutine which computes the difference between White's and Black's material contributes the greatest single value to the overall scoring function. The pieces have their conventional values.
 * Material ratio term, which computes whether an even exchange of material has occurred between the top node of the tree and the bottom position being evaluated; a bonus goes to the side ahead in material.
 * Castling.
 * Board control, which is intended to increase one's own mobility and restrict one's opponent's. There is a small bonus for each square controlled; centre squares and squares near the enemy king have the greatest score. 'Control' is defined as the ability of the piece in question to capture a hypothetical enemy piece on that square.
 * Tempi. Moving the same piece twice in the opening, moving a king or rook before castling, moving a piece back to its immediately previous position and moving to a square in two moves when it could be done in one, all attract penalties.
 * Early queen moves. These attract a penalty before the eighth move of the game; by which time most minor pieces are developed and the king has castled.
 * Blocking central pawns. 'Clogging' a position is penalised.
 * Development of pieces. Rapid development is encouraged by giving a penalty to unmoved minor pieces or central pawns.
 * Central pawns. These carry a bonus.
 * Pawn structure. Advancement of pawns is encouraged and doubled pawns penalised.
 * King safety. To guard against king-side pressure on the part of the opponent the program encourages its own pieces in its own king-sector.
 * Passed pawns. The goal is to encourage the advancement and queening of pawns along with trading off the opponent's passed pawns before they become too advanced. A passed pawn receives credit according to its advancement.

Chess 4.0
[...] In addition to material score in Chess, terms are added which express in a primitive way, notions of mobility (number of squares attacked), pawn structure (passed, isolated, doubled, backward, etc.), piece placement (e.g. rooks on seventh rank), and king safety (king in castled position, adequate pawn cover). [...]

Kaissa
One of the papers which is available (Adelson-Velskii et al. 1970) describes the program as it was in the late 1960s and from the evidence available it is likely that Kaissa has many of the same features. [...] This paper has notes on the evaluation function; the present function is likely to be similar, if not identical.

Bonus points are given for:
 * 'A phalanx', i.e. two pawns side by side, on ranks 4-7 for White, and ranks 5-2 for Black. Three pawns side by side count as two phalanxes.
 * Centre pawns.
 * Pawn attack on the centre.
 * Passed pawns.
 * 'Scope', which is the calculation of the influence that each piece exerts on all squares, occupied or unoccupied by either own or enemy pieces.
 * Attack on undefended pieces and pawns on squares adjacent to the king.
 * Attack by a minor piece on a 'hole' (weak square). A hole is defined as a square, on [ranks] 1-5 for white and 8-4 for black, under attack by an enemy pawn, undefended by own pawn and with no chance of defence even after pawn advance.
 * Minor pieces standing in an opponent's hole.
 * Knights in the centre.
 * Rook on an open file, or threatening an open file.
 * Castling.

Penalties are incurred by:
 * A hole.
 * A weak pawn (one behind a hole).
 * Isolated pawns.
 * Doubled pawns.
 * Pawns which are isolated and doubled.
 * Forfeiture of castling.
 * Opponent's castling.
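
The phalanx rule quoted for Kaissa can be illustrated with a small sketch. This is not Kaissa's code: the function name `count_phalanxes` and the `(file, rank)` square encoding are assumptions made here, and only the counting rule (adjacent pawns on the same rank, within the stated ranks) comes from the text.

```python
def count_phalanxes(pawns, white=True):
    """Count pairs of own pawns standing side by side.

    pawns: iterable of (file, rank) pairs, file 0..7 (a..h), rank 1..8.
    Only ranks 4-7 count for White and ranks 5-2 for Black, as in the
    quoted description. Three pawns in a row yield two adjacent pairs,
    which matches "three pawns side by side count as two phalanxes".
    """
    allowed = range(4, 8) if white else range(2, 6)
    pawn_set = set(pawns)
    # Count each pair once, from its left-hand pawn.
    return sum(
        1
        for (f, r) in pawn_set
        if r in allowed and (f + 1, r) in pawn_set
    )


# White pawns on d4, e4, f4: two phalanxes.
print(count_phalanxes([(3, 4), (4, 4), (5, 4)]))
```

Counting from the left-hand pawn of each pair avoids double counting without needing to sort the pawns.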

=Analyse=
Another thing to discuss could be Chapter 6 of Hartmann's Notions of Evaluation Functions tested against Grandmaster Games in Advances in Computer Chess 5 (1989).

Here is a brief synopsis. First there are three notions of mobility: Legal Moves, Pseudo Moves, and de Groot Moves. These should be self-explanatory.

Center Control
With B as a bonus for a given square, this is B*[AT+2*OC] where AT is #attackers and OC is #occupants.
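
As a rough illustration of the formula, the sketch below assumes the term is summed over a set of centre squares. The bonus table `BONUS` is hypothetical (the synopsis does not give the values of B), as are the function name and the dictionary inputs.

```python
def centre_control(bonus, attackers, occupants):
    """Sum B * (AT + 2*OC) over the squares listed in `bonus`.

    bonus:     square name -> bonus B for that square
    attackers: square name -> number of own attackers (AT)
    occupants: square name -> number of own occupants (OC), 0 or 1
    """
    return sum(
        b * (attackers.get(sq, 0) + 2 * occupants.get(sq, 0))
        for sq, b in bonus.items()
    )


# Hypothetical bonus values; not taken from Hartmann's paper.
BONUS = {"d4": 2, "e4": 2, "d5": 2, "e5": 2}

# One attacker on d4, a piece on e4, two attackers on e5.
print(centre_control(BONUS, {"d4": 1, "e5": 2}, {"e4": 1}))
```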

Levy Development
This is a rather complicated construct that doesn't seem of much direct relevance here.

Queen Bonus
Chess 4.5 uses 0.8*AT-0.8*DI where AT is #squares attacked that are not attacked by enemy pawns or minor pieces, and DI is the minimum of the rank/file differences to the enemy king. ANALYSE rescaled this to AT-DI (so as to make everything integral).
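
A minimal sketch of the two variants follows. It reads "the minimum of the rank/file differences" as min(|file difference|, |rank difference|), which is an interpretation rather than something the synopsis states outright; the function names and the 0-based `(file, rank)` encoding are likewise assumptions.

```python
def queen_distance(queen, enemy_king):
    """DI: minimum of the file and rank differences to the enemy king.

    Squares are (file, rank) pairs with both coordinates in 0..7.
    """
    return min(abs(queen[0] - enemy_king[0]), abs(queen[1] - enemy_king[1]))


def queen_bonus_chess45(at, di):
    """Chess 4.5 form: 0.8*AT - 0.8*DI."""
    return 0.8 * at - 0.8 * di


def queen_bonus_analyse(at, di):
    """ANALYSE form, rescaled to integers: AT - DI."""
    return at - di


# Queen on d1, enemy king on g8, 10 safe attacked squares.
di = queen_distance((3, 0), (6, 7))
print(di, queen_bonus_analyse(10, di))
```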

Rook Bonus
Chess 4.5 used 1.6*AT-1.6*DI+8*DR+8*OP+3*SO+22*SR where AT is #squares attacked, DI as above, DR is #rooks in file or rank w/o intervening pieces, SR is #rooks on 7th rank, OP is #rooks on open files, SO is #rooks on semi-open file (defined as File in which there is no [own] Pawn, and at least one enemy Pawn that is not defended by a Pawn and that cannot move one single square w/o being attacked by a Pawn). ANALYSE rescaled to make it integral.
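
The Chess 4.5 rook term can be written out as below, taking the six counts as already computed. The function name and keyword arguments are assumptions; the integral rescaling used by ANALYSE is not given in the synopsis, so it is not reproduced here.

```python
def rook_bonus_chess45(at, di, dr, op, so, sr):
    """Chess 4.5 rook term: 1.6*AT - 1.6*DI + 8*DR + 8*OP + 3*SO + 22*SR.

    at: squares attacked          di: distance term, as for the queen
    dr: doubled rooks             op: rooks on open files
    so: rooks on semi-open files  sr: rooks on the 7th rank
    """
    return 1.6 * at - 1.6 * di + 8 * dr + 8 * op + 3 * so + 22 * sr


# Doubled rooks, one on an open file and one on the 7th rank.
print(rook_bonus_chess45(at=10, di=2, dr=1, op=1, so=0, sr=1))
```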

Bishop Bonus
Jaap Herman helped with this, as PS+EM+OM+DL+OB+OW, where:
 * PS is #pseudomoves.
 * EM is sum of enemy material on diagonals of the Bishop (unclear if it includes x-rays, even for blocked pawns?), where P=0, N=B=3, R=5, Q=9, K=10.
 * OM is sum of own material forwards on diagonal of the Bishop, where P=0, N=B=3, R=5, Q=9, K=0.
 * DL is length of diagonals [#pseudomoves on empty board] minus 7.
 * OB is pawn obstruction of enemy pawns (-25, -15, -10, -5, +1 according to specific characteristics).
 * OW is own pawns: every pawn on diagonal in front of bishop is -5 if rank < 4 and +1 otherwise, and every pawn on diagonal behind the bishop is +1.
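
Two of these components are simple enough to sketch: the DL term and the material sums behind EM and OM. The code below is only an illustration. It takes the list of pieces found on the bishop's diagonals as given, since the synopsis leaves open how x-rays are handled, and it does not attempt OB or OW. All names are assumptions.

```python
# Piece values from the synopsis; the enemy king counts 10, the own king 0.
ENEMY_VALUE = {"P": 0, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 10}
OWN_VALUE = {**ENEMY_VALUE, "K": 0}


def diagonal_length(file, rank):
    """Number of bishop moves from (file, rank), 0..7 each, on an empty board."""
    total = 0
    for df, dr in ((1, 1), (1, -1), (-1, 1), (-1, -1)):
        f, r = file + df, rank + dr
        while 0 <= f < 8 and 0 <= r < 8:
            total += 1
            f, r = f + df, r + dr
    return total


def dl_term(file, rank):
    """DL: empty-board diagonal length minus 7 (0 in a corner, 6 on d4)."""
    return diagonal_length(file, rank) - 7


def material_on_diagonals(pieces, values):
    """Sum the table values of the piece letters found on the diagonals."""
    return sum(values[p] for p in pieces)


print(dl_term(0, 0), dl_term(3, 3))
```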

Knight Mobility
Hartmann discusses having this, and then points out that it doesn't seem to contribute one way or the other, so he concludes "It is therefore no use to have a separate mobility component for the knight, detached from the Pseudo Mobility."

King's Mobility
Again Hartmann points out that this might be useless, but it seems OK in the endgame.

Attack/Defence

 * Different Attacked Squares: A number from 3 to 64 for each side.
 * Attacked Own Squares: Only count those squares on ranks 1-4.
 * Attacked Opponent Squares: Only count squares on ranks 5-8.

=Final Definition=
MAT+PS+DIF+0.5*CEN+LD+2*RB+QB+2*BB+0.5*DEF+OFF

MAT is material, PS is #pseudo-legal moves, DIF is #different attacked squares, CEN is centre control, LD is Levy development, RB is rook bonus, QB is queen bonus, BB is bishop bonus, DEF is #attacked own squares, OFF is #attacked opp squares.
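
As a weighted sum, the final definition could be written as in the sketch below. The weights are those of the formula above; the dictionary layout and the function name `analyse_score` are assumptions, and the individual terms are taken as already computed.

```python
# Weights from the final definition; terms not listed there are not used.
WEIGHTS = {
    "MAT": 1, "PS": 1, "DIF": 1, "CEN": 0.5, "LD": 1,
    "RB": 2, "QB": 1, "BB": 2, "DEF": 0.5, "OFF": 1,
}


def analyse_score(terms):
    """Combine precomputed term values using the weights above.

    terms: mapping from term name (e.g. "MAT") to its value for one side.
    Raises KeyError if a term is missing, so omissions are not silent.
    """
    return sum(weight * terms[name] for name, weight in WEIGHTS.items())


# With every term equal to 2 the result is twice the sum of the weights.
print(analyse_score({name: 2 for name in WEIGHTS}))
```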

=See also=
 * Quantifying Evaluation Features
 * Rybka Controversy
