In this article we will discuss the minimax and negmax procedures of heuristics in game playing.
Minimax Procedure of Heuristics in Game-Playing:
The heuristic search techniques are not directly applicable to the search for a winning strategy in board games. The basic difficulty is that there is no single searcher looking for a path from the source to the goal state (not just one player). Rather, games involve two players, each determined to defeat the other.
This fundamentally alters the nature of the search to be conducted. By combining the results of the static evaluation function with two fairly obvious heuristic rules, we obtain the minimax procedure first clearly stated by Claude Shannon (1949).
The two heuristic rules state:
(a) When A has the option to move, she will always choose the alternative which leads to the MAXIMUM benefit to her, that is, she will move to the subsequent state with the largest value of fj (f being the static evaluation function).
(b) When B has the option to move, he will always choose the alternative which leads to the MINIMUM benefit for player A, that is, he will move to the subsequent state with the smallest value of fj.
There are several salient features of the minimax procedure. First, it assumes a perfect opponent, that is, player B never makes a bad move from his point of view. Second, the larger the number of look-ahead plies, the more information can be brought to bear in evaluating the current move.
In the best case of a very simple game in which an exhaustive search is possible, all nodes are evaluated, and the win/loss/draw information can be propagated back up the tree to indicate precisely the optimum move for player A.
Examples of each of these moves are shown in Figs. 5.3 and 5.4. In these examples, the static evaluation function produces a normalised number in the range 0 ≤ fj ≤ 1 which is proportional to the value of the node to player A: "1" would guarantee player A a win and "0" means a win for player B.
If, after one or more ply (levels of play), the two moves open to player A evaluate to 0.23 and 0.67, her wisest move is to the 0.67 position since it will maximise her advantage. Since the minimax procedure specifies that she should choose the move with the highest value, this value propagates upward, giving a value of 0.67 to her starting position (Fig. 5.3).
By similar arguments, in calculating her subsequent move player A must assume that B will choose the alternative indicated, which minimises the value of the board position to player A (0.15, Fig. 5.4).
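The back-up of values in Figs. 5.3 and 5.4 can be sketched in a few lines of Python. This is only an illustration: the 0.23, 0.67 and 0.15 values come from the text, while the 0.40 sibling and the function name are assumptions.

# Hedged sketch of the value back-up in Figs. 5.3 and 5.4.
def backed_up_value(children_values, a_to_move):
    # Rule (a): player A maximises; rule (b): player B minimises the value to A.
    return max(children_values) if a_to_move else min(children_values)

# Fig. 5.3 -- A to move over positions worth 0.23 and 0.67: she takes 0.67.
print(backed_up_value([0.23, 0.67], a_to_move=True))    # 0.67

# Fig. 5.4 -- B to move: he picks the position worst for A.
print(backed_up_value([0.15, 0.40], a_to_move=False))   # 0.15 (the 0.40 sibling is assumed)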
Pseudo code for minimax is shown below:
To find the minimax score for a node n
find minimax score of each child of n
if it is Arisha’s turn
score of n is the maximum of children’s scores
if it is Bobby's turn
score of n is the minimum of children’s scores.
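A direct Python rendering of this pseudo code might look as follows. It is a minimal sketch: the tree representation (nested dictionaries with a static score at the leaves) and the sample values are assumptions made for illustration.

# Recursive minimax following the pseudo code above.
def minimax(node, arisha_to_move):
    children = node.get("children", [])
    if not children:                        # leaf: return its static score fj
        return node["score"]
    child_scores = [minimax(c, not arisha_to_move) for c in children]
    if arisha_to_move:                      # Arisha's turn: maximum of children's scores
        return max(child_scores)
    return min(child_scores)                # Bobby's turn: minimum of children's scores

# Tiny example tree (values are illustrative only):
tree = {"children": [
    {"children": [{"score": 3}, {"score": 5}]},
    {"children": [{"score": -1}, {"score": 9}]},
]}
print(minimax(tree, arisha_to_move=True))   # 3: Bobby's replies limit Arisha to 3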
Adversarial Nature of Game-Playing Affects the Evaluation Function:
In the dominoes game (Fig. 5.2) we were able to assign each leaf node as a definite win either for A or for B. By tracing back we were able to assign a similar value to each intermediate board position. But normally such complete information is not available and we have to rely on a heuristic value.
The value of this heuristic function depends upon the nature of the game. Fig. 5.5 shows an example tree with an evaluation for each position, which can take a value between +10 and -10. Heuristic values are the un-bracketed numbers. Only a part of the tree is shown. Dashed lines represent moves of Bobby, whereas moves of Arisha are shown by thick lines.
For Arisha's move there are some good positions with scores of 5 and 7, and some very bad ones (score = -10). But she cannot just decide on her own to take the path to the best node, j, with score 7. In order to reach node j she must first move to c, and Bobby might then choose to move to position g rather than to f. So Arisha would never reach node j and achieve an evaluation of 7. Now let us consider some nodes deeper in the tree (toward the leaves).
Consider position i. It is Bobby's move and he will obviously move to the best position for him, that is, the child with the minimum score, -3. The static heuristic value at node i was 2, but Arisha cannot rely on that figure once she looks ahead at Bobby's reply. By backing the look-ahead value up, the value assigned to node i is changed to -3 (shown in brackets) instead of the original value, 2.
Now consider node d, which is inside the tree. It is Arisha's move. If she has predicted Bobby's play (using the argument above), her two possible moves are to h with score -2 or to i with revised score -3. She wants the best move for herself, that is, the maximum score.
Thus the move made would be to h, and position d can be given the revised score of -2 (in brackets). This process has been repeated for the whole tree. The numbers in brackets show the revised scores for each node and the solid lines show the chosen moves from each position. And this is the special feature of the heuristic function: it does not remain static, because of the adversarial nature of the game.
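A few lines of Python make this re-scoring explicit. Only the values 2, -3 and -2 come from the discussion above; the sibling value 4 under node i is an assumption made for illustration.

# Re-scoring nodes i and d of Fig. 5.5 (child values partly assumed).
static_i = 2                       # original heuristic value of node i
i_children = [-3, 4]               # Bobby to move at i; the 4 is an assumed sibling
revised_i = min(i_children)        # Bobby minimises, so -3 replaces the static 2

d_children = [-2, revised_i]       # Arisha to move at d: h scores -2, i is now -3
revised_d = max(d_children)        # Arisha maximises, so d gets the revised score -2

print(static_i, revised_i, revised_d)   # 2 -3 -2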
Since with this procedure one chooses the minimum (for the adversary's move) and the maximum (for one's own move), the process is called minimax search.
Limitations of Minimax during Heuristics in Game-Playing:
Now suppose that Arisha moves to c and Bobby, instead of playing his best move, moves to f. Arisha will then be able to respond with a move to j, giving a score of 7, rather than the worse score of 1 (the value in brackets) obtained by moving through g. On the other hand, if Arisha does not take the indicated moves, a good opponent will force her down below the minimax score. Thus a risk is involved in minimax.
The effectiveness of the minimax procedure is limited by the depth of the game tree, which itself is limited by the time needed to construct and analyse it (the time increases exponentially with the depth of the tree).
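In practice the search is therefore cut off at some fixed depth and the static evaluation is used at that horizon. Here is a minimal depth-limited sketch; the tree representation, the scores and the depth limit are all assumptions made for illustration.

# Depth-limited minimax: the work grows roughly as b**depth for branching
# factor b, so the look-ahead is cut off and fj is applied at the horizon.
def minimax_limited(node, depth, arisha_to_move):
    children = node.get("children", [])
    if depth == 0 or not children:          # horizon reached, or a true leaf
        return node["score"]                # fall back on the static evaluation
    values = [minimax_limited(c, depth - 1, not arisha_to_move) for c in children]
    return max(values) if arisha_to_move else min(values)

tree = {"score": 0, "children": [
    {"score": 5, "children": [{"score": -2}, {"score": 8}]},
    {"score": 3, "children": [{"score": 1}, {"score": 4}]},
]}
print(minimax_limited(tree, depth=1, arisha_to_move=True))   # 5: children judged statically
print(minimax_limited(tree, depth=2, arisha_to_move=True))   # 1: one more ply changes the choice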
There is another risk involved in the minimax heuristic: the static value assigned to a position and its actual backed-up value can differ sharply from one ply to the next, which makes it hard to decide how far to expand the game tree.
For example:
Imagine that Arisha looks ahead only two moves, up to the level d-g. A minimax search at this level gives a score of 5 to b and -2 to c, so Arisha would move to b, whereas by looking further ahead c would be better. Looking even further ahead the choice might change again, and it does. These rapid changes in fortune are a constant problem in determining when to stop examining the game tree.
Horizon Problem:
Although the minimax procedure gives the optimal move with respect to the nodes which have been examined, there is no guarantee that the score is optimal. The actual score may turn out better or worse as the game proceeds. This is so because the tree examined in determining the next move is only a part of the whole tree, and in fact the whole tree cannot be searched.
Let us once again look at the figure of the minimax game tree (Fig. 5.5). Nodes (positions) a, b, d and e have the same heuristic score, 5; that is, they lie on a plateau, as in the case of hill climbing. They tell us nothing about the direction in which the game is moving. A better estimate of the score for each position is obtained only by looking beyond the plateau. This effect, the horizon effect (a term coined by Berliner, 1977), plays a dominant role in selecting chess heuristics.
So the minimax heuristic:
1. Does give us a good score but not necessarily a good solution.
2. Gives no indication as to which nodes to examine further.
If no other knowledge is available to guide our search, the tree around a plateau is examined in a breadth-first manner, which becomes quite cumbersome.
One rule for examining nodes is to look precisely at those nodes where there is a lot of change, that is, ignore the plateaux. This is based on the observation that rapid changes in the evaluation function represent interesting traits of the game.
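One way to capture this rule in code is a quiescence-style cut-off: stop expanding positions whose children all score close to one another, and keep searching where the evaluation swings. The tree shape, the scores and the threshold of 3 below are assumptions made for illustration.

# Hedged sketch: expand only the "interesting" nodes, ignore the plateaux.
def quiet_search(node, arisha_to_move, threshold=3):
    children = node.get("children", [])
    if not children:
        return node["score"]
    child_static = [c["score"] for c in children]
    if max(child_static) - min(child_static) < threshold:
        return node["score"]                # a plateau: accept the static value
    values = [quiet_search(c, not arisha_to_move, threshold) for c in children]
    return max(values) if arisha_to_move else min(values)

tree = {"score": 5, "children": [
    {"score": 5, "children": [{"score": 4}, {"score": 6}]},      # quiet subtree: not expanded
    {"score": -2, "children": [{"score": -10}, {"score": 7}]},   # large swings: searched deeper
]}
print(quiet_search(tree, arisha_to_move=True))   # 5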
Negmax Procedure of Heuristics in Game-Playing:
An elegant simplification of the minimax formalism was introduced by Donald Knuth and R.W. Moore (1975). The negmax procedure uses the same function as the minimax procedure to evaluate each node when backing up values from the successor nodes.
Assume the evaluation function for the jth terminal node in a complete tree is given as:
fj = -1 for loss
fj = 0 for draw
fj = + 1 for win
The negmax evaluation function F(j) may then be written:
F(j) = fj for terminal nodes
F(j) = MAX {-F(i1), -F(i2), …, -F(in)}
for a node with successor states i1 to in.
The formalism states that the best move for either player is the one which maximises the negated value -F(i) over the successor positions i. This is illustrated in the simple win/loss/draw game tree shown in Fig. 5.6. By selecting the move which maximises the negative of the successor position's evaluation, the negmax formulation implements the minimax heuristic.
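A compact Python sketch of the negmax back-up follows. Every value F is taken from the viewpoint of the player to move at that node; the tree representation and the particular win/loss/draw values are assumptions in the spirit of Fig. 5.6.

# Negmax: each player simply maximises the negated scores of the successors.
def negmax(node):
    children = node.get("children", [])
    if not children:
        # Terminal node: fj is -1 (loss), 0 (draw) or +1 (win) for the player
        # who is to move at this terminal position.
        return node["f"]
    return max(-negmax(child) for child in children)

# Terminals here lie two plies down, so their fj is scored for the root player.
tree = {"children": [
    {"children": [{"f": -1}, {"f": 0}]},   # the opponent answers with the -1 line
    {"children": [{"f": 0}, {"f": 1}]},    # the opponent can do no better than a draw here
]}
print(negmax(tree))   # 0: by moving to the second child the root player secures a draw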