What is alpha-beta pruning algorithm?
As we said before, the disadvantage of minimax search always needs to traverse the whole game tree. So, the number of game states minimax search has to examine is exponential in the depth of the tree. Unfortunately, we can’t eliminate the exponent, but it turns out we can effectively cut it in half. The trick is that it is possible to compute the correct minimax decision without looking at every node in the game tree. The particular technique we examine is called alpha–beta pruning. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.
Different from the minimax algorithm nodes, nodes in alpha-beta pruning algorithms will be assigned two values --- value and value. α value is the lower bound utility value that can be found so far and value is the upper bound utility value that can be found so far.
For MAX, if one subtree contains value that is lower than MAX’s value, then this subtree will have the utility no greater than value. In this case, alpha-beta pruning algorithm will prune this branch. For MIN, the algorithm will prune the branch that has the utility value > value.
When first explore the node, initialize its value to and its value to . Starting from node , use DFS to search down the tree. The first successor under node is 3. So, now node , as a MIN, has the value at most 3 ( = 3).
Continue the DFS, the second successor under node has the value 12. As a MIN, node won’t choose this move. So, node is still at most 3.
The third successor under node has value 8. Node will still avoid this move. So, the value of is still at most 3. At this point, all the possible successors of are explored. Now node ’s value is sure to be 3 ( = 3, = 3). Node now has a value that is at least 3 ( = 3).
Continue the DFS, the first successor of node has value of 2. The value of now is at most 2 ( = 2) which is lower than . Node , as a MAX, will definitely avoid move to . So, there’s no need to keep search the successor of node . The algorithm will prune the remaining branches under node .
Now the algorithm explores node . The first successor of is 14. Set = 14. Since node ’s value now is at most 14 and is higher than node ’s current best move, the algorithm will set 14 to the upper bound of to 14 ( = 14) and continue exploring.
After keep exploring the successors of and updating the value of as a MIN. Now node ’s utility value is exactly 2 ( = 2, = 2) which is not the best choice now. So, node will choose the move to and its utility value is exactly 3 ( = 3, = 3).