As described in a previous article, Alpha-Beta pruning can be used to speed up minimax heuristic searching by pruning branches that will never be reached.

A quick glance from Wikipedia:

It is an adversarial search algorithm used commonly for machine playing of two-player games (Tic-tac-toe, Chess, Go, etc.). It stops completely evaluating a move when at least one possibility has been found that proves the move to be worse than a previously examined move.

Alpha-Beta pruning

The alpha-beta pruning algorithm is based on the fact that one side wants to maximize the evaluated score, while the other wants to minimize it. Both sides are later referred to as Max and Min.

The algorithm maintains two values, alpha and beta, which represent the minimum score Max is assured of, and the maximum score Min is assured of, respectively. The values start at the “worst state”, i.e., \(\alpha=-\infty\) and \(\beta=+\infty\). Whenever it goes to a point where \(\alpha\gt\beta\), we know that this branch need not be examined as it will never be reached, given that both sides have found out that it’s no better (or worse) than a known, assured branch.

This image from Wikipedia demonstrates it well:

Image sourced from Wikipedia

Let’s call the above tree “Levels 0 to 4”, where Level 0 contains only one node, the root node, and it’s where the searching starts. Levels 1 to 4 contains 3, 5, 7, 9 white nodes each. Greyed-out nodes are pruned by the alpha-beta algorithm. The numbers on the nodes are the best (maximum or minimum) score assured at this branch.

The tree is searched from left to right. It first completes the leftmost two nodes at level 4 and backtracks to the first node at level 2. At this time at the level 2 node, both sides know that this branch can assure Max of a minimum score of 5. The searching then goes to the second node at level 3, and it finds that this node can assure Min of a score of 4, which is lower than the minimum that Max has been assured of. This means that Max will not choose this branch if the game has reached the first node at level 2, and we can safely cut off the searching here (prune this branch).

Then we goes to the second node at level 2, and for Min it’s a better branch, so Min will choose this path. No pruning is happening here.

After completing the first two branches at level 1, the search goes to the last branch at level 1. It finds that this branch can assure Min of 5, which is, to Max, worse than what it has been assured of (6 from the 2nd branch), so the rest of this branch is cut off. The final result is that Max will pick the 2nd branch at level 1.

Heuristic searching with alpha-beta pruning

Here’s the pseudo-code in the previous post that demonstrates a plain heuristic searching algorithm.

function heuristicSearch(game, side, depth)
  if depth <= 0
    return null, heuristicScore(game)

  want_max = (side == BLACK)
  bestMove = null
  if want_max
    bestScore = -MAX - 1
  else
    bestScore = +MAX + 1

  for move in game.availableMoves
    game.perform move
    // Recurse
    subMove, subScore = heuristicSearch(game, side, depth - 1)
    game.undo

    if want_max
      if subScore > bestScore
        bestScore = subScore
        bestMove = move
    else
      if subScore < bestScore
        bestScore = subScore
        bestMove = move
  return bestMove, bestScore

To add alpha-beta pruning, as described above, we need to maintain two values, α and β, and cut off searching when α>β.

function heuristicSearch(game, side, depth, alpha, beta)
  if depth <= 0
    return null, heuristicScore(game)

  want_max = (side == BLACK)
  bestMove = null
  if want_max
    bestScore = -MAX - 1
  else
    bestScore = +MAX + 1

  for move in game.availableMoves
    game.perform move
    // Recurse
    subMove, subScore = heuristicSearch(game, side, depth - 1, alpha, beta)
    game.undo

    if want_max
      if subScore > bestScore
        bestScore = subScore
        bestMove = move
      alpha = max(alpha, bestScore)
      if alpha >= beta
        break  // alpha cut
    else
      if subScore < bestScore
        bestScore = subScore
        bestMove = move
      beta = min(beta, bestScore)
      if alpha >= beta
        break  // beta cut
  return bestMove, bestScore

Since α and β are passed down to deeper recursive searches, they need initial values. The modified function should be called like

heuristicSearch(game, side, depth, -INF, INF)

Optimization

As seen in the image from Wikipedia:

Image from Wikipedia

We can see that if the 2nd node at level 2 is examined earlier, the 1st node would have been pruned. We want to find a way to make pruning occur more often, further improving the performance of the algorithm.

In most typical games, a partially good move often leads to better subsequences, and vice versa. This fact implies that we can sort the current steps we’re going deep into, and search better moves first. Worse moves will likely be pruned more often. This leads to a small change in the code:

function heuristicSearch(game, side, depth, alpha, beta)
  if depth <= 0
    return null, heuristicScore(game)

  want_max = (side == BLACK)
  bestMove = null
  if want_max
    bestScore = -MAX - 1
  else
    bestScore = +MAX + 1

  moves = game.availableMoves
  // ↓↓↓
  sort moves by heuristicScore(after move)
  // ↑↑↑

  for move in sorted_moves
    game.perform move
    // Recurse
    subMove, subScore = heuristicSearch(game, side, depth - 1, alpha, beta)
    game.undo

    if want_max
      if subScore > bestScore
        bestScore = subScore
        bestMove = move
      alpha = max(alpha, bestScore)
      if alpha >= beta
        break  // alpha cut
    else
      if subScore < bestScore
        bestScore = subScore
        bestMove = move
      beta = min(beta, bestScore)
      if alpha >= beta
        break  // beta cut
  return bestMove, bestScore

Code

The whole project of the Python Reversi is open-source at GitHub. The AI algorithm is written in file ai.py.

Leave a comment