Science Sunday: Go to the Computer!
Today in news that makes me happy, a computer named Crazy Stone is playing Go so well it can defeat the Japanese Go champion! This is probably the sexiest supercomputer since Watson. Though I don’t quite get the way the article is framed, that is, as a competition between machines and humans, since humans built machines. If we lose to them, it demonstrates how good we are building them and solving interesting problems with them. So a loss to one of our own machines still makes us really awesome. It also raises the question of if we can invent a game so difficult that we ourselves cannot build a machine to beat ourselves at it.
Algorithmically, the reason this is interesting is because this is not a repeat of IBM’s chess playing supercomputer Deep Blue. Deep Blue was running some form of minimax algorithm. Minimax is a deterministic strategy to find an optimal strategy. It is trivially demonstrable with Tic Tac Toe. At each point in game play, the computer generates all possible future game points in a tree until it has reached an end game state. At that point, the computer just marks the game as won, lost or drawn, and can then search the tree to find a path that leads to a winning (or draw) state.
The reason IBM needed a supercomputer to do this for chess is because the game tree for minimax has a factorial expansion. Since Tic Tac Toe has 9 squares and there are 9 possible first moves, the game tree size is 9!. This is sufficiently big that when I was doing this as an undergrad exercise on a cheap laptop, I had to hardcode in opening moves so that my machine only had to generate a tree of size 8!. Chess has 64 squares. While there are, I think, “only” 20 possible first moves in chess, that means the tree expansion for minimax is 20!, which is a very large number of things for a computer to be able to calculate in any reasonable amount of time, and Deep Blue was actually not even generating the full tree, it was using a heuristic to determine whether non-final game states were good, for a certain value of good.
Note that it is very difficult to translate a human’s sense of what moves are “good” into unambiguous instructions for a computer.
Getting back to Go, there are two reasons that just running minimax isn’t going to work. Most importantly, it’s not a solved game, meaning we can’t necessarily predict the outcome from any particular intermediate state. I can’t explain why without a more thorough knowledge of Go than what I’ve gleaned from watching Hikaru no Go.
So instead of some deterministic method like Minimax, Crazy Stone is investing in randomness by running a Monte Carlo algorithm. This is almost like minimax, but instead of generating a tree with all possible game states, it generates a tree with some (randomly selected) game states. However, unlike Deep Blue, Crazy Stone must keep generating intermediate game states until it reaches an end state, because there is no known accurate heuristic to determine if an intermediate game state is good or not.
Interestingly enough, such a Monte Carlo algorithm, though relying on randomness, will eventually converge to a minimax algorithm. The sticking point is that it may do so only after a huge amount of time. Particularly in Go, since a Go board has 361 possible positions on which to place a stone at the beginning of the game. That means a full minimax tree would have 361! nodes. The thought of trying to calculate 361! anything fills me with a terrible sense of futility.
Currently, Crazy Stone is only winning with a handicap, and then not by very much, but it does demonstrate that randomness is a way to avoid certain computational difficulties with very large numbers. Onward, Crazy Stone! Demonstrate the amazingness of human computational ingenuity!