Five-in-a-Row on a Torus
Work has the bad habit of to disturb blogging, but recently I had a nice chance to combine them.
Some time ago I got a sudden inexplicable urge to implement MTD(f), the reportedly fastest two-player game tree search algorithm. Having some background in AI and games it was enjoyable to implement and tweak. The only issue the description doesn't quite mention is how to catch the best move in the root, let alone the principal variation, but after a few trials I got that working as well.
To get something concrete, I decided to apply MTD(f) to the game mentioned in the subject: five-in-a-row, but on the surface of a torus instead of a flat sheet of paper. Classic five-in-a-row favors the first player, and in fact a strategy stealing argument shows that with correct play, the first player will not lose. More advanced variations of five-in-a-row restrict the first player's moves in various ways, but many of them, including Go-moku and free Renju, have still been proven with deep computer analyses to be wins for the first player.
On a torus, it is easy to check manually that if a 2-by-2 grid or 3-by-3 grid is drawn on the torus, the first player wins in two and four moves, respectively. With this presumption I tested my program on 4-by-4 and 5-by-5 grids, and to my dismay the program reported them as draws. I fell into a bogus bughunt, until I humiliatingly realized that those variations are indeed draws! Because the surface is a torus, it is possible to block a three-in-a-row or four-in-a-row with a single stone from both sides.
Solving the 6-by-6 game wasn't going to be easy, so I was forced against my sincere will (yeah, right!) to have some serious fun: threat-space search, good move ordering heuristics, a reasonable evaluation function, and rote learning as in Samuel's Checkers program. The positions were stored in the transposition and rote learning tables in a format that canonified equivalent positions obtained by reflection, translation and transposing into identical ones.
Playing against this beast was fun athough it demonstrated human fallibility in very convincing ways. But if I wanted the game solved in a time shorter than the average interval between power outages (roughly 6 months), I had to add a few extra tricks. Firstly, I changed the move ordering for the second player so that it actively seeks draws by preferring preventively blocking moves to attacking moves. Secondly, I added an algorithm which checks, with a number of pessimistic assumptions, if the position will unavoidably end in a draw even if there are empty places left. Thirdly, I split the transposition table in two: one which replaces based on recency of access, one which replaces based on effort needed to determine the position's value. Without the latter hash table, the long-running search algorithm would replace the valuable upper level nodes before returning from a deep part of the search.
Several days worth of clock cycles later the program reported that the first player can always win, but only barely because the board (the torus, I mean) is almost full when the fifth-in-a-row is played. In a few days I should be able to show the optimal play.
Oh, and the way this fun relates to work is through an course exercise: a simplified version of this code along with a little bit of network code, HTTP parsing and HTML generation serves as the basis for that course's programming exercise.
Some time ago I got a sudden inexplicable urge to implement MTD(f), the reportedly fastest two-player game tree search algorithm. Having some background in AI and games it was enjoyable to implement and tweak. The only issue the description doesn't quite mention is how to catch the best move in the root, let alone the principal variation, but after a few trials I got that working as well.
To get something concrete, I decided to apply MTD(f) to the game mentioned in the subject: five-in-a-row, but on the surface of a torus instead of a flat sheet of paper. Classic five-in-a-row favors the first player, and in fact a strategy stealing argument shows that with correct play, the first player will not lose. More advanced variations of five-in-a-row restrict the first player's moves in various ways, but many of them, including Go-moku and free Renju, have still been proven with deep computer analyses to be wins for the first player.
On a torus, it is easy to check manually that if a 2-by-2 grid or 3-by-3 grid is drawn on the torus, the first player wins in two and four moves, respectively. With this presumption I tested my program on 4-by-4 and 5-by-5 grids, and to my dismay the program reported them as draws. I fell into a bogus bughunt, until I humiliatingly realized that those variations are indeed draws! Because the surface is a torus, it is possible to block a three-in-a-row or four-in-a-row with a single stone from both sides.
Solving the 6-by-6 game wasn't going to be easy, so I was forced against my sincere will (yeah, right!) to have some serious fun: threat-space search, good move ordering heuristics, a reasonable evaluation function, and rote learning as in Samuel's Checkers program. The positions were stored in the transposition and rote learning tables in a format that canonified equivalent positions obtained by reflection, translation and transposing into identical ones.
Playing against this beast was fun athough it demonstrated human fallibility in very convincing ways. But if I wanted the game solved in a time shorter than the average interval between power outages (roughly 6 months), I had to add a few extra tricks. Firstly, I changed the move ordering for the second player so that it actively seeks draws by preferring preventively blocking moves to attacking moves. Secondly, I added an algorithm which checks, with a number of pessimistic assumptions, if the position will unavoidably end in a draw even if there are empty places left. Thirdly, I split the transposition table in two: one which replaces based on recency of access, one which replaces based on effort needed to determine the position's value. Without the latter hash table, the long-running search algorithm would replace the valuable upper level nodes before returning from a deep part of the search.
Several days worth of clock cycles later the program reported that the first player can always win, but only barely because the board (the torus, I mean) is almost full when the fifth-in-a-row is played. In a few days I should be able to show the optimal play.
Oh, and the way this fun relates to work is through an course exercise: a simplified version of this code along with a little bit of network code, HTTP parsing and HTML generation serves as the basis for that course's programming exercise.
0 Comments:
Post a Comment
<< Home