Björn Löfroth and Johan Dalenius

Abstract

Self-learning of the Connect 4 game

We have implemented a self-learning computer player for the board game Connect 4. The idea is to provide the player only with information about the board state and the allowed moves, forcing it to learn all relevant game concepts on its own. Learning is done with the reinforcement learning method TD(λ), using an artificial neural network as a function approximator for the value function. We used a two-layer network trained with the backpropagation algorithm.
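As an illustration of the TD(λ) update mentioned above, the following is a minimal sketch and not the implementation used in this work. It assumes a linear value function in place of the two-layer network so that the gradient is simply the feature vector; the function name `td_lambda_episode`, the feature encoding, and the parameters `alpha`, `lam` and `gamma` are hypothetical choices made for the example.

```python
def td_lambda_episode(weights, features, final_reward,
                      alpha=0.1, lam=0.7, gamma=1.0):
    """Apply TD(lambda) updates for one finished game.

    weights      -- list of floats, updated in place (linear value function,
                    an assumption standing in for the neural network)
    features     -- list of feature vectors, one per visited board state
    final_reward -- outcome of the game, used as the target at the last state
    """
    traces = [0.0] * len(weights)  # eligibility trace per weight

    def value(x):
        return sum(w * xi for w, xi in zip(weights, x))

    for t, x in enumerate(features):
        v = value(x)
        if t + 1 < len(features):
            target = gamma * value(features[t + 1])  # bootstrap from next state
        else:
            target = final_reward                    # terminal outcome
        delta = target - v                           # TD error

        # Decay old traces and add the gradient of V(x) w.r.t. the weights,
        # which for a linear V is just x.
        traces = [gamma * lam * e + xi for e, xi in zip(traces, x)]

        for i in range(len(weights)):
            weights[i] += alpha * delta * traces[i]
    return weights
```

With `lam=0` only the most recent state is credited for a TD error, while `lam=1` spreads the final outcome back over all earlier states of the game. With a neural network, the feature vector in the trace update would be replaced by the gradient obtained through backpropagation.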

During training, the computer player was evaluated against a random player and against four players that used the minimax algorithm with a simple board evaluation function to search the game tree at different depths. The player was also tested on specific board states to determine whether it is learning important game concepts such as attack and defense, as well as more specific ones such as rows, columns and diagonals.
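The depth-limited minimax search used by the opponents can be sketched roughly as follows. This is a generic illustration under stated assumptions, not the opponents' actual code: the callables `legal_moves`, `apply_move` and `evaluate` are hypothetical placeholders for the game rules and the simple board evaluation function, and the toy game tree in the usage example is invented for demonstration.

```python
def minimax(state, depth, maximizing, legal_moves, apply_move, evaluate):
    """Return the minimax value of `state`, searching `depth` plies ahead.

    legal_moves(state) -> list of moves (empty when the game is over)
    apply_move(state, move) -> successor state
    evaluate(state) -> heuristic score from the maximizing player's view
    """
    moves = legal_moves(state)
    if depth == 0 or not moves:
        # Search horizon or end of game: fall back on the evaluation function.
        return evaluate(state)
    values = [
        minimax(apply_move(state, m), depth - 1, not maximizing,
                legal_moves, apply_move, evaluate)
        for m in moves
    ]
    return max(values) if maximizing else min(values)


# Toy usage: a game tree given as nested lists, with numbers as leaves.
tree = [[3, 5], [2, 9]]
moves_of = lambda s: list(range(len(s))) if isinstance(s, list) else []
child = lambda s, m: s[m]
score = lambda s: s if not isinstance(s, list) else 0  # 0 for unfinished positions

best = minimax(tree, 2, True, moves_of, child, score)
```

A larger `depth` lets the player see further ahead at the cost of a larger search, which is why players at different depths give opponents of graded strength.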

We were able to train a computer player that beats a random player in nearly all games. The computer player also shows strong development against the best minimax player (the one with depth 4). At the beginning of training our player loses nearly every game, but towards the end it wins almost 60% of the games.