MoGo was developed to tackle these two problems specifically and efficiently:
* position evaluation by Monte-Carlo: from a given Go board position, a fast random player plays the game out to the end. The score can then be computed quickly and exactly, directly from the rules of Go. We call this a simulation. By repeating the simulation a very large number of times and averaging the results, we obtain an evaluation of the position. This idea appeared in computer Go in 1993. MoGo uses expert knowledge in a novel way to improve the random player, which significantly increases the precision of the evaluation.
* exploration-exploitation in the search tree using the UCT algorithm: the alpha-beta algorithm, widely used and very efficient, in particular in chess, turns out to be inefficient for Go. MoGo was the first program to introduce the UCT algorithm in computer Go. Its advantages are:
o asymmetric growth of the tree: the most interesting moves are explored more deeply.
o effective handling of imprecision: the estimated value of the position at each node ranges between the average and the min-max value, depending on the confidence in the estimates. Thus, if one move becomes better than the others after a sufficient number of simulations, the value returned by UCT will be close to the maximum over the possible moves. If the estimated values of several moves are not significantly different, UCT will return a value close to their mean.
o anytime: the algorithm can be stopped at any moment and still give a good result.
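As a rough illustration of the Monte-Carlo evaluation described in the first item above, here is a minimal Python sketch. It is not MoGo's code. To stay short, it replaces Go with a hypothetical toy game (players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins), and the names `random_playout` and `monte_carlo_value` are our own.

```python
import random


def random_playout(stones, rng):
    """Play uniformly random moves until the pile is empty.

    Toy game (an assumption, standing in for Go): each player takes 1 to 3
    stones in turn, and whoever takes the last stone wins.
    Returns 1 if the player to move in the starting position wins, else 0.
    """
    first_player_to_move = True
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return 1 if first_player_to_move else 0
        first_player_to_move = not first_player_to_move


def monte_carlo_value(stones, n_simulations=10000, seed=0):
    """Estimate the value of a position as the average result of random playouts."""
    rng = random.Random(seed)
    wins = sum(random_playout(stones, rng) for _ in range(n_simulations))
    return wins / n_simulations
```

Calling `monte_carlo_value(10)` returns a win rate between 0 and 1 for the player to move. The estimate is only as good as the random player, which is why improving that player improves the evaluation.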
It is worth mentioning that most of the top-level 9x9 Go programs use UCT.
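The UCT search described above can be sketched in a few lines. The Python below is a generic textbook-style UCT using the UCB1 rule, not MoGo's implementation. It runs on a hypothetical toy game (take 1 to 3 stones in turn, whoever takes the last stone wins) instead of Go, and the exploration constant `c = 1.4` and all function names are assumptions made for illustration.

```python
import math
import random


class Node:
    """One position of the toy game: `stones` left, one player to move."""

    def __init__(self, stones):
        self.stones = stones
        self.children = {}   # move (stones taken) -> Node
        self.visits = 0
        self.wins = 0.0      # wins counted for the player to move at this node


def random_playout(stones, rng):
    """Return 1 if the player to move takes the last stone under random play."""
    mover_is_me = True
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return 1 if mover_is_me else 0
        mover_is_me = not mover_is_me


def ucb1(parent, child, c):
    """Upper confidence bound of `child`, seen from the player to move at `parent`."""
    if child.visits == 0:
        return math.inf                      # try every move at least once
    mean = 1.0 - child.wins / child.visits   # child statistics belong to the opponent
    return mean + c * math.sqrt(math.log(parent.visits) / child.visits)


def simulate(node, rng, c):
    """Run one simulation through `node`; return 1 if its player to move wins."""
    if node.stones == 0:
        result = 0                           # the opponent just took the last stone
    elif node.visits == 0:
        result = random_playout(node.stones, rng)   # new leaf: Monte-Carlo evaluation
    else:
        if not node.children:                # expand the node on its second visit
            for take in range(1, min(3, node.stones) + 1):
                node.children[take] = Node(node.stones - take)
        child = max(node.children.values(), key=lambda ch: ucb1(node, ch, c))
        result = 1 - simulate(child, rng, c)
    node.visits += 1
    node.wins += result
    return result


def uct_search(stones, n_simulations=5000, c=1.4, seed=0):
    """Return (most visited move, estimated root value) after n_simulations."""
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(n_simulations):
        simulate(root, rng, c)
    best = max(root.children, key=lambda m: root.children[m].visits)
    return best, root.wins / root.visits
```

Here `uct_search(5)` returns the most visited move together with the root value. That value is the visit-weighted average of the values of the moves, so it drifts toward the maximum as simulations concentrate on one clearly better move and stays near the mean while the moves remain indistinguishable. The loop in `uct_search` can be cut short at any iteration, which is what makes the method anytime.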
Research activities: Optimization
Team: Learning and Optimization
Contact: [none]