BGonline.org Forums
Need help to get correct match score and cube pos on this one...
Posted By: Henrik Bukkjaer In Response To: Need help to get correct match score and cube pos on this one... (Casper van der Tak)
Date: Thursday, 10 February 2011, at 12:54 p.m.
Yes, apart from what it says here, I can tell you a bit more about TD-Gammon. This is something I read in an internal IBM article, I think, when I was looking for material on neural networks for other (non-gaming) purposes.
Gary did a neural-net-based backgammon program before TD-Gammon, called Neurogammon. I don't know if it was ever available. It was trained using another approach: a lot of expert games and moves. And it was designed specifically for the purpose of playing backgammon (i.e. with expert backgammon knowledge being used in its design and implementation).
Then he did TD-Gammon 0.0, which is not mentioned in the original article (or at least not in the tables). That was indeed the breakthrough program, because it used 0 lookahead (it was based purely on the neural net's evaluation of the inputs) and it used NO expert backgammon knowledge in design or training. It played itself completely from scratch, knowing only the rules, until it won by coincidence, and learned from that through the TD method. The first games were crazy, almost like those bots on FIBS that play randomly or the one that tries to lose... games with more than 1000 moves in them. But after just a few games (10-20 games), the network was picking things up and starting to play better. At the end of the first training session, it played at the same strength as the highly tuned and optimized Neurogammon. That was the proof that TD learning combined with expert knowledge could easily become a stronger combination than anything seen before. From there on Gary developed TD-Gammon 1.0 with a bigger internal net, and later added versions that did ply lookahead, exhausting the combinations through a search tree.
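For anyone who wants to see the bare TD idea in code, here is a minimal sketch. It is NOT Tesauro's program: TD-Gammon used a neural net with TD(lambda) on real backgammon positions, while this is plain tabular TD(0) on a toy random walk (step left or right at random, "win" only by walking off the right end). The function name `td0_random_walk` and all parameters are just made up for illustration. The point is only that the estimates start out knowing nothing and improve purely from the outcomes of the games played.

```python
import random

def td0_random_walk(n_states=5, episodes=20000, alpha=0.02, seed=0):
    """Tabular TD(0) on a toy random walk (an illustrative assumption,
    not backgammon). Returns the learned win estimates for the
    non-terminal states, left to right."""
    rng = random.Random(seed)
    # Index 0 and n_states + 1 are terminal; their value stays 0.
    values = [0.0] * (n_states + 2)
    for i in range(1, n_states + 1):
        values[i] = 0.5  # start with no knowledge: every state looks 50/50
    for _ in range(episodes):
        s = (n_states + 1) // 2  # every game starts in the middle
        while 0 < s < n_states + 1:
            s_next = s + rng.choice((-1, 1))
            # Reward 1 only for walking off the right end (a "win").
            reward = 1.0 if s_next == n_states + 1 else 0.0
            # TD(0): nudge this state's estimate toward the reward plus
            # the estimate of the state we actually ended up in.
            values[s] += alpha * (reward + values[s_next] - values[s])
            s = s_next
    return values[1:-1]

if __name__ == "__main__":
    for i, v in enumerate(td0_random_walk(), start=1):
        print(f"state {i}: learned {v:.3f}, exact {i / 6:.3f}")
```

In this toy the exact winning chance from state i is i/6, so you can check how close the learned numbers get. In the real thing the table is replaced by the net and the random walk by self-play games.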
He also did 3.0, which had an even bigger net and used one more ply in the lookahead search tree, and as far as I know it actually scored positive vs. either Neil Kazaross or Nack Ballard (though over a very small sample size).
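To make the ply lookahead a bit more concrete, here is a rough sketch of the general idea: average over all dice rolls, let the side to move pick its best reply for each roll, and only call the static evaluation at the bottom of the tree. I don't know the details of how TD-Gammon's search was actually written, so everything here (`lookahead_value`, `evaluate`, `successors`, the toy game) is a hypothetical illustration, not his code.

```python
def lookahead_value(state, depth, maximizing, evaluate, rolls, successors):
    """Value of `state` before the dice are rolled, seen from the
    maximizing side, searching `depth` plies ahead.

    evaluate(state)                -> static score (e.g. a net's output)
    rolls                          -> list of (roll, probability) pairs
    successors(state, roll, maxim) -> positions reachable with that roll
    All three are assumptions supplied by the caller.
    """
    if depth == 0:
        return evaluate(state)
    total = 0.0
    for roll, prob in rolls:
        # No legal move is treated as a pass: the position stays the same.
        children = successors(state, roll, maximizing) or [state]
        scores = [
            lookahead_value(c, depth - 1, not maximizing,
                            evaluate, rolls, successors)
            for c in children
        ]
        total += prob * (max(scores) if maximizing else min(scores))
    return total

# Toy game, purely for illustration: the state is a number, a roll of
# r lets the mover add or subtract r, and the evaluation is the number.
TOY_ROLLS = [(1, 0.5), (2, 0.5)]

def toy_successors(state, roll, maximizing):
    return [state + roll, state - roll]

def toy_evaluate(state):
    return float(state)

if __name__ == "__main__":
    print(lookahead_value(0, 1, True, toy_evaluate, TOY_ROLLS, toy_successors))  # 1.5
    print(lookahead_value(0, 2, True, toy_evaluate, TOY_ROLLS, toy_successors))  # 0.0
```

With real backgammon there are 21 distinct rolls and many legal plays per roll, so each extra ply multiplies the work a lot, which is why one more ply is a big deal.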
Anyway, Gary Tesauro played around with an alternative idea to the search-tree-ply-lookahead approach, but I don't know if anyone ever gave it a go?