BGonline.org Forums
More data to support making the 2 pt with opening 64
Posted By: Steve Mellen In Response To: More data to support making the 2 pt with opening 64 (David Rockwell)
Date: Monday, 8 February 2010, at 11:46 p.m.
I don't understand what you are saying about motivations. Setting that aside, the error in logic here lies in assuming that because errors in play can go in either direction, they basically cancel out, as if the odds of an error going in either direction were 50/50. In reality, there is a substantial likelihood that a bot's play errors will be systemic, which is to say, there are specific positional features that it will mis-evaluate again and again. For example, a bot that is suboptimal in evaluating backgames will also be suboptimal in evaluating plays that risk sending additional checkers back. Bots have evolved to the point where there may not be any of us who can point to one and declare, "this bot sucks at playing position type X," but unless someone wants to contend that the latest and greatest bot has truly solved backgammon, I'm pretty sure there exist position types that the bot evaluates suboptimally. It's not as if the bot knows the correct answer in every position and then rolls a 20-sided die, giving you an incorrect evaluation whenever a 1 comes up just to mix things up a bit. The errors are likely to be systemic rather than random.
The upshot is that for any non-perfect bot, there exists a numerical threshold below which we can't consider rollouts meaningful, due to the polluting effect of systemic evaluation errors that we know are in there somewhere. For example, even if we have a very primitive bot, a long rollout that shows play X to be .1 better than play Y is likely to be correct. But if the difference is .001, we wouldn't conclude that play X is more likely to be correct by some trivial amount; we'd conclude that we have no idea. There's just no scientific basis for attributing significance to a minuscule difference like that when there's so much white noise in the countless evaluations that take place along the way.
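A toy simulation may help illustrate the argument above. It is only a sketch under made-up assumptions, not a model of any real bot: each rolled-out game is treated as the true equity difference between two plays, plus a fixed `bias` standing in for a systemic evaluation error, plus zero-mean Gaussian luck. The function name, the parameters, and every number below are hypothetical.

```python
import math
import random


def rollout_estimate(true_diff, bias, noise_sd, n_games, seed=0):
    """Return (mean, standard error) of a simulated rollout.

    Hypothetical model: each game's observed result is
    true_diff + bias + Gaussian noise with standard deviation noise_sd.
    """
    rng = random.Random(seed)
    samples = [true_diff + bias + rng.gauss(0.0, noise_sd)
               for _ in range(n_games)]
    mean = math.fsum(samples) / n_games
    variance = math.fsum((x - mean) ** 2 for x in samples) / (n_games - 1)
    std_err = math.sqrt(variance / n_games)
    return mean, std_err


if __name__ == "__main__":
    # Assumed numbers: play X is really .001 worse than play Y,
    # but the bot systemically overrates X by .01.
    for n in (10_000, 1_000_000):
        mean, se = rollout_estimate(true_diff=-0.001, bias=0.01,
                                    noise_sd=1.0, n_games=n, seed=1)
        print(f"{n:>9} games: estimate {mean:+.4f} +/- {se:.4f}")
```

In this model a longer rollout shrinks the reported standard error, but the estimate converges on `true_diff + bias` rather than on `true_diff`. When the bias is larger than the true difference, the rollout can settle confidently on the wrong play, which is why a tiny margin can't be read as evidence no matter how many games are rolled out.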
For strong bots like we have today, obviously the threshold where we can start drawing conclusions is going to be lower, but I don't have the sort of intuition that permits me to say ".004 means something, but .001 is just noise." To me .004 looks pretty tiny too, in comparison to the whopping evaluation errors I see the bots making all the time.