this should be easy..  ROLLOUT
Posted By: Timothy Chow
Date: Sunday, 29 June 2014, at 12:32 a.m.
In Response To: this should be easy..  ROLLOUT (Tom Keith)
Tom Keith wrote:
I'm sure bot makers have tried to compute these numbers.
Just to be clear, "these numbers" refers to playing the current game with the cube frozen at its current value, but with a cubeful MET governing the values of winning/losing a single/double/triple game. Right?
Traditionally, at least, bot neural nets are trained using cubeless money play. When you feed a position into a NN you get back six numbers: the probability of winning a single game, gammon, and backgammon, and losing a single game, gammon, and backgammon. In a match situation the bot has to compute a "value vector" to apply to the cubeless values. Take the inner product of the value vector with the cubeless numbers produced by the NN and you have an estimate of the match equity.
I think I follow all this. So where do "these numbers" come into play? I see cubeless moneyplay numbers coming in (Definition 1). I see the cubeless moneyplay numbers being transformed into cubeful match equity values. But at what point is the computer calculating estimates of the equity that one would have if the game were played out cubelessly but with a cubeful MET controlling the values of the game?
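For concreteness, here is a minimal sketch of the inner-product step as I understand it from the description above. It is not taken from any actual bot. The function name `match_equity` is made up, the six outcomes are assumed to be mutually exclusive (so the probabilities sum to 1), and every number below is invented purely for illustration, not read off a real MET or a real rollout.

```python
# Outcome order assumed here (hypothetical convention):
# win single, win gammon, win backgammon,
# lose single, lose gammon, lose backgammon.

def match_equity(outcome_probs, value_vector):
    """Inner product of six cubeless outcome probabilities with a value vector.

    Each entry of value_vector is taken to be the match-winning chance
    after the corresponding outcome at the current cube level.
    """
    if len(outcome_probs) != 6 or len(value_vector) != 6:
        raise ValueError("expected six probabilities and six values")
    if abs(sum(outcome_probs) - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * v for p, v in zip(outcome_probs, value_vector))


# Invented cubeless numbers, standing in for what the NN would return.
probs = [0.40, 0.12, 0.01, 0.35, 0.11, 0.01]

# Invented value vector, standing in for entries looked up in a MET.
values = [0.60, 0.75, 0.85, 0.40, 0.25, 0.15]

equity = match_equity(probs, values)
print(round(equity, 4))  # 0.5075
```

With these made-up inputs the estimate is 0.5075. Note that the value vector is the only place where the match score and cube level enter; the probabilities themselves are still the plain cubeless money-play ones.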
Knowing these numbers is valuable for human players too.
I can see that knowing, for example, how often you actually win and how often you actually win a gammon is useful for a human player. I can also see that knowing the moneyplay cubeless wins and gammons is useful, because it's nearly impossible to memorize a whole table of win/gammon figures for a reference position at all possible match scores, so learning the moneyplay cubeless values for a reference position is a useful surrogate. But how is it useful to know how often you would win if you were to play the game out cubelessly with a cubeful MET governing the single/double/triple game values? I don't see it.

