BGonline.org Forums
Cubeful equities: how to derive from cubeless?
Posted By: NJ In Response To: Cubeful equities: how to derive from cubeless? (Matt Ryder)
Date: Friday, 8 January 2010, at 4:09 p.m.
What is the root cause of this "odd-even effect"? I've read about it here, but I've never really understood what might be the bot's malaise...
People have discussed this before, but I believe there are two factors:
1) The neural net is set up so that all positions are input from the side on roll. This makes the evaluations for each side inherently different. Let's take the example of a position where side A is winning and has side B closed out. On A's turn, the neural net will look at the board from A's position and generate some evaluation such as 80% chance to win. On B's turn, the neural net will look at the board from B's position and generate a completely independent evaluation. One would hope that it would compute 20% chance to win, but it's not a sure thing.
Suppose there were a slight bias in the neural net whereby it always overestimated the chance to win for the side on roll. Then in the example above, it might have computed 82% and 22%. Therefore at 0-ply it would show 82% for A, but at 1-ply (where the net is evaluating from B's side and the result is flipped back, 100% - 22%) it would show 78%.
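To make that arithmetic concrete, here is a toy sketch in Python. The `biased_net` function and the 2% bias are made up purely for illustration, and the 1-ply step assumes the roll in between does not change the true winning chances (a real 1-ply evaluation averages over all the rolls):

```python
def biased_net(true_win_prob, bias=0.02):
    """Hypothetical evaluator: the true win chance of the side on roll,
    plus a fixed bias in favour of the side on roll."""
    return true_win_prob + bias


true_a = 0.80  # A's real winning chance in the closed-out example

# 0-ply: the net looks at the board from A's side (A is on roll).
zero_ply = biased_net(true_a)

# 1-ply: the net looks at the board from B's side (B is on roll),
# and the result is flipped back to A's point of view.
one_ply = 1.0 - biased_net(1.0 - true_a)

print(f"0-ply: {zero_ply:.2f}  1-ply: {one_ply:.2f}")
```

The same bias pushes A's number up at even plies and down at odd plies, which is the swing described above.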
2) The way the bot is trained can exacerbate the even-odd effect. One common way of training a bot is to use its own 2-ply evaluations to slowly train the 0-ply evaluations to become more like the 2-ply evaluations. However, if the 2-ply evaluations suffer from an "even effect", then this type of training will either retain the effect or make it worse. I think that training using odd-ply evaluations would work better.
Maybe this is why my spreadsheet differs from GNU's outcomes? Care to post your formulae? Or are they proprietary?
I just looked at the GNU source code to see what they did. Interestingly, they concluded the same thing that I did, because they do pretty much the same thing. I'll describe how we both do it, compared to Janowski's paper:
In Janowski's paper, after talking about live equity and dead equity, he ultimately derives straight formulas for Ec, Eo, and Eu. For example:
Eo = Cv * (E + 0.5 * x * p)
Where Eo is the cube owned equity, Cv is the cube value, E is the cubeless equity, x is the cube efficiency, and p is the probability of winning. If you look carefully at this, you will see that Eo varies linearly with p (it's a straight line), noting that E is also a linear function of p (it's the line from (0,L) to (1,W)).
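As a quick check of the "straight line" claim, here is a minimal Python sketch of the formula as written above. The function names and the sample values (W = 1, L = -1, x = 0.68) are my own choices for illustration, and L is treated as a signed value so that the cubeless line runs from (0,L) to (1,W):

```python
def cubeless_equity(p, W=1.0, L=-1.0):
    """Cubeless equity E: the straight line from (0, L) to (1, W)."""
    return L + p * (W - L)


def owned_cube_equity(p, x, cube_value=1.0, W=1.0, L=-1.0):
    """Eo = Cv * (E + 0.5 * x * p), as quoted above."""
    return cube_value * (cubeless_equity(p, W, L) + 0.5 * x * p)


x = 0.68  # illustrative cube efficiency, not a recommended value
lo = owned_cube_equity(0.0, x)
mid = owned_cube_equity(0.5, x)
hi = owned_cube_equity(1.0, x)

# On a straight line, the value at the midpoint is the average of the ends.
print(lo, mid, hi, abs(mid - (lo + hi) / 2) < 1e-12)
```

Since both E and the 0.5 * x * p term are linear in p, the midpoint test holds for any x, W and L.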
But Eo is really not a straight line. Janowski had the right idea when he said that E should be a combination of Elive and Edead. The problem is that Elive is not a straight line; it's a combination of multiple line segments (as mentioned in a previous post). So the way that both my program and GNU work is that given p, we first compute:
Elive = interpolation of p on a 3 line segment graph
Edead = interpolation of p from (0,L) to (1,W)
Then computing E (the cubeful equity) is straightforward:
E = Elive * x + Edead * (1-x)
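A rough Python sketch of those three steps follows. This is not my actual code or GNU's. The breakpoints of the live graph (a take point of 0.2 and a cash point of 0.8, with the equity hitting -1 and +1 there) are placeholder assumptions for a centered cube at cube value 1; in practice they come from the take/cash point calculation. L is again a signed value:

```python
def interpolate(p, points):
    """Linear interpolation of p over sorted (p, equity) breakpoints."""
    for (p0, e0), (p1, e1) in zip(points, points[1:]):
        if p0 <= p <= p1:
            return e0 + (e1 - e0) * (p - p0) / (p1 - p0)
    raise ValueError("p must lie between the first and last breakpoint")


def cubeful_equity(p, x, W=1.0, L=-1.0, take_point=0.2, cash_point=0.8):
    """E = Elive * x + Edead * (1 - x), with a 3-segment Elive graph."""
    # Elive: 3 line segments through the (assumed) take and cash points.
    live_points = [(0.0, L), (take_point, -1.0), (cash_point, 1.0), (1.0, W)]
    e_live = interpolate(p, live_points)
    # Edead: the single straight line from (0, L) to (1, W).
    e_dead = interpolate(p, [(0.0, L), (1.0, W)])
    return e_live * x + e_dead * (1.0 - x)


for p in (0.1, 0.5, 0.8, 0.9):
    print(p, round(cubeful_equity(p, x=0.68), 4))
```

Because Edead is a single line and Elive has three segments with the same breakpoints for any x, the blend keeps exactly those breakpoints.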
The result is that the graph of the final cubeful E is also 3-segmented. My program differs slightly from GNU in that I factor doubling decisions directly into the cubeful equity, resulting in a 7-segmented graph. GNU does something roughly equivalent, which is to calculate nodouble, double/take, and double/pass as 3 different equities and then pick the correct one. I'm not exactly sure, however, that these two methods are the same. In particular, I think that if you had a position where you were on roll but the opponent was expected to double on the next turn, GNU's cubeful equity for that position would differ from my program's cubeful equity. That's because on each turn I consider the doubling decision for both players, with the idea that, assuming your roll doesn't change things, I can take into account the future doubling action of the opponent. It's sort of like looking ahead 1 ply at the next doubling decision.
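For the "pick the correct one" step, the usual rule is that the opponent answers a double with whichever response is worse for the doubler, and the doubler then compares that with not doubling. Here is a small Python sketch of that rule. It is my own illustration, not GNU's code; the function name, the labels, and the normalization of double/pass to +1.0 are assumptions:

```python
def best_cube_action(no_double, double_take, double_pass=1.0):
    """Pick among three equities, all from the doubler's point of view.

    Returns an (action, equity) tuple.
    """
    # The opponent chooses the response that is worse for the doubler.
    if double_take <= double_pass:
        response, after_double = "double, take", double_take
    else:
        response, after_double = "double, pass", double_pass

    # The doubler only turns the cube if that beats holding it.
    if after_double > no_double:
        return response, after_double
    return "no double", no_double


print(best_cube_action(0.5, 0.7))   # doubling gains, opponent takes
print(best_cube_action(0.9, 1.2))   # opponent should pass
print(best_cube_action(1.3, 1.6))   # too good to double
```

Note that this only looks at the decision of the player on roll, which is where it could differ from folding both players' doubling decisions into the equity graph.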
The other main place where my program might differ from GNU is the computation of the take/cash points. I remember when I wrote that code that there was a choice on how to do a particular interpolation and I chose an unusual way to do it (which I thought would be more accurate). I doubt that anyone else chose to do it the same way, which means that my take points may differ slightly from GNU's, even given the same position evaluation and MET.
Have you contacted Janowski? He is apparently active on the GNU mailing list.
No, I haven't. I kind of figured that the paper was so old (1994) that he was already aware of everyone's tweaking of his formulas since then. Perhaps if he is involved with GNU, then maybe he helped to develop their algorithm?