BGonline.org Forums
Maybe there's a better way to measure takes?
Posted By: Fabrice Liardet In Response To: Maybe there's a better way to measure takes? (Timothy Chow)
Date: Sunday, 18 October 2009, at 3:58 p.m.
It's true that the cube decisions during the rollout will be subject to the defects under discussion, but these will contaminate both the "no double" and the "double/take" rollout paths, so I don't see why one would be more reliable than the other.
I was not talking about the taker's recube vig, but about the potential doubler's cube vig. In the "double/take" path it is zero, since the doubler doesn't own the cube any more. So only the "no double" path is contaminated.
But perhaps you just meant that for evals you trust the take/pass decision more than you trust the double/no double decisions.
Yes, that is even truer for the evals. When the cube efficiencies are far from average, the rollout might be wrong, whereas the eval is certain to be wrong.
I think that there might be some value in reporting recube frequency and efficiency as a rollout stat. Stick might call it superfluous, but I think it might improve our understanding of certain subtle positions. In certain positions, our intuition tells us (for example) that a recube is very likely to occur very quickly, or (for another example) that the recube efficiency will be very high or very low. If the bot reports its opinion on this question and it disagrees with us, then that would be very interesting: If the bot is right then we have learned something about the position, and if the bot is wrong then we have learned something about the kinds of positions where the bot can't be trusted.
I also think that it is a good idea, except that IMHO the relevant number is the average cube efficiency on a scale of 0 to 1. As you say, it would be very interesting if the bots displayed it, whether they are right or not.
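To make "average cube efficiency on a scale of 0 to 1" a bit more concrete, here is a rough sketch. It assumes a Janowski-style model in which the cubeful equity is a linear interpolation between the dead-cube equity (efficiency 0) and the live-cube equity (efficiency 1), and it simply solves for the efficiency that a given cubeful equity implies. The function names and the input triples are hypothetical; this is not how any bot actually computes or reports the number.

```python
def implied_cube_efficiency(cubeful, dead, live):
    """Solve cubeful = (1 - x) * dead + x * live for x, clamped to [0, 1].

    Assumes the linear interpolation model described above; all three
    arguments are equities for the same position and cube owner.
    """
    span = live - dead
    if abs(span) < 1e-12:
        raise ValueError("dead and live equities coincide; x is undefined")
    x = (cubeful - dead) / span
    return min(1.0, max(0.0, x))


def average_cube_efficiency(samples):
    """Mean implied efficiency over (cubeful, dead, live) triples."""
    if not samples:
        raise ValueError("need at least one sample")
    values = [implied_cube_efficiency(c, d, l) for c, d, l in samples]
    return sum(values) / len(values)
```

A rollout could in principle collect one such triple per visited position where the potential doubler still has cube access and report the mean, which could then be compared with the fixed average the eval relies on.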
BGonline.org Forums is maintained by Stick with WebBBS 5.12.