
BGonline.org Forums

The True Test of Skill is Duplicate BG

Posted By: Sam Pottle
Date: Tuesday, 28 September 2010, at 7:51 a.m.

In Response To: The True Test of Skill is Duplicate BG (Perry Gartner)

The usual notion of statistical significance for a rollout goes something like this: "This rollout says that play A is better than play B, and we're 99% confident that this result would not be reversed by an infinitely long rollout. Oh, and the rollout says the size of the error is .023." The statsig analysis is applied to the question of whether we know the best play, not the size of the error when the wrong play is made.

For a quiz, this is not enough. You also want to know which errors on the quiz are bigger than which other errors, so that you can say with confidence that player X actually outperformed player Y.

The quiz at the Chicago Open this year is a case in point. UBK and I tied for first in the quiz event. Stick was third, by a margin of something like .003. Stick and I agreed on eight out of ten problems, and there was one he got wrong that I got right, and one I got wrong that he got right. I forget the size of the errors involved, but let's suppose that my error was .044 and his was .047.

The normal sort of analysis would say that you'd like to see a jsd (joint standard deviation) for each of these problems of something like .015 to .020 in order to be confident that you have the right plays. But if you want to be confident that I actually did better than Stick on the quiz, you need rollouts with jsd's around .001. Since the jsd shrinks only as the square root of the number of trials, those rollouts will take hundreds of times longer to perform. You need to know that the error on the problem I got wrong is almost exactly .044, whereas a normal rollout will tell you with confidence only that it is > .000.
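The "hundreds of times longer" figure can be checked with a quick sketch. It assumes the jsd behaves like an ordinary standard error, falling as one over the square root of the number of trials; the function name `trials_multiplier` is made up for illustration and is not part of any rollout program.

```python
def trials_multiplier(current_jsd, target_jsd):
    """Factor by which the trial count must grow to shrink the jsd.

    Assumption: jsd is proportional to 1 / sqrt(trials), so the number
    of trials scales with the square of the jsd ratio.
    """
    return (current_jsd / target_jsd) ** 2


# Going from a typical jsd of .015 or .020 down to .001
low = trials_multiplier(0.015, 0.001)
high = trials_multiplier(0.020, 0.001)
print(round(low), round(high))
```

Under that assumption the multiplier comes out between roughly 225 and 400, depending on which starting jsd you take.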

It's actually a bit worse than that, because you may have players differing on more than two problems, and the more rollout results you have to add together, the wider the error bars get.
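Here is a rough sketch of how the error bars widen as results are added. It assumes the rollouts are independent and their errors roughly normal, so that standard deviations combine in quadrature; the helper names are hypothetical, and the .003 margin is the approximate one from the Chicago example above.

```python
import math


def combined_sd(sds):
    """Standard deviation of a sum or difference of independent results.

    Assumption: the rollouts are independent, so variances simply add.
    """
    return math.sqrt(sum(s * s for s in sds))


def z_score(margin, sds):
    """How many combined standard deviations the margin amounts to."""
    return margin / combined_sd(sds)


# Two problems in dispute, each rolled out to a jsd of .001
two = combined_sd([0.001, 0.001])
# Four problems in dispute at the same jsd
four = combined_sd([0.001] * 4)

print(f"{two:.5f} {four:.5f}")
print(f"{z_score(0.003, [0.001] * 2):.2f} {z_score(0.003, [0.001] * 4):.2f}")
```

With two disputed problems the .003 margin is a little over two combined standard deviations; with four it drops to about one and a half, so the same rollouts support a weaker conclusion.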


BGonline.org Forums is maintained by Stick with WebBBS 5.12.