BGonline.org Forums

Some stats

Posted By: Achim
Date: Thursday, 5 July 2007, at 9:22 p.m.

In Response To: Some stats (Frank Berger)

Don't we first have to state a hypothesis, e.g. X beats Y in, say, 51% of the matches, and then check whether that hypothesis can be falsified?

I'm getting a little bit cautious now; my first numbers were totally wrong (see also this posting):

If you compare all results of two bots against the three other bots, you get two normal distributions and two 95% confidence intervals. In plain words, one bot is only confidently better than the other (in the 4-bot shootout) when the difference in their results exceeds 2 (joint?!) standard deviations.

If you take the joint standard deviation Sj = sqrt[Sg*Sg + Sb*Sb] = sqrt(2)*27.3 = 38.6 (because the two standard errors are nearly the same, about 27.3 each) and the overall results of gnubg (1623) and bgblitz (1519), you get 104/38.6 = 2.69. This gives ~99.6% (one-sided) confidence that gnubg is better than bgblitz in the 4-bot shootout.
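For anyone who wants to redo this arithmetic, here is a minimal sketch in Python. It evaluates sqrt[Sg*Sg + Sb*Sb] literally with Sg = Sb = 27.3, which comes to sqrt(2)*27.3, about 38.6. The function names are made up for illustration, and the calculation assumes the two totals are independent, which is only approximately true since the two bots also played each other.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def joint_std(s_a, s_b):
    """Standard deviation of the difference of two independent results."""
    return math.sqrt(s_a * s_a + s_b * s_b)

def z_score(total_a, total_b, s_a, s_b):
    """Difference of the totals, measured in joint standard deviations."""
    return (total_a - total_b) / joint_std(s_a, s_b)

# Numbers from the posting: overall results and the (nearly equal) standard errors.
sj = joint_std(27.3, 27.3)
z = z_score(1623, 1519, 27.3, 27.3)
print(f"Sj = {sj:.1f}, z = {z:.2f}, one-sided confidence = {normal_cdf(z):.4f}")
```

The one-sided confidence is simply the normal CDF at z; a two-sided test would be stricter.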

So this conclusion holds only when you compare all the results of gnubg and of bgb against the other three bots. It doesn't say anything about whether gnubg or bgb is the better bot in direct competition. BGB somehow suffers from its "bad" result against jellyfish. If you take out the jellyfish results, you get no confidence at all, as I wrote in the above-mentioned posting.

You get no 95% confidence if you compare only a single set of 1000 matches between e.g. gnubg and bgblitz (519-481, std.err=15.79). Here the 95% confidence interval for gnubg is 519 +/- 2*15.79, i.e. [487;551]. Because the lower bound is below 500, there is no confidence that gnubg is better than bgb (these numbers are taken from a posting on the gnubg list written by Joseph Heled). And vice versa if you take bgb's result.
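The head-to-head interval can be checked the same way. This is only a sketch: it assumes the standard error is the usual binomial one, sqrt(n*p*(1-p)), which gives about 15.8 for 519 wins in 1000 matches, and it uses 2 standard errors as a rough 95% bound as in the text. The function names are hypothetical.

```python
import math

def binomial_std_err(wins, n):
    """Standard error of the win count, assuming independent matches."""
    p = wins / n
    return math.sqrt(n * p * (1.0 - p))

def confidence_interval(wins, n, k=2.0):
    """Interval wins +/- k standard errors (k=2 is roughly 95%)."""
    se = binomial_std_err(wins, n)
    return wins - k * se, wins + k * se

# gnubg vs. bgblitz, 519-481 over 1000 matches.
low, high = confidence_interval(519, 1000)
print(f"std.err = {binomial_std_err(519, 1000):.2f}, interval = [{low:.0f}; {high:.0f}]")

# The interval contains the break-even score of 500, so no 95% confidence.
print("contains 500:", low < 500 < high)
```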

My statistics lessons were 20 years ago, and I have to admit that I learned a lot today while confusing and disturbing the other readers here and at GammonU with my wrong numbers. I also admit that I wouldn't bet a month's salary that the conclusions above are correct ;-).

Ciao

Achim

BGonline.org Forums is maintained by Stick with WebBBS 5.12.