BGonline.org Forums
How to design a statistically significant bot head-to-head?
Posted By: eXtreme Gammon In Response To: How to design a statistically significant bot head-to-head? (Ian Shaw)
Date: Sunday, 10 January 2010, at 7:58 p.m.
You will need a lot of games:
Let’s assume each player wins 24% of their games as gammons, and let’s ignore backgammons.
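Let’s assume Bot A wins 50.50% of the games, so it gains about 0.012 ppg (a 1% net win margin at 1.24 points per game is 0.0124 ppg). Starting from Elo = ppg / 0.1 × 50 for cubeful games, let’s say Elo = ppg / 0.1 × 100, which gives a 12.4 Elo difference.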
You have a standard deviation of around 1.31 points per game, and after 15,000 games your 95% confidence interval is around ±0.021 ppg. That is wider than the 0.012 ppg edge, so your result is not yet significant.
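For anyone who wants to check the arithmetic, here is a minimal sketch of where the 1.31 and the 0.021 come from. It assumes a single win scores 1 point and a gammon scores 2, with no cube and no backgammons, and that both bots have the same 24% gammon rate. The function names are made up for this illustration.

```python
import math

def per_game_stats(win_prob, gammon_rate):
    """Mean and SD of Bot A's result in points for one game.

    Assumptions (illustrative): single win = 1 point, gammon = 2 points,
    no backgammons, no cube, same gammon rate for both bots.
    """
    avg_points = (1 - gammon_rate) * 1 + gammon_rate * 2           # 1.24 at 24% gammons
    mean = (2 * win_prob - 1) * avg_points                         # A's edge in ppg
    second_moment = (1 - gammon_rate) * 1**2 + gammon_rate * 2**2  # E[X^2] = 1.72
    sd = math.sqrt(second_moment - mean**2)
    return mean, sd

def ci_half_width(sd, games, z=1.96):
    """Half-width of the confidence interval on the average ppg (z=1.96 for 95%)."""
    return z * sd / math.sqrt(games)

edge, sd = per_game_stats(win_prob=0.505, gammon_rate=0.24)
half_width = ci_half_width(sd, games=15_000)
print(f"edge = {edge:.4f} ppg, sd = {sd:.2f}, 95% half-width = {half_width:.3f} ppg")
# edge = 0.0124 ppg, sd = 1.31, 95% half-width = 0.021 ppg
```

Since the half-width (0.021) is larger than the edge (0.0124), the interval still contains zero after 15,000 games.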
- To reach 95% confidence you will need 50,000 games.
- To reach 99% confidence you will need 80,000 games.
If A wins only 50.25% (6 Elo difference):
- To reach 95% confidence you will need 160,000 games.
- To reach 99% confidence you will need 300,000 games.
If A wins 51% (25 Elo difference), it is much easier to reach statistical significance:
- To reach 95% confidence you will need 11,000 games.
- To reach 99% confidence you will need 20,000 games.
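The game counts above can be approximated with a short calculation. This is only a sketch under one plausible reading of "reach 95% confidence": play enough games that the two-sided confidence interval half-width shrinks to the size of the edge, so n = (z × SD / edge)². The post does not say exactly which criterion was used, and `games_needed` is a name invented for this illustration. It keeps the same scoring assumptions (24% gammons, no backgammons, no cube).

```python
import math
from statistics import NormalDist

def games_needed(win_prob, confidence, gammon_rate=0.24):
    """Games required for the CI half-width to equal Bot A's edge.

    Assumptions (illustrative): single win = 1 point, gammon = 2 points,
    no backgammons, no cube, same gammon rate for both bots.
    """
    avg_points = 1 + gammon_rate                        # 1.24 points per game
    edge = (2 * win_prob - 1) * avg_points              # A's edge in ppg
    variance = (1 - gammon_rate) + 4 * gammon_rate - edge**2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided z-score
    return math.ceil(z**2 * variance / edge**2)

for win_prob in (0.505, 0.5025, 0.51):
    for confidence in (0.95, 0.99):
        n = games_needed(win_prob, confidence)
        print(f"win {win_prob:.2%}, {confidence:.0%} confidence: about {n:,} games")
```

Under these assumptions the results land in the same range as the rounded figures above, for example roughly 43,000 games for the 50.5% case at 95%. The key point is that the required number of games grows with the inverse square of the edge, so halving the edge roughly quadruples the games.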
This is workable at low levels (when training the NN, I was doing that at 1-ply over a few million games to test the new NN against the old one).
For high levels it is a HUGE undertaking.