BGonline.org Forums
xg vs gnubg containment position
Posted By: Ian Dunstan
Date: Thursday, 31 July 2014, at 1:35 p.m.
gnubg Rollout
Pip counts: gnubg 72, Ian.Dunstan 158
Position ID: 33sAACjgtm0AYA
Match ID: cAkXAAAAAAAE
# Ply Move Equity
1 R bar/20 bar/19 +0.878
  0.829 0.241 0.007 - 0.171 0.028 0.000 +0.878
  0.002 0.003 0.001 - 0.002 0.001 0.000 0.006
Full cubeless rollout (trunc. at one-sided bearoff) with var.redn.
864 games, Mersenne Twister dice gen. with seed 714817967 and quasi-random dice
Play: 2-ply cubeless prune
keep the first 0 0-ply moves and up to 8 more moves within equity 0.16
Skip pruning for 1-ply moves.
XG Rollout 1
White is Player 2
score: 0
pip: 72
Unlimited Game
pip: 158
score: 0
Blue is Player 1
XGID=-aa---CBBBBB----------ddeB:0:0:1:56:0:0:0:0:10
Blue to play 56
1. Rollout¹ Bar/20 Bar/19 eq: +0.966
Player: 78.53% (G:22.16% B:0.65%)
Opponent: 21.47% (G:4.31% B:0.05%)
Conf.: ± 0.020 (+0.945...+0.986)
Duration: 40 minutes 03 seconds
¹ 2592 Games rolled with Variance Reduction.
Dice Seed: 28731661
Cubeless
Moves: 3-ply
eXtreme Gammon Version: 2.10
Both rollouts were done cubeless, and their main purpose was to look at how well both bots play a containment position. A cubeless rollout focuses purely on how well each bot plays the chequers, which is what I was interested in. In this position White is busted and his moves are virtually forced, so we get to measure how well Blue can contain White in a one-sided skill position.
Unfortunately, XG works out a cubeless equity for its trials and then only reports a cubeful one, which has been calculated from the cubeless result. The cubeless result for "XG Rollout 1" was +0.755. I guess you just have to trust me on this since it does not post in the results above.
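For anyone who would rather check than trust, the cubeless figure can be recovered from the posted win/gammon/backgammon percentages. This is only a minimal sketch: the function name `cubeless_equity` is made up for illustration, and it assumes the gammon figures already include backgammons, which the gnubg numbers bear out since they sum to the reported +0.878.

```python
def cubeless_equity(win, win_g, win_bg, lose_g, lose_bg):
    """Cubeless money equity from outcome probabilities given as fractions.

    Assumption: gammon figures include backgammons, so each gammon adds
    one extra point and each backgammon one more on top of that.
    """
    lose = 1.0 - win
    return (win - lose) + (win_g - lose_g) + (win_bg - lose_bg)


# Figures copied from the gnubg rollout and XG Rollout 1 in this post.
gnubg = cubeless_equity(0.829, 0.241, 0.007, 0.028, 0.000)
xg1 = cubeless_equity(0.7853, 0.2216, 0.0065, 0.0431, 0.0005)

print(f"gnubg        {gnubg:+.3f}")
print(f"XG Rollout 1 {xg1:+.3f}")
print(f"win-rate gap {0.829 - 0.7853:.1%}")
```

Run as is, this gives +0.878 for gnubg and +0.755 for XG Rollout 1, with Blue winning about 4.4% more games in the gnubg rollout.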
Note that there is a lot of cubeless equity difference between the two first results:
gnubg +0.878
XG +0.755 (XG Rollout 1)
This difference surprised me greatly. gnubg appears to play much better on its "normal" setting (worldclass gnu 2-ply, 8 moves, 0.16) than XG does on its "normal" 3-ply (4 move interval). The two bots were supposed to be on similar strength settings; unfortunately, I didn't realise until now that XG's normal search interval setting is what gnubg would call "tiny".
That gnubg is winning more than 4% more games for Blue here (82.9% vs 78.5%) tells me gnubg is doing something very different from XG. I played XG a handful of games from this position and couldn't see what it was.
Following is a short XG rollout using a "Huge" search interval, which I think (it's hard to know for sure from XG's documentation) is much closer to gnubg's worldclass settings.
XG Rollout 2
White is Player 2
score: 0
pip: 72
Unlimited Game
pip: 158
score: 0
Blue is Player 1
XGID=-aa---CBBBBB----------ddeB:0:0:1:56:0:0:0:0:10
Blue to play 56
1. Rollout¹ Bar/20 Bar/19 eq: +0.972
Player: 78.86% (G:21.95% B:0.81%)
Opponent: 21.14% (G:4.11% B:0.02%)
Conf.: ± 0.033 (+0.939...+1.004)
Duration: 29 minutes 57 seconds
¹ 1296 Games rolled with Variance Reduction.
Dice Seed: 28731661
Cubeless
Moves: 3-ply
Search interval: Huge
eXtreme Gammon Version: 2.10
Here the missing cubeless equity is +0.764 for XG Rollout 2. There is not much apparent difference between XG Rollout 1 and XG Rollout 2, though one would hope the increase in search interval from 4 to 8 moves would make a little difference. At most there is a suggestion of a small equity difference between the two XG rollouts (+0.755 vs +0.764).
Now, why only 864 trials for gnubg? Partly because gnubg is slow to do rollouts (especially on this tired old laptop I am typing on now) and partly because the reported 0.006 standard deviation is already low, and it takes a lot more trials to get even a 0.001 change in sd. I'm used to gnubg rollouts; I did hundreds, if not thousands, of them a few years ago. I believe I still have a 'good feel' for gnubg rollouts and I am trying to develop that for XG.
In terms of 'feel', it is particularly relevant to note that an sd of 0.006 translates to roughly ±0.012 (95% CI), so gnubg's 864 trials report its result to more accuracy than the XG rollouts, which are much longer. Other rollouts I have done suggest that XG needs approximately three times as many trials to achieve similar reported accuracy to gnubg. You can see evidence of that claim in these trials.
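The sd arithmetic can be sketched in a few lines. Treat it as a rough guide only: it assumes a normal approximation for the 95% interval and that the sd shrinks in proportion to 1/sqrt(games), as it would for independent trials. With variance reduction and quasi-random dice the real behaviour may differ somewhat. The function names are made up for illustration.

```python
import math

Z_95 = 1.96  # two-sided 95% quantile of the normal distribution


def ci_half_width(sd, z=Z_95):
    """Half-width of the confidence interval for a reported sd."""
    return z * sd


def games_for_target_sd(games, sd, target_sd):
    """Games needed to reach target_sd, assuming sd scales as 1/sqrt(games)."""
    return math.ceil(games * (sd / target_sd) ** 2)


# gnubg rollout from this post: 864 games, sd 0.006.
print(round(ci_half_width(0.006), 3))
print(games_for_target_sd(864, 0.006, 0.005))
```

Under those assumptions the 0.006 sd gives a half-width of about 0.012, and shaving just 0.001 off the sd would take roughly 1245 games instead of 864, about 44% more.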