
BGonline.org Forums

xg vs gnubg containment position

Posted By: Ian Dunstan
Date: Thursday, 31 July 2014, at 1:35 p.m.

gnubg Rollout
gnubg (pip: 72)

5X 4X 4X ' ' ' ' ' ' ' ' '

1X 1X ' ' ' 3O 2O 2O 2O 2O 2O '
Ian.Dunstan (pip: 158)
Position ID: 33sAACjgtm0AYA Match ID: cAkXAAAAAAAE

# Ply Move Equity
1 R bar/20 bar/19 +0.878
0.829 0.241 0.007 - 0.171 0.028 0.000 +0.878
0.002 0.003 0.001 - 0.002 0.001 0.000 0.006
Full cubeless rollout (trunc. at one-sided bearoff) with var.redn.
864 games, Mersenne Twister dice gen. with seed 714817967 and quasi-random dice
Play: 2-ply cubeless prune
keep the first 0 0-ply moves and up to 8 more moves within equity 0.16
Skip pruning for 1-ply moves.

XG Rollout 1




White is Player 2

score: 0
pip: 72
Unlimited Game
pip: 158
score: 0

Blue is Player 1
XGID=-aa---CBBBBB----------ddeB:0:0:1:56:0:0:0:0:10
Blue to play 56

1. Rollout¹ Bar/20 Bar/19 eq: +0.966
Player: 78.53% (G:22.16% B:0.65%)
Opponent: 21.47% (G:4.31% B:0.05%)
Conf.: ± 0.020 (+0.945...+0.986)
Duration: 40 minutes 03 seconds
¹ 2592 Games rolled with Variance Reduction.
Dice Seed: 28731661
Cubeless
Moves: 3-ply

eXtreme Gammon Version: 2.10

Both rollouts were done cubeless; their main purpose was to see how well each bot plays a containment position. A cubeless rollout measures only how well each bot plays the chequers, which is what I was interested in. In this position White is busted and his moves are virtually forced, so we get to measure how well Blue can contain White in a one-sided skill position.

Unfortunately, XG works out a cubeless equity for its trials and then reports only a cubeful one, which has been calculated from the cubeless result. The cubeless result for "XG Rollout 1" was +0.755. I guess you just have to trust me on this, since it does not appear in the results above.
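The cubeless figure can be cross-checked against the percentages the bots do report. A minimal sketch, assuming the usual money-game convention that the gammon figures include backgammons and the win figures include gammons (the function name is made up for illustration):

```python
def cubeless_equity(win, win_g, win_bg, lose, lose_g, lose_bg):
    """Cubeless money equity from outcome probabilities given as fractions.

    Assumes win/lose include gammons and the gammon figures include
    backgammons, so each extra point is counted exactly once.
    """
    return (win + win_g + win_bg) - (lose + lose_g + lose_bg)


# Percentages copied from "XG Rollout 1" and the gnubg rollout above.
xg_rollout_1 = cubeless_equity(0.7853, 0.2216, 0.0065, 0.2147, 0.0431, 0.0005)
gnubg = cubeless_equity(0.829, 0.241, 0.007, 0.171, 0.028, 0.000)

print(f"XG Rollout 1: {xg_rollout_1:+.3f}")
print(f"gnubg:        {gnubg:+.3f}")
```

Under that assumption the XG percentages come out at about +0.755, and the gnubg percentages reproduce its reported +0.878.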

Note the large difference in cubeless equity between the first two results:

gnubg +0.878

XG +0.755 (XG Rollout 1)

This difference surprised me greatly. gnubg appears to play much better on its "normal" setting (worldclass gnu 2-ply, 8 moves, 0.16) than XG does on its "normal" 3-ply (4 move interval). The two bots were supposed to have similar strength settings; unfortunately I didn't realise until now that XG's normal search interval setting is what gnubg would call "tiny".

That gnubg is winning over 4% more games for Blue here tells me gnubg is doing something quite different from XG. I played XG a handful of games from this position and couldn't see what it was.

Following is a short XG rollout using a "Huge" search interval, which I think (it's hard to know for sure from XG's documentation) is much closer to gnubg's worldclass settings.

*

XG Rollout 2




White is Player 2

score: 0
pip: 72
Unlimited Game
pip: 158
score: 0

Blue is Player 1
XGID=-aa---CBBBBB----------ddeB:0:0:1:56:0:0:0:0:10
Blue to play 56

1. Rollout¹ Bar/20 Bar/19 eq: +0.972
Player: 78.86% (G:21.95% B:0.81%)
Opponent: 21.14% (G:4.11% B:0.02%)
Conf.: ± 0.033 (+0.939...+1.004)
Duration: 29 minutes 57 seconds
¹ 1296 Games rolled with Variance Reduction.
Dice Seed: 28731661
Cubeless
Moves: 3-ply
Search interval: Huge

eXtreme Gammon Version: 2.10

Here the missing cubeless equity for XG Rollout 2 is +0.764. There is not much apparent difference between XG Rollout 1 and XG Rollout 2, though one would hope that increasing the search interval from 4 to 8 moves would make a little difference. At most there is a suggestion of a small equity difference between the two XG rollouts.

Now... why only 864 trials for gnubg? Partly because gnubg is slow to do rollouts (especially on this tired old laptop I type on now) and partly because the reported 0.006 standard deviation is low already and it takes lots more trials to get even a 0.001 sd change. I'm used to gnubg rollouts; I did hundreds, if not thousands, of them a few years ago. I believe I still have a 'good feel' for gnubg rollouts and I am trying to develop that for XG.
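To put a rough number on "lots more trials": a minimal sketch, assuming the sd of a rollout shrinks with the square root of the number of trials. That is an idealisation that ignores how variance reduction and quasi-random dice behave in practice, and the function name is made up for illustration.

```python
import math


def trials_for_target_sd(current_trials, current_sd, target_sd):
    """Estimate the trials needed to reach target_sd.

    Assumes sd is proportional to 1/sqrt(trials), so the trial count
    scales with the square of the sd ratio.
    """
    return math.ceil(current_trials * (current_sd / target_sd) ** 2)


# Going from the reported 0.006 sd at 864 trials down to 0.005.
needed = trials_for_target_sd(864, 0.006, 0.005)
print(needed)
```

On that assumption, shaving 0.001 off the sd needs about 44% more trials, and halving it needs four times as many.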

In terms of 'feel', it is particularly relevant to note that an sd of 0.006 translates to roughly ±0.012 (95% CI), so gnubg's 864 trials report its result more precisely than the much longer XG rollouts do. Other rollouts I have done suggest that XG needs approximately three times as many trials to achieve similar reported accuracy to gnubg. You can see evidence of that claim in these trials.
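The sd to confidence interval conversion is easy to reproduce. A minimal sketch, assuming a normal approximation with z = 1.96 for a two-sided 95% interval, and assuming XG's "Conf." figure is a comparable 95% half-width (the XG numbers are copied from the rollouts above, not recomputed):

```python
Z_95 = 1.96  # two-sided 95% quantile of the normal distribution


def ci_half_width(sd, z=Z_95):
    """Half-width of a normal-approximation confidence interval."""
    return z * sd


gnubg_ci = ci_half_width(0.006)

# Half-widths as reported by XG (assumed to be 95% intervals).
xg_reported = {"XG Rollout 1": 0.020, "XG Rollout 2": 0.033}

print(f"gnubg (864 trials): +/-{gnubg_ci:.3f}")
for name, half_width in xg_reported.items():
    print(f"{name}: +/-{half_width:.3f}")
```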


BGonline.org Forums is maintained by Stick with WebBBS 5.12.