BGonline.org Forums
The significance of 3.679 (long)
Posted By: neilkaz In Response To: The significance of 5.187 (David Startin)
Date: Sunday, 20 July 2008, at 3:45 p.m.
My extensive work (not completely finished yet) with rollouts from Snowie 4 has Stick's ER at 3.679 (mine being 3.575).
Here's what I am doing with Snowie to get a better idea of which plays are best and by how much.
1) I set Snowie to batch rollout all errors (using the default threshold of .030) at 3 ply precise, cubeless, 216 games, truncated at 7. Clearly this should result in a more accurate evaluation than just 3 ply precise!
2) Cube decision errors are also initially put through the same cubeless rollouts. (Later on, I may decide to run a full cubeful rollout for some of them, but as checker and cube play aren't according to score, I don't trust live cube for lopsided scores.)
3) Of course, prior to the evaluation I go to Batch/Specify Anal/Misc and make sure to UNCHECK race checker play always at 1-ply, as races count in ER and a 1 ply eval of them can generate nonsense, like charging Stick with an error for an obviously correct play in a gammon save situation.
4) I then step through the match manually, and nearly every play that's close gets put in batch for 216 games 3 ply trunc 7 rollouts. I may change a couple to different settings or do a full or cubeful rollout at this time. Plays and cubes that we thought lots about may also get rolled out even if Snowie thinks the play made was rather clear (i.e. correct by .05 or so).
5) Once these initial rollouts are done, I again carefully step through the match and extend some rollouts, either because the statistical confidence intervals are too close or because I just want more confidence in the correct play, and I will likely put some cube decisions and even a couple of checker plays through full cubeful rollouts. Sometimes I have knowledge (forum postings) that GNU rollouts (often a better tool for match analysis, due to checker and cube play at score and what I believe is a superior MET (G11)) may show a significantly different result than my Snowie truncated ones, so I typically then do a full and longish Snowie rollout of those decisions.
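The "confidence intervals too close" test in step 5 can be sketched roughly as below. This is only an illustration, not what Snowie does internally: the function name `needs_extension`, the sample equities and standard errors, and the 1.96 multiplier (about 95% confidence, assuming the two rollouts are independent and roughly normal) are all my assumptions.

```python
import math

def needs_extension(eq_a, se_a, eq_b, se_b, z=1.96):
    """Return True when two rollout results cannot be separated yet.

    eq_a, eq_b: rolled out equities of the two candidate plays.
    se_a, se_b: standard errors reported for each rollout.
    z: multiplier for the confidence level (1.96 is about 95%, an assumption).
    """
    diff = abs(eq_a - eq_b)
    # Standard error of the difference, assuming independent rollouts.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff < z * se_diff

# Hypothetical numbers: the plays differ by .010 with .010 standard error each.
print(needs_extension(0.120, 0.010, 0.110, 0.010))
# Hypothetical numbers: the plays differ by .100, far outside the noise.
print(needs_extension(0.200, 0.010, 0.100, 0.010))
```

A play pair that returns True here is a candidate for a longer rollout; one that returns False is already separated at the chosen confidence level.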
Snowie rollouts take a long time, so I have about a week's more work on the Stick match and then I'll get it to Hardy so he can make it available to all. I go over my very important matches in such detail for several reasons.
1) This forces me to really learn and see how well I was playing and to remember both my good plays and the reasons for them, as well as my errors and, heaven forbid, blunders.
2) The deeply rolled out match should clearly, IMHO, present considerably more accurate ERs to show just how well the two gladiators played. Note that sometimes your ER gets worse when you roll out close plays, i.e. Snowie eval said you were wrong by .015 but the long rollout says wrong by .055, or sometimes Snowie eval says you were correct by .020 on a critical play but the rollout makes it an error (worse than -.030).
3) It is my hope that the rolled out match will give those students of the game with Snowie 4 a clearly more accurate evaluation to study from.
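To make the two cases in point 2 concrete, here is a small sketch of how a play can cross the .030 error threshold once it is rolled out. The helper `counts_as_error`, the sign convention (negative means the play made was worse than the best play) and the -.035 rollout figure for the second case are assumptions for illustration; only .015, .055, .020 and the .030 threshold come from the post.

```python
ERROR_THRESHOLD = 0.030  # Snowie's default error threshold mentioned above

def counts_as_error(equity_diff, threshold=ERROR_THRESHOLD):
    """Return True when the play made is worse than the threshold.

    equity_diff: equity of the play made minus equity of the best
    alternative, so negative values mean the play made was wrong.
    Whether a loss of exactly .030 is flagged is an assumption here.
    """
    return equity_diff < -threshold

# Case 1: eval says wrong by .015, long rollout says wrong by .055.
print(counts_as_error(-0.015), counts_as_error(-0.055))

# Case 2: eval says correct by .020, rollout makes it an error.
# The -.035 figure is hypothetical; the post only says "worse than -.030".
print(counts_as_error(0.020), counts_as_error(-0.035))
```

In both cases the quick evaluation leaves the play unflagged and the rollout result flags it, which is how a rolled out match can end up with a worse ER than the plain analysis.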
Why am I not doing this with GNU instead, since I consider it to be a more accurate tool for match analysis? Well, Stick is doing a major project with GNU for this match, and he knows how to use GNU better than I do, and I'm sure he'll give me the final product in a month or two when he's done.