BGonline.org Forums

Eval/rollout comparisons

Posted By: Nack Ballard
Date: Saturday, 21 August 2010, at 9:01 a.m.

In Response To: 42P-52S-66B-55, and 51S-52S-21@-66B-55 (variant) (Timothy Chow)

Nack, that's an interesting tip. I can see that investing time learning correlations between 4-ply and rollout might pay off in the long run. However, in the short term, if you're saying I need to extend the rollout of the original position in order to make accurate predictions about the new position, then I might as well spend that rollout time directly on the variant.

That's not really what I'm saying. What I am mainly trying to convey is that it is useful to compare the evals to the rollouts in order to improve your rollout-result predicting ability. One of the benefits of this skill is that you can sometimes save the CPU time of doing a variant rollout entirely because you can already be certain or fairly certain of the answer. The closer the configuration of your variant is to that of your original position, the more this is true.

For example, we know the rollout result of your original position is [A P55]. If you were to check now and see that the evaluation (in this case 4-ply is the standard you chose) of that position is something like [A P3], you know that GnuBG (or whatever bot) is relatively underevaluating the A play by 52 thousandths (i.e., .052).

Now, you already punched up the GnuBG eval of the variant and know it is [A P1]. If you assume that Gnu's bias of 52 holds in the variant position (as it did in the original position), or even that part of that bias holds, then you can be reasonably certain that A is the best play. If that's the information you're interested in, then you would be better off spending your CPU time elsewhere.
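The arithmetic in the last two paragraphs can be written out as a small sketch. The function names and the `carryover` parameter are illustrative only (they are not part of GnuBG or any other bot), and margins are given in thousandths, so [A P55] is 55.

```python
# Margins are in thousandths of a point: [A P55] -> 55, [A P3] -> 3.
# All names here are hypothetical, chosen for illustration.

def eval_bias(rollout_margin, eval_margin):
    """How much the eval undervalues play A relative to the rollout."""
    return rollout_margin - eval_margin

def predicted_rollout_margin(variant_eval_margin, bias, carryover=1.0):
    """Estimate the variant's rollout margin from its eval margin.

    carryover is the assumed fraction of the original bias that still
    holds in the variant (1.0 = all of it, 0.5 = half of it).
    """
    return variant_eval_margin + carryover * bias

bias = eval_bias(rollout_margin=55, eval_margin=3)            # 52
full = predicted_rollout_margin(1, bias)                      # 53.0
half = predicted_rollout_margin(1, bias, carryover=0.5)       # 27.0

print(f"bias: {bias}, full carryover: {full}, half carryover: {half}")
```

Even if only half of the bias carries over, the predicted margin still favors A comfortably, which is the point of the paragraph above. How much of the bias actually carries over is a judgment call that depends on how similar the two positions are.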

In other words, you're not utilizing the full information available to you if you punch up only the eval of the variant position, see [A P1], think, "gosh that's close" and conclude that the only way you'll have a clue which play is right is if you roll it out.

In no way do I want to discourage you from rolling out this variant or any other. I'm just offering a methodology that might save you time. Moreover, the eval vs result observations and comparisons will benefit you in other ways, as I mentioned earlier.

At the same time, I want you and others to be aware that the knowledge gained from early game rollouts of only 1k trials can be frustratingly limited, because you typically have to account for a CI range of about .06 in the margin between two plays (and even at that, the bots seem to be underestimating the range). The range of the [A P55] result you just posted is roughly from [A P25] to [A P85]. That is, 1k trials is enough to determine the best play in this case (and if that's all you care about, fine), but you're on extremely shaky ground when you compare a 1k-trial margin to the 1k-trial margin of a similar position and try to draw meaningful conclusions.
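To put rough numbers on that, here is a minimal sketch. It rests on assumptions that are not in the post: that the two rollouts are independent, that both intervals use the same confidence level, and that the interval shrinks with the square root of the number of trials. The helper names are hypothetical and margins are in thousandths.

```python
import math

# Illustrative helpers only; none of this comes from a bot's output.

def interval(margin, half_width):
    """Low and high ends of the range around a rollout margin."""
    return (margin - half_width, margin + half_width)

def diff_half_width(hw_a, hw_b):
    """Half-width for the difference of two independent margins,
    assuming both intervals share one confidence level."""
    return math.sqrt(hw_a ** 2 + hw_b ** 2)

def half_width_at(trials, base_hw=30, base_trials=1000):
    """Assumed 1/sqrt(n) scaling of the half-width with trial count."""
    return base_hw * math.sqrt(base_trials / trials)

low, high = interval(55, 30)          # the [A P25] to [A P85] range
hw_diff = diff_half_width(30, 30)     # about 42 thousandths
hw_4k = half_width_at(4000)           # 15.0 under the scaling assumption

print(low, high, round(hw_diff, 1), hw_4k)
```

Under these assumptions, comparing two 1k-trial margins carries an uncertainty of roughly plus or minus 42 thousandths, so a difference between the two results would have to be larger than that before it says much.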

I realize that you're dealing with limited CPU time/power and I sympathize, but sometimes it's better to dig a few deep wells than many shallow ones.

Nack

