
BGonline.org Forums

Calculating confidence values

Posted By: Timothy Chow
Date: Sunday, 30 August 2009, at 12:47 a.m.

Buried deep in the thread "eXtreme Gammon question" is a debate about the right way to calculate confidence values. I've been asked to clarify some statements I made. Since this question might be of wider interest, and old threads seem not to be read by as many people, I've started a new thread.

Executive summary: There is a way of calculating confidence values that is different from the way most bots currently seem to calculate them, and which I believe is better because it more closely fits what we actually want from those values.

For simplicity I will assume throughout that we are at DMP and that the bots play perfectly. I will also ignore variance reduction. These assumptions do not materially affect the points I'm making, but they eliminate some annoying technicalities.

Based on what others such as Bob Koca have told me, here's how I believe bots come up with confidence values. Let's say we're examining two ways to play a roll, Play A and Play B. We generate a random series of dice rolls and play them out both ways. If for example Play A wins while Play B loses, then we estimate the equity difference A – B to be 1 – (–1) = 2. We repeat with another set of dice rolls, etc., accumulating samples of the equity difference. Using the sample standard deviation to estimate the actual standard deviation of the equity difference, we are then able to calculate the probability that we would observe as large an equity difference as we do, assuming that the true equity difference were zero. If this probability is 5%, then we declare 95% confidence.
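To make this concrete, here is a rough Python sketch of that calculation as I understand it (my own illustration, not code from any actual bot). The diffs list is just a stand-in for the per-trial equity differences, and I use the normal approximation; whether a real bot takes a one-sided or two-sided tail is a detail I won't worry about here, so the sketch uses a one-sided tail:

    import math

    def frequentist_confidence(diffs):
        # diffs: per-trial equity differences A - B (at DMP each is +2, 0, or -2).
        n = len(diffs)
        mean = sum(diffs) / n
        var = sum((d - mean) ** 2 for d in diffs) / (n - 1)   # sample variance
        se = math.sqrt(var / n)                               # standard error of the mean
        z = abs(mean) / se                                     # test statistic under "true difference is zero"
        p = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))         # one-sided normal tail probability
        return 1 - p                                           # e.g. 0.95 gets reported as "95% confidence"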

The problem I have with this approach is that being 95% confident in rejecting the hypothesis that Play A and Play B are equally good (given the equity difference suggested by our samples) is not the same thing as being 95% confident that the play we declare to be better really is the better play. And the latter is what we're really trying to get at.

Therefore I propose a more Bayesian approach. (I initially assumed that this was how the bots were already doing things, but apparently I was wrong.) Here's one way to do it. Create two arrays A[0], ..., A[1000] and B[0], ..., B[1000], where A[i] represents the probability that the MWC of Play A is i/1000 (and similarly for B). We must initialize these arrays. The natural default value is 1/1001, i.e., the uniform distribution, which captures the notion that before making any observations, we have no reason to believe that the equity is more likely to be one value than another. Now we randomly generate a sequence of dice rolls and observe the result. We then need to update the values of the A and B arrays. For example, if Play A resulted in a win, then for each i, we multiply the value of A[i] by i/1000 (because this is the probability of observing a win if the MWC really were i/1000). At the end, we normalize the A[i] by dividing through by a constant so that they still sum to 1. Similarly, if for example Play B resulted in a loss, then we would multiply each B[i] by 1 – i/1000, and normalize.
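Here is a minimal Python sketch of that bookkeeping. The random.random() calls are only stand-ins for actual rollout trials of Play A and Play B; I'm pretending the true MWCs are 0.55 and 0.52 purely for illustration:

    import random

    N = 1000
    A = [1.0 / (N + 1)] * (N + 1)   # uniform prior: P(MWC of Play A = i/1000)
    B = [1.0 / (N + 1)] * (N + 1)   # uniform prior for Play B

    def update(dist, won):
        # Multiply each entry by the likelihood of the observed result
        # (win or loss), then renormalize so the entries sum to 1 again.
        for i in range(len(dist)):
            p = i / N                        # the candidate MWC value i/1000
            dist[i] *= p if won else 1.0 - p
        total = sum(dist)
        for i in range(len(dist)):
            dist[i] /= total

    # Stand-ins for real rollout results of the two plays.
    for _ in range(500):
        update(A, random.random() < 0.55)
        update(B, random.random() < 0.52)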

When we're done, we have the numbers needed to answer most of the questions we want to ask. If we want to know the probability that Play A is strictly better than Play B, then we simply sum up A[i]*B[j] for all i > j. If this sum is 95%, then Play A is better than Play B with 95% probability.
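Continuing the sketch above, the double sum over i > j can be computed in a single pass by keeping a running total of the B array:

    def prob_a_beats_b(A, B):
        # Sum of A[i]*B[j] over all i > j, using a running total of
        # B[j] for j < i so the whole thing is one pass over the arrays.
        prob = 0.0
        b_below = 0.0
        for i in range(len(A)):
            prob += A[i] * b_below
            b_below += B[i]
        return prob

    print(prob_a_beats_b(A, B))   # e.g. 0.95 means Play A is better with 95% probability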

The caveat, as usual in Bayesian approaches, is that our statement of 95% probability is conditional on our initial assumption that all MWC values were equally likely. This assumption is not a strictly mathematical assumption and so you can argue with it. But the beauty of the approach is that if you don't like the choice of uniform prior, you can replace it with something else that you're happier with. For example, if you're confident that Play A has MWC at least 0.3, then you can zero out A[i] for i < 300. Any prior information you have about the MWC can be incorporated into the initial values of A[i] and B[i], and gets automatically incorporated into the probability calculations.
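For instance, the "MWC at least 0.3" example above amounts to the following, done before any trials are processed (the 0.3 cutoff is just an illustrative figure, not a recommendation):

    # Prior belief that Play A's MWC is at least 0.3: zero out the
    # ruled-out values and renormalize what remains.
    for i in range(300):
        A[i] = 0.0
    total = sum(A)
    for i in range(len(A)):
        A[i] /= total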

It is also trivial to extend this methodology to 3 or more plays. Just keep an array around for each play. If you want to know the probability that Play C is better than Plays A and B, then just sum A[i]*B[j]*C[k] over all i, j, k such that k > i and k > j.
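The same running-total trick as before turns the triple sum into a single pass:

    def prob_c_is_best(A, B, C):
        # Sum of A[i]*B[j]*C[k] over all k > i and k > j.  Running totals
        # of A and B below index k collapse the triple sum into one pass.
        prob = 0.0
        a_below = 0.0
        b_below = 0.0
        for k in range(len(C)):
            prob += C[k] * a_below * b_below
            a_below += A[k]
            b_below += B[k]
        return prob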
