A Monte Carlo Monad for Haskell

I’ve just uploaded two new packages to Hackage. The first, gsl-random, is a set of bindings to the random number generators and random number distributions that come as part of the GNU Scientific Library. The next package, monte-carlo, is a monad and monad transformer for performing computations that require a random number generator, and is based on gsl-random. This post will give you a taste of what the MC Monte Carlo monad can do.

Introduction

For those unfamiliar with Monte Carlo, the basic idea is to use randomness to compute a nonrandom quantity. You generate a sequence of random variables X with mean equal to some quantity you care about, and then form a confidence interval for the quantity based on the mean and standard deviation of the simulated values.

Here’s an example that will make things a little more concrete. Suppose you want to compute pi. Say you have the ability to generate random points in the unit box [-1,1]^2. The box has an area of 2^2 = 4. The unit disc, which is contained completely in the box, has an area of pi. Therefore, the probability of X landing inside the unit circle is pi/4. So, if we generate a whole bunch of Xs, we should expect about pi/4 of them to land inside the unit circle. This means we can get an estimate of pi by generating a bunch of random points, counting how many land inside the circle, and then multiplying by 4.

Of course, the more points we take, the more accurate our answer will be. We can even get an approximate confidence interval for the true value by computing the standard deviation of the values.

Using the `MC` Monad

The monte-carlo package provides a monad, called MC, for performing Monte Carlo computations. The package exports a function for generating values uniformly over an interval:

uniform :: Double -> Double -> MC Double

This function takes the lower and upper bounds, and produces a value in that range. We can use this function to generate a value in the unit box:

unitBox :: MC (Double,Double)
unitBox = liftM2 (,) (uniform (-1) 1) 
                     (uniform (-1) 1)

We will need a function to test if a point lies inside the unit circle:

inUnitCircle :: (Double,Double) -> Bool
inUnitCircle (x,y) = x*x + y*y <= 1

Here is a function to generate n points and count how many fall inside the circle, and then to compute an estimate and standard error for pi based on n samples:

computePi :: Int -> MC (Double,Double)
computePi n = do
    xs <- replicateM n unitBox
    let m  = length $ filter inUnitCircle xs
        p  = toDouble m / toDouble n
        se = sqrt (p * (1 - p) / toDouble n)
    return (4*p, 4*se)

  where
    toDouble = realToFrac . toInteger

Running the Simulation

To get a value out of the MC monad, we must provide it with a random number generator. To get a Mersenne Twister generator, we use the mt19937 function. Then, evaluate the result with evalMC.

Here’s an example:

main = let
    n       = 10000000
    seed    = 0
    (mu,se) = evalMC (computePi n) $ mt19937 seed
    delta   = 2.575*se
    (l,u)   = (mu-delta, mu+delta)
    in do
        printf "Estimate:                   %g\n" mu
        printf "99%% Confidence Interval:    (%g,%g)\n" l u

It takes my 2GHz Macbook a little under five seconds to run the simulation with ten million samples. Here are the results I get:

Estimate:                   3.1414584
99% Confidence Interval:    (3.1401211162675353,3.1427956837324644)

Pretty cool, eh?

You might be wondering why I didn’t just use MonadRandom? There are two reasons: first, I didn’t want to write routines to generate random variables from different distributions (Normal, Exponential, Poisson, etc.). The GSL provides these for me. Second, internally the MC monad is not using a pure random number generator. It only keeps one copy of the generator state, and modifies it every time it samples a value.

Introduction

Using the MC Monad

Running the Simulation

Using the `MC` Monad