Monte Carlo Integration

  • A uniquely different approach to integration is the Monte Carlo (MC) method.

Example: find the area of a circle

  • To do this, we define a square, whose area we know, that bounds the circle. So, let the side length of the square equal the diameter of the circle.

  • We conceptually throw darts randomly at the square.

  • The fraction of the darts that land in the circle multiplied by the area of the square is the area of the circle:

    $$A_{circle} = A_{square}\frac{n_{in}}{n_{tot}},$$ where $n_{in}$ is the number of throws that are inside the circle, and $n_{tot}$ is the total number of throws.

  • A throw consists of selecting a random point in the square. We use a random number generator to get a random x location and a random y location. Call these $r_x$ and $r_y$.

    • The radial distance from the center of the circle of this point is $r_r = \sqrt{r_x^2+r_y^2}$.
    • If $r_r<r$, then the point is inside the circle.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
r  = 1.0                          # circle radius
Asquare = (2*r)**2                     # area of square enclosing circle
n  = 1000000                      # number of tries
rx = np.random.rand(n)*2*r - r    # -1 to 1; x locations
ry = np.random.rand(n)*2*r - r    # -1 to 1; y locations

rr = np.sqrt(rx**2 + ry**2)      # radial location
nhits = len(np.where(rr<r)[0])   # number of "hits"
I = float(nhits)/n * Asquare     # integral of circle

Aexact = np.pi*r**2
print('A_mc    =', I)
print('A_exact =', Aexact)
A_mc    = 3.142156
A_exact = 3.141592653589793

Relative Error

  • The relative integration error of MC methods scales as

$$\epsilon_r \propto\frac{1}{\sqrt{n_{tot}}}.$$

  • Hence, if you want to decrease your error by a factor of 10, you need to increase the number of throws by a factor of 100.
  • If we plot $\log_{10}(\epsilon_r)$ versus $\log_{10}(n_{tot})$, we expect to get a line with slope $-1/2$. This is illustrated below.
i_n = np.zeros(n)
i_n[rr<r] = 1

itry  = np.arange(1,n+1)
Areas = np.cumsum(i_n)/itry*Asquare
errs  = np.abs((Areas-Aexact)/Aexact)

plt.rc('font', size=14)
plt.plot(np.log10(itry), np.log10(errs));
plt.plot(np.array([0, np.log10(n)]), 
         np.array([np.log10(errs[0]), np.log10(errs[0])-0.5*np.log10(n)]));

plt.xlabel(r'$\log_{10}(n_{tot})$');
plt.ylabel(r'$\log_{10}(\epsilon_r)$');
plt.legend(['Relative error', 'Slope=-1/2'], frameon=False);

Visual Illustration

r  = 1.0                          # circle radius
As = (2*r)**2                     # area of square enclosing circle
n  = 10000                         # number of tries
rx = np.random.rand(n)*2*r - r    # -1 to 1; x locations
ry = np.random.rand(n)*2*r - r    # -1 to 1; y locations

rr = np.sqrt(rx**2 + ry**2)      # radial location
plt.plot(rx[rr<r], ry[rr<r], '.', markersize=1);
plt.plot(rx[rr>r], ry[rr>r], 'rx', markersize=1);
plt.axis('equal');

Generic function

  • To integrate a generic function:

$$I = \int_a^bf(x)dx,$$

we define a rectagle whose base is $[a,,b]$, and whose height is larger than the maximum value of $f$ on $[a,,b]$.

  • We through darts in this rectangle by sampling random $(x,,y)$ points in the rectangle.
  • For each $x$ we evaluate $f(x)$.
  • If $y\le f(x)$, then we call the point a hit.
  • The integral is then

$$I \approx A_{rectangle}\frac{n_{hits}}{n_{tot}}.$$

Example

  • Fit a fifth order polynomial to ten random points. Then integrate this polynomial.
x = np.sort(np.random.rand(10))
y = np.random.rand(10)
p = np.polyfit(x,y,5)
x = np.linspace(0,1,1000)
y = np.polyval(p,x)
yshift = np.abs(np.min(y))
y += yshift

n = 10000
rx = np.random.rand(n)
ry = np.random.rand(n)*np.max(y)
fx = np.polyval(p,rx) + yshift

yhit  = ry[ry<=fx]
xhit  = rx[ry<=fx]
ymiss = ry[ry>fx]
xmiss = rx[ry>fx]

plt.plot(xhit,  yhit,  'b.', markersize=2)
plt.plot(xmiss, ymiss, 'r.', markersize=2)
plt.plot(x,y, linewidth=5, color='black')
plt.ylim([0,1.1*np.max(y)])

Arectangle = 1*np.max(y)
I = Arectangle*len(xhit)/n

Ie = p[0]/6 + p[1]/5 + p[2]/4 + p[3]/3 + p[4]/2 + p[5] + yshift

print("I_mc    = ", I)
print("I_exact = ", Ie)
I_mc    =  1.140563269736279
I_exact =  1.1432022133192445

Uses

  • Monte Carlo methods are particularly useful for integrating functions in high dimensions.

  • For example, in combustion systems we often deal with hundreds of chemical species.

    • In most turbulent simulations, one cannot resolve all of the turbulent flow structures.
    • These unresolved structures are called subgrid.
    • An average value of, say, a reaction rate for a given species in a cell that contains subgrid variations is desired. Such an average is computed as

    $$\langle\omega_i(\vec{Y})\rangle = \int_{\vec{Y}}P(\vec{Y})\omega_i(\vec{Y})d\vec{Y}.$$

    • Here, $\omega_i$ is the reaction rate of species $i$, which is a function of the full chemical state vector $\vec{Y_i}$, which includes all species concentrations, temperature, and pressure. The angle brackets $\langle\cdot\rangle$ denote an average.
    • $P$ is the multi-dimensional joint probability density function which can be thought of as the fraction of the subgrid volume occupied by a particular value of the chemical state vector (part of the subgrid may be stoichiometric, other parts pure fuel, other parts pure air, etc.).
  • The integral is over the full multi-dimensional state space.

    • If we had 48 species, then with temperature and pressure, we would have a 50 dimensional integral.
  • Such an integral cannot be computed using a 50 dimensional grid. Suppose we discretized each dimension with ten points. Then we would have $10^{50}$ total points to evaluate the function at, which is impossible.

  • Monte-Carlo methods reduce such calculations to manageable levels.