Jacobian – H. Paul Keeler

In the previous post, I covered a simple but much used method for simulating random variables or, rather, generating random variates. To simulate a random variable, the method requires writing down, in a tractable manner, the inverse of its cumulative distribution function.

But in the case of the normal (or Gaussian) distribution, there is no closed-form expression for its cumulative distribution function nor its inverse. This means you cannot, in an elegant and fast way at least, generate with the inverse method a single normal random variable using a single uniform random variable.

Interestingly, however, you can generate two (independent) normal variables with two (independent) uniform variables using the Box-Muller method, originally proposed by George Box and Mervin E. Muller. This approach uses the inverse method, but in practice it’s not used much (see below). I detail this method because I find it neat and it highlights the connection between the normal distribution and rotational symmetry, which has been the subject of some recent 3Blue1Brown videos on YouTube.

(This method was also used to simulate the Thomas point process, which I covered in a previous post.)

Incidentally, this connection is also mentioned in a previous post on simulating a Poisson point process on the surface of a sphere. In that method post, Method 2 uses an observation by the Muller that normal random variables can be used to position points uniformly on spheres.

I imagine this method was first observed by transforming two normal variables, instead of guessing various distribution pairs that would work. Then I’ll sketch the proof in the opposite direction, though it works in both directions.

Proof outline

The joint probability density of two independent variables is simply the product of the two individual probabilities densities. Then the joint density of two standard normal variables is

$$\begin{align}f_{X,Y}(x,y)&=\left[\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\right]\left[\frac{1}{\sqrt{2\pi}}e^{-y^2/2}\right]\\&=\frac{1}{{2\pi}}e^{-(x^2+y^2)/2}\,.\end{align}$$

Now it requires a change of coordinates in two dimensions (from Cartesian to polar) using a Jacobian determinant, which in this case is $|J(\theta,r)=r|$.¹ giving a new joint probability density

$$f_{\Theta,R}(\theta,r)=\left[\frac{1}{\sqrt{2\pi}}\right]\left[ r\,e^{-r^2/2}\right]\,.$$

Now we just identify the two probability densities. The first probability density corresponds to a uniform variable on $[0, 2\pi]$, whereas the second is that of a Rayleigh variable with parameter $\sigma=1$. Of course the proof works in the opposite direction because the transformation (between Cartesian and polar coordinates) is a one-to-one function.

Algorithm

Here’s the Box-Muller method for simulating two (independent) standard normal variables with two (independent) uniform random variables.

Two (independent) standard normal random variable $Z_1$ and $Z_2$

Generate two (independent) uniform random variables $U_1\sim U(0,1)$ and $U_2\sim U(0,1)$.

Return $Z_1=\sqrt{-2\ln U_1}\cos(2\pi U_2)$ and $Z_2=\sqrt{-2\ln U_1}\sin(2\pi U_2)$.

The method effectively samples a uniform angular variable $\Theta=2\pi U_2$ on the interval $[0,2\pi]$ and a radial variable $R=\sqrt{-2\ln U_1}$ with a Rayleigh distribution.

The algorithm produces two independent standard normal variables. Of course, as many of us learn in high school, if $Z$ is a standard normal variable, then the random variable $X=\sigma Z +\mu$ is a normal variable with mean $\mu$ and standard deviation $\sigma>0$ .

The fall of the Box-Muller method

Sadly this method was typically not used, as historically computer processors were slow at doing calculations involving the necessary mathematical functions. To avoid these functions researchers developed and employed other methods such as the ziggurat algorithm.

Also, although processors can now do such calculations much faster, many languages, not just scientific ones, come with functions for generating normal variables. Consequently, there had not been much need in implementing this method.

Update: The return of the Box-Muller method

The above conventional wisdom has changed in recent years as processors can now (on a hardware level) readily evaluate such functions. (I had been waiting to see if certain libraries would be re-written by using the Box-Muller method, but why bother if the old ones work so well?) When I used the term “processors”, I had central processor units (CPUs) in mind, but in recent years graphically processor units (GPUs) have become widely popular.

In a comment on this post, it pointed out that the Box-Muller method is the preferred choice for GPUs, as evidenced by its implementation in Nvidia’s CUDA library. The reason is GPUs do not handle well loops and branches in algorithms, so you should use methods that avoid these algorithmic steps. And the Box-Muller method is one that does just that.

The NVDIA website says:

Because GPUs are so sensitive to looping and branching, it turns out that the best choice for the Gaussian transform is actually the venerable Box-Muller transform

Tag: Jacobian

The Box-Muller method for simulating normal variables

Proof outline

Algorithm

The fall of the Box-Muller method

Update: The return of the Box-Muller method

Further reading

Websites

Papers

Books