Sooner or later in life, you will need to generate a couple of normal (or Gaussian) variables. Maybe you’re simulating some thermal noise or the stock market. Perhaps you’re sampling a point process with Gaussian coordinates. Or you’re running a generative artificial intelligence (Gen AI) model, such as a variational autoencoder, which needs to sample a Gaussian point in an abstract mathematical space.
Whatever the reason, you can get your independent normal variables by applying the Box-Muller transform to independent uniform random variables. All it takes is a change between Cartesian and polar coordinates. Easy peasy, lemon squeezy.
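As a minimal sketch (in Python with NumPy; the function name and the seed are my own choices), the transform takes two independent uniform variables on (0,1), treats them as a random radius and a random angle, and converts those polar coordinates to Cartesian ones, which gives two independent standard normal variables:

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, just for reproducibility

def box_muller(n, rng=rng):
    """Return two arrays of n independent standard normal variables."""
    u1 = 1.0 - rng.random(n)              # uniform on (0, 1], keeps log(u1) finite
    u2 = rng.random(n)                     # uniform on [0, 1)
    radius = np.sqrt(-2.0 * np.log(u1))    # radial coordinate
    angle = 2.0 * np.pi * u2               # angular coordinate
    return radius * np.cos(angle), radius * np.sin(angle)

z0, z1 = box_muller(10**5)
print(z0.mean(), z0.std())  # roughly 0 and 1, as expected for standard normals
```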
Box-Muller was not the favourite
I previously wrote about the Box-Muller method, ending on the note that the method, though very cool, isn’t used so much because it relies upon mathematical functions, such as the logarithm, square root, and sine or cosine, that have historically been computationally expensive for processors to evaluate.
Box-Muller is the favourite on GPUs
But that conventional wisdom has changed in recent years, as processors can now readily evaluate such functions at the hardware level. More significantly, the rapid rise of graphics processing units (GPUs) has also changed the rules of the game.
In a comment on my original Box-Muller post, it was pointed out that the Box-Muller method is the preferred choice on GPUs, with a link to the Nvidia website. The reason is that GPUs do not handle algorithms with loops and branches well, so you should use methods that avoid these algorithmic steps. And the Box-Muller method is one that does just that.
The Nvidia website says:
Because GPUs are so sensitive to looping and branching, it turns out that the best choice for the Gaussian transform is actually the venerable Box-Muller transform…
Indeed.
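To see why, here is a rough (CPU-side, NumPy) sketch of my own, not taken from the Nvidia material: the Box-Muller version turns every pair of uniforms into a pair of normals with no branching, while a rejection-based alternative such as the Marsaglia polar method needs a loop that re-draws the roughly 21% of candidate pairs falling outside the unit disc, which is exactly the kind of branching that GPUs handle poorly.

```python
import numpy as np

rng = np.random.default_rng(seed=2)  # arbitrary seed for the sketch

def normals_box_muller(n, rng=rng):
    """Branch-free: every pair of uniforms yields a pair of normals."""
    u1 = 1.0 - rng.random(n)               # uniform on (0, 1], keeps log(u1) finite
    u2 = rng.random(n)
    radius = np.sqrt(-2.0 * np.log(u1))
    angle = 2.0 * np.pi * u2
    return radius * np.cos(angle), radius * np.sin(angle)

def normals_marsaglia_polar(n, rng=rng):
    """Rejection-based: the while loop re-draws rejected candidate pairs."""
    z0 = np.empty(n)
    z1 = np.empty(n)
    filled = 0
    while filled < n:
        x = rng.uniform(-1.0, 1.0, n - filled)
        y = rng.uniform(-1.0, 1.0, n - filled)
        s = x**2 + y**2
        keep = (s > 0) & (s < 1)            # reject points outside the unit disc
        x, y, s = x[keep], y[keep], s[keep]
        factor = np.sqrt(-2.0 * np.log(s) / s)
        k = x.size
        z0[filled:filled + k] = x * factor
        z1[filled:filled + k] = y * factor
        filled += k
    return z0, z1
```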