The inverse method for simulating random variables

We will cover a simple but much used method for simulating random variables or, rather, random variates. Although the material here is found in introductory probability courses, it frequently forms the foundation of more advance stochastic simulation techniques, such as Markov chain Monte Carlo methods.


The basics of probability theory tell us that any random variable can, in theory, be written as a function of a uniform random variable \(U\) distributed on the interval \((0,1)\), which is usually written as \(U\sim U(0,1)\). All one needs is the inverse of the cumulative distribution function of the desired random variable.

More specifically, let \(X\) be a random variable with a cumulative distribution function \(F(x)=\mathbb{P}(X\leq x)\). The function \(F\) is nondecreasing in \(x\), so its inverse can be defined as \(F^{-1}(y)=\inf\{x:F(x)\geq y\}\), which is known as the generalized inverse of \(F(x)\).

Some authors assume the minimum is attained so the infimum is replaced with the minimum, giving \(F^{-1}(y)=\min\{x:F(x)\geq y\}\).

In short, the following result is all that we need.

Transform of a uniform variable \(U\sim U(0,1)\)

For a uniform random variable \(U\sim U(0,1)\), the random variable \(F^{-1}(U)\) has the cumulative distribution function \(\mathbb{P}(F^{-1}(U)\leq x)=P(U\leq F(x))=F(x)\).


The above observation gives a method, which I like to call the direct method, for exactly simulating a random variable \(X\) with the (cumulative) distribution (function) \(F\).

Random variable \(X\) with distribution \(F\)

  1. Sample a uniform random variable \(U\sim U(0,1)\), giving a value \(u\).
  2. Return the value \(x=F^{-1}(u)\) as the sampled value of \(U\).

But this approach only works if we can write down (in a relatively straightforward way) the inverse \(F^{-1}\), which is usually not the case. This means you cannot generate, for example, simulate a normal random variable with a single uniform random variable by using just the inverse method, as we cannot write down the inverse of its cumulative distribution function.

(Interestingly, with two (independent) uniform random variables, we can use the transform method to simulate two (independent) normal (or Gaussian) random variables. This approach is called the Box-Muller transform, which I’ll cover in another post.)

Nevertheless, we can apply the inverse method to some useful distributions.


Warning: The following examples are only for illustration purposes. Except for the Bernoulli example, you would never use them in standard scientific languages such as MATLAB, Python (with NumPy), R or Julia, because those languages already have much better functions for simulating these and many other random variables (or variates). If you are writing a function in a language that lacks such functions, I would consult one of the references mentioned below. Although the inverse method is usually intuitive and elegant, it is often not the fastest method.

Bernoulli distribution

The simplest random variable is that with the Bernoulli distribution. With probability \(p\), a Bernoulli random variable \(X\) takes the value one. Otherwise, \(X\) takes the value zero (with probability \(1-p\)). This gives the (cumulative) distribution (function):

$$ F_B(x)=\begin{cases}
0 & \text{if } x < 0 \\
1 – p & \text{if } 0 \leq x < 1 \\
1 & \text{if } x \geq 1

This gives a very simple way to simulate (or sample) a Bernoulli variable \(X\) with parameter \(p\).

Bernoulli random variable \(X\) with parameter \(p\)

  1. Sample a uniform random variable \(U\sim U(0,1)\), giving a value \(u\).
  2. If \(u\leq p\), return \(x=1\); otherwise return \(x=0\).
Application: Acceptance simulation methods

In random simulation code, whenever you do something (or not) with some probability \(p\) (or probability \(1-p\)), then the code will perform the above step. Consequently, you see this in the (pseudo-)code of many stochastic simulations with random binary choices, particularly schemes that have an acceptance step such the Metropolis-Hastings method and other Markov chain Monte Carlo (MCMC) methods.

In MCMC schemes, a random (binary) choice is proposed and it is accepted with a certain probability, say, \(\alpha \). This is the equivalent of accepting the proposed choice if some uniform random variable \(U\) meets the condition \(U\leq \alpha\).

This explains why the pseudo-code of the same algorithm can vary. Some pseudo-code will say accept with probability \(\alpha\), while other pseudo-code will say do if \(U\leq \alpha\). It’s two equivalent formulations.

Exponential distribution

The cumulative distribution function of an exponential variable with mean \(1/\lambda\) is \(F_E(x)= 1-e^{-\lambda x}\), which has the inverse \(F^{-1}_E(y)=-(1/\mu)\ln[1-y]\). We can use the fact that on the interval \((0,1)\), a uniform variable \(U\sim U(0,1)\) and \(1-U\) have the same distribution. Consequently, the random variables \(\ln [1-U]\) and \(\ln U\) are equal in distribution.

This gives a method for simulating exponential random variables.

Exponential random variable \(X\) with mean \(1/\lambda\)

  1. Sample a uniform random variable \(U\sim U(0,1)\), giving a value \(u\).
  2. Return \(x=-(1/\lambda)\ln u\).
Application: Poisson simulation method

Of course you can use this method to simulate exponential random variables, but it has another application. In a previous post on simulating Poisson variables, I mentioned that exponential random variables can be used to simulate a Poisson random variable in a direct (or exact) manner. That method is based on the distances between the points of a homogeneous Poisson point process (on the real line) being exponential random variables.

But this method is only suitable for low values of \(\lambda\), less than say fifteen.

Rayleigh distribution

The Rayleigh distribution is \(\mathbb{P}(X\leq x)= (x/\sigma^2)e^{-x^2/(2\sigma^2)}\), where \(\sigma>0\) is its scale parameter. The square root of an exponential variable with mean \(1/\lambda\) has a Rayleigh distribution with scale parameter \(\sigma=1/\sqrt{2\lambda}\).

Consequently, the generation method is similar to the previous example.

Rayleigh random variable \(Y\) with scale parameter \(\sigma>0\)

  1. Sample a uniform random variable \(U\sim U(0,1)\), giving a value \(u\).
  2. Return \(y=\sigma\sqrt{-2\ln u}\).

Other methods

The inverse method is intuitive and often succinct. But most functions for simulating random variables (or, more correctly, generating random variates) do not use these methods, as they are not fast under certain parameter regimes, such as large means. Consequently, other method are used such as approximations (with, say, normal random variables), such as the ones I outlined in this post on simulating Poisson random variables.

More complicated random systems, such as collections of dependent variables, can be simulated using Markov chain Monte Carlo methods, which is the direction we’ll take in a couple posts after this one.

Further reading

The inverse technique is in your favourite introductory book on probability theory. The specific examples here are covered in books on stochastic simulations and Monte Carlo methods. The classic book by Devroye covers these topics; see Section 2.1 and the examples (inverse method) in Chapter 2.

For a modern take, there’s the extensive Handbook of Monte Carlo Methods by Kroese, Taimre and Botev; see Section 3.1.1, Algorithm 4.1 (Bernoulli) and Algorithm 4.29 (exponential), and Algorithm 4.66 (Rayleigh). There’s also the book Stochastic Simulation: Algorithms and Analysis by Asmussen and Gynn; in Chapter 2, see Example 2.1 (Bernoulli) and Exampe 2.3 (exponential).

Other books include those by Fishman (Section 8.1) and Gentle (Section 4.1) respectively. (Warning: the book by Gentle has a mistake on page 105 in algorithm for sampling Bernoulli variables, as noted by the author. It should be \(1-\pi\) and not \(\pi\) when zero is returned for the sampled value of the Bernoulli variable.)