Testing the Julia language with point process simulations

I started writing these posts (or blog entries) about a year ago. In my first post I remarked how I wanted to learn to write stochastic simulations in a new language. Well, I found one. It’s called Julia. Here’s my code. And here are my thoughts.

Overview

For scientific programming, the Julia language has arisen as a new contender. Originally started in 2012, its founders and developers have (very) high aspirations, wanting the language to be powerful and accessible, while still having run speeds comparable to C. There’s been excitement about it, and even a Nobel Laureate in economics, Thomas Sargent, has endorsed it. He co-founded the QuantEcon project, whose website has this handy guide or cheat sheet for commands between MATLAB, Python and Julia.

That guide suggests that Julia’s main syntax inspiration comes from MATLAB. But perhaps its closest (and greatest) competitor in scientific programming languages is Python, which has become a standard language used in scientific programming, particularly in machine learning. Another competitor is the statistics language R, which is popular for data science. But R is not renown for its speed.

As an aside, machine learning is closely related to what many call data science. I consider the two disciplines as largely overlapping with statistics, where their respective emphases are on theory and practice. In these fields, often the languages Python and R are used. There are various websites discussing which language is better, such as this one, which in turn is based on this one. In general, it appears that computer scientists and statisticians respectively prefer using Python and R.

Returning to the Julia language, given its young age, the language is still very much evolving, but I managed to find suitable Julia functions for stochastic simulations. I thought I would try it out by simulating some point processes, which I have done several times before. I successfully ran all my code with Julia Version 1.0.3.

In short, I managed to replicate in (or even translate to) Julia the code that I presented in the following posts:

Simulating a homogeneous Poisson point process on a rectangle

Simulating a Poisson point process on a disk

Simulating a Poisson point process on a triangle

Simulating an inhomogeneous Poisson point process

Simulating a Matérn cluster point process

Simulating a Thomas cluster point process

The Julia code, like all the code I present here, can be found on my Github repository, which for this post is located here.

Basics

Language type and syntax

The Wikipedia article on Julia says:

Julia is a high-level general-purpose dynamic programming language designed for high-performance numerical analysis and computational science.

Scientific programming languages like the popular three MATLAB, R and Python, are interpreted languages. But the people behind Julia say:

it is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages.

Because Julia’s compiler is different from the interpreters used for languages like Python or R, you may find that Julia’s performance is unintuitive at first.

I already remarked that Julia’s syntax is clearly inspired by MATLAB, as one can see in this guide for MATLAB, Python and Julia. But there are key differences. For example, to access an array entry in Julia, you use square brackets (like in most programming languages), whereas parentheses are used in MATLAB.

Packages

Julia requires you to install certain packages or libraries, like most languages. For random simulations and plots, you have to install the Julia packages Distributions and Plots, which is done by running the code.

Pkg.add("Distributions");
Pkg.add("Plots");

After doing that, it’s best to restart Julia. These packages are loaded with the using command:

Using Distributions;
Using Plots;

Also, the first time it takes a while to run any code using those newly installed packages.

I should stress that there are different plotting libraries, but Plots, which contains many plotting libraries, is the only one I could get working. Another is PlotPy, which uses the Python library. As a beginner, it seems to me that the Julia community has not focused too much on developing new plotting functions, and has instead leveraged pre-existing libraries.

For standard scientific and statistical programming, you will usually also need the packages LinearAlgebra and Statistics.

Data types

Unlike MATLAB or R, Julia is a language that has different data types for numbers, such as integers and floating-point numbers (or floats). This puts Julia in agreement with the clear majority of languages, making it nothing new for most programmers. This is not a criticism of the language, but this can be troublesome if you’ve grown lazy after years of using MATLAB and R.

Simulating random variables

In MATLAB, R and Python, we just need to call a function for simulating uniform, Poisson, and other random variables. There’s usually a function for each type of random variable (or probability distribution).

Julia does simulation of random objects in a more, let’s say, object-oriented way (but I’m told, it’s not an object-oriented language). The probability distributions of random variables are objects, which are created and then sent to a general function for random generation. For example, here’s the code for simulating a Poisson variable with mean \(\mu=10\).

mu=10;
distPoisson=Poisson(mu);
numbPoisson=rand(distPoisson);

Similarly, here’s how to simulate a normal variable with mean \(\mu=10\) and standard deviation \(\sigma=1\).

sigma=1;
distNormal=Normal(mu,sigma);
numbNormal=rand(distNormal);

Of course the last two lines can be collapsed into one.

mu=10;
sigma=1;
numbNormal=rand(Normal(mu,sigma));

But if you just want to create standard uniform variables on the interval (0,1), then the code is like that in MATLAB. For example, this code creates a \(4\times3\) matrix (or array) \(X\) whose entries are simulation outcomes of independent uniform random variables:

X=rand(4,3);

The resulting matrix \(X\) is a Float 64 array.

Arrays

The indexing of arrays in Julia starts at one, like MATLAB and R. When you apply a function to an array, you generally need to use the dot notation. For example, if I try to run the code:

Y=sqrt(rand(10,1)); #This line will result in an error.

then on my machine (with Julia Version 1.0.3) I get the error:

ERROR: DimensionMismatch(“matrix is not square: dimensions are (10, 1)”)

But this code works:

Y=sqrt.(rand(10,1));

Also, adding scalars to arrays can catch you in Julia, as you also often need to use the dot notation. This code:

Y=sqrt.(rand(10,1));
Z=Y+1; #This line will result in an error.

gives the error:

ERROR: MethodError: no method matching +(::Array{Float64,2}, ::Int64)

This is fixed by adding a dot:

Y=sqrt.(rand(10,1));
Z=Y.+1; #This line will work.

Note the dot has to be on the left hand side. I ended up just using dot notation every time to be safe.

Other traps exist. For example, with indexing, you need to convert floats to integers if you want to use them as indices.

Repeating array elements

There used to be a Julia function called repmat, like the one in MATLAB , but it was merged with a function called repeat. I used such repeating operations to avoid explicit for-loops, which is generally advised in languages like MATLAB and R. For example, I used the repelem function in MATLAB to simulate Matérn and Thomas cluster point processes. To do this in Julia, I had to use this nested construction:

y=vcat(fill.(x, n)...);

This line means that the first value in \(x \) is repeated \(n[1]\) times, where \(n[1]\) is the first entry of \(n\) (as indexing in Julia starts at one), then the second value of \(x\) is repeated \(n[2]\) times, and so on. For example, the vectors \(x=[7,4,9]\) and \(n=[2,1,3]\), the answer is \(y=[7,7,4,9,9,9]\).

To do this in Julia, the construction is not so bad, if you know how, but it’s not entirely obvious. In MATLAB I use this:

y=repelem(x,n);

Similarly in Python:

y=np.repeat(x,n);
Different versions of Julia

I found that certain code would work (or not work) and then later the same code would not work (or would work) on machines with different versions of Julia, demonstrating how the language is still being developed. More specifically, I ran code on Julia Version 1.0.3 (Date 2018-12-18) and Julia Version 0.6.4 (Date: 2018-07-09). (Note how there’s only a few months difference in the dates of the two versions.)

Consider the code with the errors (due to the lack of dot operator) in the previous section. The errors occurred on one machine with Julia Version 1.0.3, but the errors didn’t occur on another machine with the older Julia Version 0.6.4. For a specific example, the code:

Y=sqrt.(rand(10,1));
Z=Y+1; #This line will not result in an error on Version 0.6.4.

gives no error with Julia Version 0.6.4, while I have already discussed how it gives an error with Julia Version 1.0.3.

For another example, I copied from this MATLAB-Python-Julia guide the following command:

A = Diagonal([1,2,3]); #This line will (sometimes?) result in an error.

It runs on machine with Julia Version 0.6.4 with no problems. But running it on the machine with Julia Version 1.0.3 gives the error:

ERROR: UndefVarError: Diagonal not defined

That’s because I have not used the LinearAlgebra package. Fixing this, the following code:

using LinearAlgebra; #Package needed for Diagonal command.
A = Diagonal([1,2,3]); #This line should now work.

gives no error with Julia Version 1.0.3.

If you have the time and energy, you can search the internet and find online forums where the Julia developers have discussed why they have changed something, rendering certain code unworkable with the latest versions of Julia.

Optimization

It seems that performing optimization on functions is done with the Optim package.

Pkg.add("Optim");

But some functions need the Linesearches package, so it’s best to install that as well.

Pkg.add("Linesearches");

Despite those two optimization packages, I ended up using yet another package called BlackBoxOptim.

Pkg.add("BlackBoxOptim");

In this package, I used a function called bboptimize. This is the first optimziation function that I managed to get working. I do not know how it compares to the functions in the Optim and Linesearches packages.

In a previous post, I used optimization functions to simulate a inhomogeneous or nonhomogeneous Poisson point process on a rectangle. I’ve also written Julia code for this simulation, which is found below. I used bboptimize, but I had some problems when I initially set the search regions to integers, which the package did not like, as the values need to be floats. That’s why I multiple the rectangle dimensions by \(1.0\) in the following code:

boundSearch=[(1.0xMin,1.0xMax), (1.0yMin, 1.0yMax)]; #bounds for search box
#WARNING: Values of boundSearch cannot be integers!
resultsOpt=bboptimize(fun_Neg;SearchRange = boundSearch);
lambdaNegMin=best_fitness(resultsOpt); #retrieve minimum value found by bboptimize

Conclusion

In this brief experiment, I found the language Julia good for doing stochastic simulations, but too tricky for doing simple things like plotting. Also, depending on the version of Julia, sometimes my code would work and sometimes it wouldn’t. No doubt things will get better with time.

Further reading

As I said, Julia is still very much an ongoing project. Here’s a couple of links that helped me learn the basics.

https://en.wikibooks.org/wiki/Introducing_Julia/Arrays_and_tuples

https://voxeu.org/content/which-numerical-computing-language-best-julia-matlab-python-or-r

Julia, Matlab, and C

https://modelingguru.nasa.gov/docs/DOC-2676

Code

I’ve only posted here code for some of simulations, but the rest of the code is available on my GitHub repository located here. You can see how the code is comparable to that of MATLAB.

Poisson point process on a rectangle

I wrote about this point process here. The code is located here.

using Distributions #for random simulations
using Plots #for plotting

#Simulation window parameters
xMin=0;xMax=1;
yMin=0;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;

#Point process parameters
lambda=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints=rand(Poisson(areaTotal*lambda)); #Poisson number of points
xx=xDelta*rand(numbPoints,1).+xMin;#x coordinates of Poisson points
yy=yDelta*(rand(numbPoints,1)).+yMin;#y coordinates of Poisson points

#Plotting
plot1=scatter(xx,yy,xlabel ="x",ylabel ="y", leg=false);
display(plot1);
Inhomogeneous Poisson point process on a rectangle

I wrote about this point process here. The code is located here.

using Distributions #for random simulations
using Plots #for plotting
using BlackBoxOptim #for blackbox optimizing

#Simulation window parameters
xMin=-1;xMax=1;
yMin=-1;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;

s=0.5; #scale parameter

#Point process parameters
function fun_lambda(x,y)
    100*exp.(-(x.^2+y.^2)/s^2); #intensity function
end

###START -- find maximum lambda -- START ###
#For an intensity function lambda, given by function fun_lambda,
#finds the maximum of lambda in a rectangular region given by
#[xMin,xMax,yMin,yMax].
#NOTE: Need xMin, xMax, yMin, yMax to be floats eg xMax=1. See boundSearch

function fun_Neg(x)
     -fun_lambda(x[1],x[2]); #negative of lambda
end
xy0=[(xMin+xMax)/2.0,(yMin+yMax)/2.0];#initial value(ie centre)

#Find largest lambda value
boundSearch=[(1.0xMin,1.0xMax), (1.0yMin, 1.0yMax)];
#WARNING: Values of boundSearch cannot be integers!
resultsOpt=bboptimize(fun_Neg;SearchRange = boundSearch);
lambdaNegMin=best_fitness(resultsOpt); #retrieve minimum value found by bboptimize
lambdaMax=-lambdaNegMin;
###END -- find maximum lambda -- END ###

#define thinning probability function
function fun_p(x,y)
    fun_lambda(x,y)/lambdaMax;
end

#Simulate a Poisson point process
numbPoints=rand(Poisson(areaTotal*lambdaMax)); #Poisson number of points
xx=xDelta*rand(numbPoints,1).+xMin;#x coordinates of Poisson points
yy=yDelta*(rand(numbPoints,1)).+yMin;#y coordinates of Poisson points

#calculate spatially-dependent thinning probabilities
p=fun_p(xx,yy);
#Generate Bernoulli variables (ie coin flips) for thinning
booleRetained=rand(numbPoints,1).<p; #points to be retained
xxRetained=xx[booleRetained]; yyRetained=yy[booleRetained];

#Plotting
plot1=scatter(xxRetained,yyRetained,xlabel ="x",ylabel ="y", leg=false);
display(plot1);
Thomas point process on a rectangle

I wrote about this point process here. The code is located here.


using Distributions #for random simulations
using Plots #for plotting

#Simulation window parameters
xMin=-.5;
xMax=.5;
yMin=-.5;
yMax=.5;

#Parameters for the parent and daughter point processes
lambdaParent=10;#density of parent Poisson point process
lambdaDaughter=10;#mean number of points in each cluster
sigma=0.05; #sigma for normal variables (ie random locations) of daughters

#Extended simulation windows parameters
rExt=7*sigma; #extension parameter
#for rExt, use factor of deviation sigma eg 6 or 7
xMinExt=xMin-rExt;
xMaxExt=xMax+rExt;
yMinExt=yMin-rExt;
yMaxExt=yMax+rExt;
#rectangle dimensions
xDeltaExt=xMaxExt-xMinExt;
yDeltaExt=yMaxExt-yMinExt;
areaTotalExt=xDeltaExt*yDeltaExt; #area of extended rectangle

#Simulate Poisson point process
numbPointsParent=rand(Poisson(areaTotalExt*lambdaParent)); #Poisson number of points

#x and y coordinates of Poisson points for the parent
xxParent=xMinExt.+xDeltaExt*rand(numbPointsParent,1);
yyParent=yMinExt.+yDeltaExt*rand(numbPointsParent,1);

#Simulate Poisson point process for the daughters (ie final poiint process)
numbPointsDaughter=rand(Poisson(lambdaDaughter),numbPointsParent);
numbPoints=sum(numbPointsDaughter); #total number of points

#Generate the (relative) locations in Cartesian coordinates by
#simulating independent normal variables
xx0=rand(Normal(0,sigma),numbPoints);
yy0=rand(Normal(0,sigma),numbPoints);

#replicate parent points (ie centres of disks/clusters)
xx=vcat(fill.(xxParent, numbPointsDaughter)...);
yy=vcat(fill.(yyParent, numbPointsDaughter)...);

#Shift centre of disk to (xx0,yy0)
xx=xx.+xx0;
yy=yy.+yy0;

#thin points if outside the simulation window
booleInside=((xx.>=xMin).&(xx.<=xMax).&(yy.>=yMin).&(yy.<=yMax));
#retain points inside simulation window
xx=xx[booleInside];
yy=yy[booleInside];

#Plotting
plot1=scatter(xx,yy,xlabel ="x",ylabel ="y", leg=false);
display(plot1);

Simulating an inhomogeneous Poisson point process

In previous posts I described how to simulate homogeneous Poisson point processes on a rectangledisk and triangle. But here I will simulate an inhomogeneous or nonhomogeneous Poisson point process. (Both of these terms are used, where the latter is probably more popular, but I prefer the former.) For such a point process, the points are not uniformly located on the underlying mathematical space on which the Poisson process is defined. This means that certain regions  will, on average, tend to have more (or less) points than other regions of the underlying space.

Basics

Any Poisson point process is defined with a non-negative measure called the intensity or mean measure. I make the standard assumption that the intensity measure \(\Lambda\) has a derivative \(\lambda(x,y)\).  (I usually write a single \(x\) to denote a point on the plane, that is \(x\in \mathbb{R}^2\), but in this post I will write the \(x\) and \(y\) and coordinates separately.) The function \(\lambda(x,y)\) is often called the intensity function or just intensity, which I further assume is bounded, so \(\lambda(x,y)<\infty\) for all points in a simulation window \(W\). Finally, I assume that the simulation window \(W\) is a rectangle, but later I describe how to lift that assumption.

Number of points

To simulate a point process,  the number of points and the point locations  in the simulation window \(W\) are needed.  For any Poisson point process, the number of points is a Poisson random variable with a parameter (that is, a mean) \(\Lambda(W)\), which under our previous assumptions is given by the integral

$$\Lambda(W)=\int_W \lambda(x,y)dxdy. $$

Assuming we can evaluate such an integral analytically or numerically, then the number of points is clearly not difficult to simulate.

Locations of points

The difficulty lies in randomly positioning the points. But a defining property of the Poisson point process is its independence, which allows us to treat each point completely independently.  Positioning each point then comes down to suitably simulating two (or more) random variables for Poisson  point processes in two (or higher) dimensions.  Similarly, the standard methods used for simulating continuous random variables can be applied to simulating random point locations of a Poisson point process.

In theory, you can rescale the intensity function with the total measure of the simulation window, giving

$$f(x,y):=\frac{\lambda(x,y)}{\Lambda(W)}. $$

We can then interpret this rescaled intensity function \(f(x,y)\)  as the joint probability density of two random variables \(X\) and \(Y\), because it integrates to one,

$$\int_W f(x,y)dxdy=1.$$

Clearly the method for simulating an inhomogeneous Poisson point process depends on the nature of the intensity function. For the inhomogeneous case, the random variables \(X\) and \(Y\) are, in general, not independent.

Transformation

To simulate an inhomogeneous Poisson point process, one method is to first simulate a homogeneous one, and then suitably transform the points according to deterministic function. For simple random variables, this transformation method is quick and easy to implement, if we can invert the probability distribution. For example, a uniform random variable \(U\) defined on the interval \((0,1)\) can be used to give an exponential random variable by applying the transformation \(h(u)= -(1/\lambda)\log(u)\), where \(\lambda>0\), meaning \(h(U)\) is an exponential random variable with parameter \(\lambda>0\) (or mean \(1/\lambda\)).

Similarly for Poisson point processes, the transformation approach is fairly straightforward in a one-dimensional setting, but generally doesn’t work easily in two (or higher) dimensions. The reason being that often we cannot simulate the random variables \(X\) and \(Y\) independently, which means, in practice, we need first to simulate one random variable, then the other.

It is a bit easier if we can re-write the rescaled intensity function or joint probability density \(f(x,y)\) as a product of single-variable functions \(f_X(x)\) and \(f_Y(y)\), meaning the random variables \(X\) and \(Y\) are independent. We can then simulate independently the random variables \(X\) and \(Y\), corresponding to the \(x\) and \(y\) coordinates of the points. But this would still require integrating and inverting the functions.

Markov chain Monte Carlo

A now standard way to simulate jointly distributed random variables is to use  Markov chain Monte Carlo (MCMC), which we can also use to simulate the the \(X\) and \(Y\) random variables. Applying MCMC methods is simply applying random  point process operations repeatedly to all the points. But this is a bit too tricky and involved. Instead I’ll use a general yet simpler method based on thinning.

Thinning

The thinning method is the arguably the simplest and most general way to simulate an inhomogeneous Poisson point process. If you’re unfamiliar with thinning, I recommend my previous post on thinning and the material I cite.

This simulation method is simply a type of acceptance-rejection method for simulating random variables. More specifically, it is the acceptance-rejection or rejection method, attributed to the great John von Neumann, for simulating a continuous random variable, say \(X\), with some known probability density \(f(x)\). The method accepts/retains or rejects/thins the outcome of each random variable/point depending on the outcome of a uniform random variable associated with each random variable/point.

The thinning or acceptance-rejection method is also appealing because it is an example of a perfect simulation method, which means the distribution of the simulated random variables or points will not be an approximation. This can be contrasted with typical MCMC methods, which, in theory, reach the desired distribution of the random variables in infinite time, which is clearly not possible in practice.

Simulating the homogeneous Poisson point process

First simulate a homogeneous Poisson point process with intensity value \(\lambda^*\), which is an upper bound of the intensity function \(\lambda(x,y)\). The simulation step is the easy part, but what value is \(\lambda^*\)?

I will use the maximum value that the intensity function \(\lambda(x,y)\) takes, which I denote by

$$ \lambda_{\max}:=\max_{(x,y)\in W}\lambda(x,y),$$

so I set \(\lambda^*=\lambda_{\max}\). Of course with \(\lambda^*\) being an upper bound, you can use any larger \(\lambda\)-value, so \(\lambda^*\geq \lambda_{\max}\), but that just means more points will need to be thinned.

Scientific programming languages  have implemented algorithms that find or estimate minima of mathematical functions, meaning such an algorithm just needs to find the \((x,y)\) point that gives the minimum value of \(-\lambda(x,y)\), which corresponds to the maximum value of \(\lambda(x,y)\). What is very important is that the minimization procedure can handle constraints on the \(x\) and \(y\) values, which in our case of a rectangular simulation window \(W\) are sometimes called box constraints.

Thinning the Poisson point process

All we need to do now is to thin the homogeneous Poisson point process with the thinning probability function

$$1-p(x,y)=\frac{\lambda(x,y)}{\lambda^*}.$$

This will randomly remove the points  so the remaining points will form a inhomogeneous Poisson point process with intensity function
$$ (1-p(x,y))\lambda^* =\lambda(x).$$
As a result, we can see that provided \(\lambda^*\geq \lambda_{\max}>0\), this procedure will give the right intensity function \(\lambda(x,y)\).  I’ll skip the details on the point process still being Poisson after thinning, as I have already covered this in the thinning post.

Empirical check

You can run an empirical check by simulating the point process a large number (say \(10^3\) or \(10^4\)) of times, and collect statistics on the number of points. As the number of simulations increases, the average number of points should converge to the intensity measure \(\Lambda(W)\), which is given by (perhaps numerically) evaluating the integral

$$\Lambda(W)=\int_W \lambda(x,y)dxdy.$$

This is a test for the intensity measure, a type of first moment, which will work for the intensity measure of any point process. But for Poisson point processes, the variance of the number of points will also converge to intensity measure \(\Lambda(W)\), giving a second empirical test based on second moments.

An even more thorough test would be estimating an empirical distribution (that is, performing and normalizing a histogram) on the number of points. These checks will validate the number of points, but not the positioning of the points. In my next post I’ll cover how to perform these tests.

Results

The homogeneous Poisson point process with intensity function \(\lambda(x)=100\exp(-(x^2+y^2)/s^2)\), where \(s=0.5\).  The results look similar to those in the thinning post, where the thinned points (that is, red circles) are generated from the same Poisson point process as the one that I have presented here.

MATLAB

Python

Method extensions

We can extend the thinning method for simulating inhomogeneous Poisson point processes a couple different ways.

Using an inhomogeneous Poisson point process

The thinning method does not need to be applied to a homogeneous Poisson point process with intensity \(\lambda^*\). In theory,  we could have simulated a suitably inhomogeneous Poisson point process with intensity function \(\lambda^*(x,y)\), which has the condition

$$ \lambda^*(x,y)\geq \lambda(x,y), \quad \forall (x,y)\in W .$$

Then this Poisson point process is thinned. But then we would still need to simulate the underlying Poisson point process, which often would be as difficult to simulate.

Partitioning the simulation window

Perhaps the intensity of the Poisson point process only takes two values, \(\lambda_1\) and \(\lambda_2\), and the simulation window \(W\) can be nicely divided or partitioned into two disjoints sets \(B_1\) and \(B_2\) (that is, \(B_1\cap B_2=\emptyset\) and \(B_1\cup B_2=W\)), corresponding to the subregions of the two different intensity values.  The Poisson independence property allows us to simulate two independent Poisson point processes on the two subregions.

This approach only works for a piecewise constant intensity function. But if if the intensity function \(\lambda(x)\) varies wildly, the simulation window can be partitioned into subregions \(B_1\dots,B_m\) for different ranges of the intensity function \(\lambda(x)\).  This allows us to simulate independent homogeneous Poisson point processes with different densities \(\lambda^*_1\dots, \lambda^*_m\), where for each subregion \(B_i\) we set

$$ \lambda^*_i:=\max_{x\in B_i}\lambda(x,y).$$

The resulting Poisson point processes are then suitably thinned, resulting in a more efficient simulation method. (Although I imagine the gain would often be small.)

Non-rectangular simulation windows

If you want to simulate on non-rectangular regions, which is not a disk or triangle, then the easiest way is to simulate a Poisson point process on a rectangle \(R\) that completely covers the simulation window, so \(W \subset R\subset \mathbb{R}^2\), and then set the intensity function \(\lambda \) to zero for the region outside the simulation window \(W\), that is \(\lambda (x,y)=0\) when \((x,y)\in R\setminus W\).

Further reading

In Section 2.5.2 of Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke, there is an outline of the thinning method that I used.  The same simulation section appears in the previous edition by Kendall and Mecke, though these books in general have little material on simulation methods.

More details on the thinning method and its connection to acceptance-rejection sampling are given in Section 2.3 of the applications-oriented book Poisson Point Processes by Streit. The acceptance-rejection method is covered in, for example, books on Monte Carlo methods, including Monte Carlo Strategies in Scientific Computing by Liu (in Section 2.2 )and Monte Carlo Methods in Financial Engineering by Glasserman (in Section 2.2.2).  This method and others for simulating generals random variable are covered in stochastic simulation books such as Uniform Random Variate Generation by Devroye and Stochastic Simulation: Algorithms and Analysis by Asmussen and Glynn.

Kroese and Botev have a good introduction to stochastic simulation in the edited collection Stochastic Geometry, Spatial Statistics and Random Fields : Models and Algorithms by Schmidt, where the relevant chapter (number 12) is also freely available online.  And of course there are lectures notes on the internet that cover simulation material.

Code

All code from my posts, as always, can be found on the my GitHub repository. The code for this post is located here.  You can see that the code is very similar to that of the thinning code, which served as the foundation for this code. (Note how we now keep the points, so in the code the > has become  < on the line where the uniform variables are generated).

I have implemented the code in MATLAB and Python with an intensity function \(\lambda(x,y)=100\exp(-(x^2+y^2)/s^2)\), where \(s>0\) is a scale parameter.  Note that in the minimization step, the box constraints are expressed differently in MATLAB and Python: MATLAB first takes the minimum values then the maximum values, whereas Python first takes the \(x\)-values then the \(y\)-values.

The code presented here does not have the empirical check, which I described above, but it is implemented in the code located here.  For the parameters used in the code, the total measure is \(\Lambda(W)\approx 77.8068\), meaning each simulation will generate on average almost seventy-eight points.

I have stopped writing code in R for a couple of reasons, but mostly because anything I could think of simulating in R can already be done in the spatial statistics library spatstat. I recommend the book Spatial Point Patterns, co-authored by the spatstat’s main contributor, Adrian Baddeley.

MATLAB

I have used the fmincon function to find the point that gives the minimum of \(-\lambda(x,y)\).

%Simulation window parameters
xMin=-1;xMax=1;
yMin=-1;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; %rectangle dimensions
areaTotal=xDelta*yDelta; %area of rectangle

s=0.5; %scale parameter

%Point process parameters
fun_lambda=@(x,y)(100*exp(-((x).^2+(y).^2)/s^2));%intensity function

%%%START -- find maximum lambda -- START %%%
%For an intensity function lambda, given by function fun_lambda,
%finds the maximum of lambda in a rectangular region given by
%[xMin,xMax,yMin,yMax].
funNeg=@(x)(-fun_lambda(x(1),x(2))); %negative of lambda
%initial value(ie centre)
xy0=[(xMin+xMax)/2,(yMin+yMax)/2];%initial value(ie centre)
%Set up optimization step
options=optimoptions('fmincon','Display','off');
%Find largest lambda value
[~,lambdaNegMin]=fmincon(funNeg,xy0,[],[],[],[],...
[xMin,yMin],[xMax,yMax],'',options);
lambdaMax=-lambdaNegMin;
%%%END -- find maximum lambda -- END%%%

%define thinning probability function
fun_p=@(x,y)(fun_lambda(x,y)/lambdaMax);

%Simulate Poisson point process
numbPoints=poissrnd(areaTotal*lambdaMax);%Poisson number of points
xx=xDelta*(rand(numbPoints,1))+xMin;%x coordinates of Poisson points
yy=xDelta*(rand(numbPoints,1))+yMin;%y coordinates of Poisson points

%calculate spatially-dependent thinning probabilities
p=fun_p(xx,yy);

%Generate Bernoulli variables (ie coin flips) for thinning
booleRetained=rand(numbPoints,1)<p; %points to be thinned

%x/y locations of retained points
xxRetained=xx(booleRetained); yyRetained=yy(booleRetained);

%Plotting
plot(xxRetained,yyRetained,'bo'); %plot retained points
xlabel('x');ylabel('y');

The box constraints for the optimization step were expressed as:

[xMin,yMin],[xMax,yMax]
Python

I have used the minimize function in SciPy.

import numpy as np; #NumPy package for arrays, random number generation, etc
import matplotlib.pyplot as plt #For plotting
from scipy.optimize import minimize #For optimizing
from scipy import integrate

#Simulation window parameters
xMin=-1;xMax=1;
yMin=-1;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;

s=0.5; #scale parameter

#Point process parameters
def fun_lambda(x,y):
   return 100*np.exp(-(x**2+y**2)/s**2); #intensity function

###START -- find maximum lambda -- START ###
#For an intensity function lambda, given by function fun_lambda,
#finds the maximum of lambda in a rectangular region given by
#[xMin,xMax,yMin,yMax].
def fun_Neg(x):
   return -fun_lambda(x[0],x[1]); #negative of lambda

xy0=[(xMin+xMax)/2,(yMin+yMax)/2];#initial value(ie centre)
#Find largest lambda value
resultsOpt=minimize(fun_Neg,xy0,bounds=((xMin, xMax), (yMin, yMax)));
lambdaNegMin=resultsOpt.fun; #retrieve minimum value found by minimize
lambdaMax=-lambdaNegMin;
###END -- find maximum lambda -- END ###

#define thinning probability function
def fun_p(x,y):
   return fun_lambda(x,y)/lambdaMax;

#Simulate a Poisson point process
numbPoints = np.random.poisson(lambdaMax*areaTotal);#Poisson number of points
xx = np.random.uniform(0,xDelta,((numbPoints,1)))+xMin;#x coordinates of Poisson points
yy = np.random.uniform(0,yDelta,((numbPoints,1)))+yMin;#y coordinates of Poisson points

#calculate spatially-dependent thinning probabilities
p=fun_p(xx,yy);

#Generate Bernoulli variables (ie coin flips) for thinning
booleRetained=np.random.uniform(0,1,((numbPoints,1)))<p; #points to be thinned

#x/y locations of retained points
xxRetained=xx[booleRetained]; yyRetained=yy[booleRetained];

#Plotting
plt.scatter(xxRetained,yyRetained, edgecolor='b', facecolor='none', alpha=0.5 );
plt.xlabel("x"); plt.ylabel("y");
plt.show();

The box constraints  were expressed as:

(xMin, xMax), (yMin, yMax)
Julia

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.