Testing the Julia language with point process simulations

I started writing these posts (or blog entries) about a year ago. In my first post I remarked how I wanted to learn to write stochastic simulations in a new language. Well, I found one. It’s called Julia. Here’s my code. And here are my thoughts.

Overview

For scientific programming, the Julia language has arisen as a new contender. Originally started in 2012, its founders and developers have (very) high aspirations, wanting the language to be powerful and accessible, while still having run speeds comparable to C. There’s been excitement about it, and even a Nobel Laureate in economics, Thomas Sargent, has endorsed it. He co-founded the QuantEcon project, whose website has this handy guide or cheat sheet for commands between MATLAB, Python and Julia.

That guide suggests that Julia’s main syntax inspiration comes from MATLAB. But perhaps its closest (and greatest) competitor in scientific programming languages is Python, which has become a standard language used in scientific programming, particularly in machine learning. Another competitor is the statistics language R, which is popular for data science. But R is not renown for its speed.

As an aside, machine learning is closely related to what many call data science. I consider the two disciplines as largely overlapping with statistics, where their respective emphases are on theory and practice. In these fields, often the languages Python and R are used. There are various websites discussing which language is better, such as this one, which in turn is based on this one. In general, it appears that computer scientists and statisticians respectively prefer using Python and R.

Returning to the Julia language, given its young age, the language is still very much evolving, but I managed to find suitable Julia functions for stochastic simulations. I thought I would try it out by simulating some point processes, which I have done several times before. I successfully ran all my code with Julia Version 1.0.3.

In short, I managed to replicate in (or even translate to) Julia the code that I presented in the following posts:

Simulating a homogeneous Poisson point process on a rectangle

Simulating a Poisson point process on a disk

Simulating a Poisson point process on a triangle

Simulating an inhomogeneous Poisson point process

Simulating a Matérn cluster point process

Simulating a Thomas cluster point process

The Julia code, like all the code I present here, can be found on my Github repository, which for this post is located here.

Basics

Language type and syntax

The Wikipedia article on Julia says:

Julia is a high-level general-purpose dynamic programming language designed for high-performance numerical analysis and computational science.

Scientific programming languages like the popular three MATLAB, R and Python, are interpreted languages. But the people behind Julia say:

it is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages.

Because Julia’s compiler is different from the interpreters used for languages like Python or R, you may find that Julia’s performance is unintuitive at first.

I already remarked that Julia’s syntax is clearly inspired by MATLAB, as one can see in this guide for MATLAB, Python and Julia. But there are key differences. For example, to access an array entry in Julia, you use square brackets (like in most programming languages), whereas parentheses are used in MATLAB.

Packages

Julia requires you to install certain packages or libraries, like most languages. For random simulations and plots, you have to install the Julia packages Distributions and Plots, which is done by running the code.

Pkg.add("Distributions");
Pkg.add("Plots");

After doing that, it’s best to restart Julia. These packages are loaded with the using command:

Using Distributions;
Using Plots;

Also, the first time it takes a while to run any code using those newly installed packages.

I should stress that there are different plotting libraries, but Plots, which contains many plotting libraries, is the only one I could get working. Another is PlotPy, which uses the Python library. As a beginner, it seems to me that the Julia community has not focused too much on developing new plotting functions, and has instead leveraged pre-existing libraries.

For standard scientific and statistical programming, you will usually also need the packages LinearAlgebra and Statistics.

Data types

Unlike MATLAB or R, Julia is a language that has different data types for numbers, such as integers and floating-point numbers (or floats). This puts Julia in agreement with the clear majority of languages, making it nothing new for most programmers. This is not a criticism of the language, but this can be troublesome if you’ve grown lazy after years of using MATLAB and R.

Simulating random variables

In MATLAB, R and Python, we just need to call a function for simulating uniform, Poisson, and other random variables. There’s usually a function for each type of random variable (or probability distribution).

Julia does simulation of random objects in a more, let’s say, object-oriented way (but I’m told, it’s not an object-oriented language). The probability distributions of random variables are objects, which are created and then sent to a general function for random generation. For example, here’s the code for simulating a Poisson variable with mean \(\mu=10\).

mu=10;
distPoisson=Poisson(mu);
numbPoisson=rand(distPoisson);

Similarly, here’s how to simulate a normal variable with mean \(\mu=10\) and standard deviation \(\sigma=1\).

sigma=1;
distNormal=Normal(mu,sigma);
numbNormal=rand(distNormal);

Of course the last two lines can be collapsed into one.

mu=10;
sigma=1;
numbNormal=rand(Normal(mu,sigma));

But if you just want to create standard uniform variables on the interval (0,1), then the code is like that in MATLAB. For example, this code creates a \(4\times3\) matrix (or array) \(X\) whose entries are simulation outcomes of independent uniform random variables:

X=rand(4,3);

The resulting matrix \(X\) is a Float 64 array.

Arrays

The indexing of arrays in Julia starts at one, like MATLAB and R. When you apply a function to an array, you generally need to use the dot notation. For example, if I try to run the code:

Y=sqrt(rand(10,1)); #This line will result in an error.

then on my machine (with Julia Version 1.0.3) I get the error:

ERROR: DimensionMismatch(“matrix is not square: dimensions are (10, 1)”)

But this code works:

Y=sqrt.(rand(10,1));

Also, adding scalars to arrays can catch you in Julia, as you also often need to use the dot notation. This code:

Y=sqrt.(rand(10,1));
Z=Y+1; #This line will result in an error.

gives the error:

ERROR: MethodError: no method matching +(::Array{Float64,2}, ::Int64)

This is fixed by adding a dot:

Y=sqrt.(rand(10,1));
Z=Y.+1; #This line will work.

Note the dot has to be on the left hand side. I ended up just using dot notation every time to be safe.

Other traps exist. For example, with indexing, you need to convert floats to integers if you want to use them as indices.

Repeating array elements

There used to be a Julia function called repmat, like the one in MATLAB , but it was merged with a function called repeat. I used such repeating operations to avoid explicit for-loops, which is generally advised in languages like MATLAB and R. For example, I used the repelem function in MATLAB to simulate Matérn and Thomas cluster point processes. To do this in Julia, I had to use this nested construction:

y=vcat(fill.(x, n)...);

This line means that the first value in \(x \) is repeated \(n[1]\) times, where \(n[1]\) is the first entry of \(n\) (as indexing in Julia starts at one), then the second value of \(x\) is repeated \(n[2]\) times, and so on. For example, the vectors \(x=[7,4,9]\) and \(n=[2,1,3]\), the answer is \(y=[7,7,4,9,9,9]\).

To do this in Julia, the construction is not so bad, if you know how, but it’s not entirely obvious. In MATLAB I use this:

y=repelem(x,n);

Similarly in Python:

y=np.repeat(x,n);
Different versions of Julia

I found that certain code would work (or not work) and then later the same code would not work (or would work) on machines with different versions of Julia, demonstrating how the language is still being developed. More specifically, I ran code on Julia Version 1.0.3 (Date 2018-12-18) and Julia Version 0.6.4 (Date: 2018-07-09). (Note how there’s only a few months difference in the dates of the two versions.)

Consider the code with the errors (due to the lack of dot operator) in the previous section. The errors occurred on one machine with Julia Version 1.0.3, but the errors didn’t occur on another machine with the older Julia Version 0.6.4. For a specific example, the code:

Y=sqrt.(rand(10,1));
Z=Y+1; #This line will not result in an error on Version 0.6.4.

gives no error with Julia Version 0.6.4, while I have already discussed how it gives an error with Julia Version 1.0.3.

For another example, I copied from this MATLAB-Python-Julia guide the following command:

A = Diagonal([1,2,3]); #This line will (sometimes?) result in an error.

It runs on machine with Julia Version 0.6.4 with no problems. But running it on the machine with Julia Version 1.0.3 gives the error:

ERROR: UndefVarError: Diagonal not defined

That’s because I have not used the LinearAlgebra package. Fixing this, the following code:

using LinearAlgebra; #Package needed for Diagonal command.
A = Diagonal([1,2,3]); #This line should now work.

gives no error with Julia Version 1.0.3.

If you have the time and energy, you can search the internet and find online forums where the Julia developers have discussed why they have changed something, rendering certain code unworkable with the latest versions of Julia.

Optimization

It seems that performing optimization on functions is done with the Optim package.

Pkg.add("Optim");

But some functions need the Linesearches package, so it’s best to install that as well.

Pkg.add("Linesearches");

Despite those two optimization packages, I ended up using yet another package called BlackBoxOptim.

Pkg.add("BlackBoxOptim");

In this package, I used a function called bboptimize. This is the first optimziation function that I managed to get working. I do not know how it compares to the functions in the Optim and Linesearches packages.

In a previous post, I used optimization functions to simulate a inhomogeneous or nonhomogeneous Poisson point process on a rectangle. I’ve also written Julia code for this simulation, which is found below. I used bboptimize, but I had some problems when I initially set the search regions to integers, which the package did not like, as the values need to be floats. That’s why I multiple the rectangle dimensions by \(1.0\) in the following code:

boundSearch=[(1.0xMin,1.0xMax), (1.0yMin, 1.0yMax)]; #bounds for search box
#WARNING: Values of boundSearch cannot be integers!
resultsOpt=bboptimize(fun_Neg;SearchRange = boundSearch);
lambdaNegMin=best_fitness(resultsOpt); #retrieve minimum value found by bboptimize

Conclusion

In this brief experiment, I found the language Julia good for doing stochastic simulations, but too tricky for doing simple things like plotting. Also, depending on the version of Julia, sometimes my code would work and sometimes it wouldn’t. No doubt things will get better with time.

Further reading

As I said, Julia is still very much an ongoing project. Here’s a couple of links that helped me learn the basics.

https://en.wikibooks.org/wiki/Introducing_Julia/Arrays_and_tuples

https://voxeu.org/content/which-numerical-computing-language-best-julia-matlab-python-or-r

Julia, Matlab, and C

https://modelingguru.nasa.gov/docs/DOC-2676

Code

I’ve only posted here code for some of simulations, but the rest of the code is available on my GitHub repository located here. You can see how the code is comparable to that of MATLAB.

Poisson point process on a rectangle

I wrote about this point process here. The code is located here.

using Distributions #for random simulations
using Plots #for plotting

#Simulation window parameters
xMin=0;xMax=1;
yMin=0;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;

#Point process parameters
lambda=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints=rand(Poisson(areaTotal*lambda)); #Poisson number of points
xx=xDelta*rand(numbPoints,1).+xMin;#x coordinates of Poisson points
yy=yDelta*(rand(numbPoints,1)).+yMin;#y coordinates of Poisson points

#Plotting
plot1=scatter(xx,yy,xlabel ="x",ylabel ="y", leg=false);
display(plot1);
Inhomogeneous Poisson point process on a rectangle

I wrote about this point process here. The code is located here.

using Distributions #for random simulations
using Plots #for plotting
using BlackBoxOptim #for blackbox optimizing

#Simulation window parameters
xMin=-1;xMax=1;
yMin=-1;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;

s=0.5; #scale parameter

#Point process parameters
function fun_lambda(x,y)
    100*exp.(-(x.^2+y.^2)/s^2); #intensity function
end

###START -- find maximum lambda -- START ###
#For an intensity function lambda, given by function fun_lambda,
#finds the maximum of lambda in a rectangular region given by
#[xMin,xMax,yMin,yMax].
#NOTE: Need xMin, xMax, yMin, yMax to be floats eg xMax=1. See boundSearch

function fun_Neg(x)
     -fun_lambda(x[1],x[2]); #negative of lambda
end
xy0=[(xMin+xMax)/2.0,(yMin+yMax)/2.0];#initial value(ie centre)

#Find largest lambda value
boundSearch=[(1.0xMin,1.0xMax), (1.0yMin, 1.0yMax)];
#WARNING: Values of boundSearch cannot be integers!
resultsOpt=bboptimize(fun_Neg;SearchRange = boundSearch);
lambdaNegMin=best_fitness(resultsOpt); #retrieve minimum value found by bboptimize
lambdaMax=-lambdaNegMin;
###END -- find maximum lambda -- END ###

#define thinning probability function
function fun_p(x,y)
    fun_lambda(x,y)/lambdaMax;
end

#Simulate a Poisson point process
numbPoints=rand(Poisson(areaTotal*lambdaMax)); #Poisson number of points
xx=xDelta*rand(numbPoints,1).+xMin;#x coordinates of Poisson points
yy=yDelta*(rand(numbPoints,1)).+yMin;#y coordinates of Poisson points

#calculate spatially-dependent thinning probabilities
p=fun_p(xx,yy);
#Generate Bernoulli variables (ie coin flips) for thinning
booleRetained=rand(numbPoints,1).<p; #points to be retained
xxRetained=xx[booleRetained]; yyRetained=yy[booleRetained];

#Plotting
plot1=scatter(xxRetained,yyRetained,xlabel ="x",ylabel ="y", leg=false);
display(plot1);
Thomas point process on a rectangle

I wrote about this point process here. The code is located here.

using Distributions #for random simulations
using Plots #for plotting

#Simulation window parameters
xMin=-.5;
xMax=.5;
yMin=-.5;
yMax=.5;

#Parameters for the parent and daughter point processes
lambdaParent=10;#density of parent Poisson point process
lambdaDaughter=10;#mean number of points in each cluster
sigma=0.05; #sigma for normal variables (ie random locations) of daughters

#Extended simulation windows parameters
rExt=7*sigma; #extension parameter
#for rExt, use factor of deviation sigma eg 6 or 7
xMinExt=xMin-rExt;
xMaxExt=xMax+rExt;
yMinExt=yMin-rExt;
yMaxExt=yMax+rExt;
#rectangle dimensions
xDeltaExt=xMaxExt-xMinExt;
yDeltaExt=yMaxExt-yMinExt;
areaTotalExt=xDeltaExt*yDeltaExt; #area of extended rectangle

#Simulate Poisson point process
numbPointsParent=rand(Poisson(areaTotalExt*lambdaParent)); #Poisson number of points

#x and y coordinates of Poisson points for the parent
xxParent=xMinExt.+xDeltaExt*rand(numbPointsParent,1);
yyParent=yMinExt.+yDeltaExt*rand(numbPointsParent,1);

#Simulate Poisson point process for the daughters (ie final poiint process)
numbPointsDaughter=rand(Poisson(lambdaDaughter),numbPointsParent);
numbPoints=sum(numbPointsDaughter); #total number of points

#Generate the (relative) locations in Cartesian coordinates by
#simulating independent normal variables
xx0=rand(Normal(0,sigma),numbPoints);
yy0=rand(Normal(0,sigma),numbPoints);

#replicate parent points (ie centres of disks/clusters)
xx=vcat(fill.(xxParent, numbPointsDaughter)...);
yy=vcat(fill.(yyParent, numbPointsDaughter)...);

#Shift centre of disk to (xx0,yy0)
xx=xx.+xx0;
yy=yy.+yy0;

#thin points if outside the simulation window
booleInside=((xx.>=xMin).&(xx.<=xMax).&(yy.>=yMin).&(yy.<=yMax));
#retain points inside simulation window
xx=xx[booleInside];
yy=yy[booleInside];

#Plotting
plot1=scatter(xx,yy,xlabel ="x",ylabel ="y", leg=false);
display(plot1);

Simulating a Matérn cluster point process

A Matérn cluster point process is a type of cluster point process, meaning that its randomly located points tend to form random clusters. (I skip the details here, but by using techniques from spatial statistics, it is possible to make the definition of clustering more precise.) This point process is an example of a family of cluster point processes known as Neyman-Scott point processes, which have been used in spatial statistics and telecommunications.

I should point out that the Matérn cluster point process should not be confused with the Matérn hard-core point process, which is a completely different type of point process. (For a research article, I have actually written code in MATLAB that simulates this type of point process.) Bertril Matérn proposed at least four types of point processes, and his name also refers to a specific type of covariance function used to define Gaussian processes.

Overview

Simulating a Matérn cluster point process requires first simulating  a homogeneous Poisson point process with an intensity \(\lambda>0\) on some simulation window, such as a rectangle, which is the simulation window I will use here. Then for each point of this underlying point process, simulate a Poisson number of points with mean \(\mu>0\) uniformly on a disk with a constant radius \(r>0\). The underlying  point process is sometimes called the parent (point) process, and its points are centres of the cluster disks.

The subsequent point process on all the disks is called daughter (point) process and it forms the clusters. I have already written about simulating the homogeneous Poisson point processes on a rectangle and a disk, so those posts are good starting points, and I will not focus  too much on details for these steps.

Edge effects

The main challenge behind sampling this point process, which I originally forgot about in an earlier version of this post, is that it’s possible for daughter points to appear in the simulation window that come from parents points outside the simulation window. In other words, parents points outside the simulation window contribute to points inside the window.

To remove these edge effects, the point processes must be simulated on an extended version of the simulation window. Then only the daughter points within the simulation window are kept and the rest are removed.  Consequently, the points are simulated on an extended window, but we only see the points inside the simulation window.

To create the extended simulation window, we can add a strip of width \(r\) all around the simulation window. Why? Well, the distance \(r\) is the maximum distance from the simulation window that a possibly contributing parent point (outside the simulation window) can exist, while still having daughter points inside the simulation window. This means it is impossible for a hypothetical parent point beyond this distance (outside the extended window) to generate a daughter point that can fall inside the simulation window. 

Steps

Number of points

Simulate the underlying or parent Poisson point process on the rectangle with \(N_P\) points. Then for each point, simulate a Poisson number of points, where each disk now has \(D_i\) number of points. Then the total number of points is simply \(N=D_1+\dots +D_{P}=\sum_{i=1}^{N_P}D_i \). The random variables \(P\) and \(D_i\) are Poisson random variables with respective means \(\lambda A\) and \(\mu\), where \(A\) is the area of the rectangular simulation window. To simulate these random variables in MATLAB, use the poissrnd function. To do this in R, use the  standard function rpois. In Python, we can use either functions scipy.stats.poisson or numpy.random.poisson from the SciPy or NumPy libraries.

Locations of points

The points of the parent point process are randomly positioned by using Cartesian coordinates. For a homogeneous Poisson point process, the \(x\) and \(y\) coordinates of each point are independent uniform points, which is also the case for the binomial point process, covered in a previous post. The points of all the daughter point process are randomly positioned by using polar  coordinates. For a homogeneous Poisson point process, the \(\theta\) and \(\rho\) coordinates of each point are independent  variables,  respectively with uniform and triangle distributions, which was covered in a previous post. Then we  convert coordinates back to Cartesian form, which is easily done in MATLAB with the pol2cart function. In languages without such a function: \(x=\rho\cos(\theta)\) and \(y=\rho\sin(\theta)\).

Shifting all the points in each cluster disk

In practice (that is, in the code), all the daughter points are simulated in a disk with its centre at the origin. Then for each cluster disk, all the points have to be shifted to the origin is the center of the cluster, which completes the simulation step.

To use vectorization in the code,  the coordinates of each cluster point are repeated by the number of daughters in the corresponding cluster by using the functions repelem in MATLAB, rep in R, and repeat in Python. 

Code

I’ll now give some code in MATLAB, R and Python, which you can see are all very similar. It’s also located here.

MATLAB

The MATLAB code is located here.

R

The R code is located here.

Of course, as I have mentioned before, simulating a spatial point processes in R is even easier with the powerful spatial statistics library spatstat. The Matérn cluster point process is simulated by using the function rMatClust, but other cluster point processes, including Neyman-Scott types, are possible.

Python

The Pyyhon code is located here.

Note: in previous posts I used the SciPy functions for random number generation, but now use the NumPy ones, but there is little difference, as SciPy builds off NumPy.

Julia 

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.

Results

MATLAB

R

Python  

Further reading

Matérn cluster point processes and more generally Neyman-Scott point processes are covered in standard books on the related fields of spatial statistics, point processes and stochastic geometry, such as the following: Spatial Point Patterns: Methodology and Applications with R by  Baddeley, Rubak and Turner, on page 461; Statistical Analysis and Modelling of Spatial Point Patterns Statistics by Illian, Penttinen, Stoyan, amd Stoyan, Section 6.3.2, starting on page 370; Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke,  on page 173; and; Statistical Inference and Simulation for Spatial Point Processes by Moller and Waagepetersen, in Section 5.3. I would probably recommend the first two books for beginners.

The Matérn point process has also appeared in models of wireless networks, which we covered in the book Stochastic Geometry Analysis of Cellular Networks by Błaszczyszyn, Haenggi, Keeler, and Mukherjee, Section 8.1.8.

There is the book Spatial Point Patterns: Methodology and Applications with R written by spatial statistics experts  Baddeley, Rubak and Turner, which covers the spatial statistics (and point process simulation) R-package spatstat.

Simulating a Poisson point process on a disk

Sometimes one needs to simulate a Poisson point process on a disk. I know I often do. A disk or disc, depending on your spelling preference, is isotropic or rotationally-invariant, so a lot of my simulations of a Poisson point process happen in a circular simulation window when I am considering such a setting. For example, maybe you want to consider a single wireless receiver in a Poisson network of wireless transmitters, which only cares about the distance to a transmitter. Alternatively, maybe you want to randomly sprinkle a virtual cake. What to do? A Poisson point process on a disk is the answer.

I will simulate a Poisson point process with intensity \(\lambda>0\) on a disk with radius \(r>0\). The simulation steps are very similar to those in the previous post where I simulated a  homogeneous Poisson point process on a rectangle, and I suggest going back to that post if you are not familiar with the material. The main difference between simulation on a rectangle and a disk is the positioning of the points, but first we’ll look at the number of points.

Steps

Number of points

The number of points of a Poisson point process falling within a circle of radius \(r>0\) is a Poisson random variable with mean  \(\lambda A\), where \(A=\pi r^2\) is the area of the disk. As in the rectangular case, this is the most complicated part of the simulation procedure. But as long as your preferred programming language can produce (pseudo-)random numbers according to a Poisson distribution, you can simulate a homogeneous Poisson point process on a disk.

To do this in MATLAB,  use the poissrnd function with the argument \(\lambda A\). In R, it is done similarly with the standard  function rpois . In Python, we can use either the scipy.stats.poisson or numpy.random.poisson function from the SciPy or NumPy libraries.

Locations of points

We need to position all the points randomly and uniformly on a disk. In the case of the rectangle, we worked in Cartesian coordinates. It is then natural that we now work in polar coordinates.  I’ll denote the angular and radial coordinate respectively by \(\theta\) and \(\rho\). To generate the random angular (or \(\theta\)) values, we simply produce uniform random variables between zero and one, which is what all standard (pseudo-)random number generators do in programming languages. But we then multiply all these numbers by \(2\pi\), meaning that all the numbers now fall between \(0\) and \(2\pi\).

To generate the random radial (or \(\rho\)) values, a reasonable guess would be to do the same as before and generate uniform random variables between zero and one, and then multiply them by the disk radius \(r\). But that would be wrong. Before multiplying uniform random variables by the radius, we must take the square root of all the random numbers. We then multiply them by the radius, generating random variables between \(0\) and \(r\). (We must take the square root because the area element of a sector or disk is proportional to the radius squared, and not the radius.) These random numbers do not have a uniform distribution, due to the square root, but in fact their distribution is an example of the triangular distribution, which is defined with three real-valued parameters \(a\), \(b\) and \(c\), and for our case, set \(a=0\) and \(b=c=r\).

In summary, if \(U\) and \(V\) are two independent uniform random variables on \((0,1)\), then random point located uniformly on a disk of radius \(r\) has the polar coordinates \((r\sqrt(U), 2\pi V)\).

From polar to Cartesian coordinates

That’s it. We have generated polar coordinates for points randomly and uniformly located in a disk. But to plot the points, often we have to convert coordinates back to Cartesian form. This is easily done in MATLAB with the pol2cart function. In languages without such a function, trigonometry comes to the rescue: \(x=\rho\cos(\theta)\) and \(y=\rho\sin(\theta)\).

Equal x and y axes

Sometimes the plotted points more resemble points on an ellipse than a disk due to the different scaling of the x and y axes. To fix this in MATLAB, run the command: axis square. In Python, set axis(‘equal’) in your plot; see this page for a demonstration.

Code

I’ll now give some code in MATLAB, R and Python, which you can see are all very similar

MATLAB
%Simulation window parameters
r=1; %radius of disk
xx0=0; yy0=0; %centre of disk

areaTotal=pi*r^2; %area of disk
 
%Point process parameters
lambda=100; %intensity (ie mean density) of the Poisson process
 
%Simulate Poisson point process
numbPoints=poissrnd(areaTotal*lambda);%Poisson number of points
theta=2*pi*(rand(numbPoints,1)); %angular coordinates
rho=r*sqrt(rand(numbPoints,1)); %radial coordinates

%Convert from polar to Cartesian coordinates
[xx,yy]=pol2cart(theta,rho); %x/y coordinates of Poisson points

%Shift centre of disk to (xx0,yy0)
xx=xx+xx0;
yy=yy+yy0;
 
%Plotting
scatter(xx,yy);
xlabel('x');ylabel('y');
axis square;
R

Note: it is a bit tricky to write “<-” in the R code (as it automatically changes to the html equivalent in the HTML editor I am using), so I have usually used “=” instead of the usual “<-”.

#Simulation window parameters
r=1; #radius of disk
xx0=0; yy0=0; #centre of disk

areaTotal=pi*r^2; #area of disk

#Point process parameters
lambda=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints=rpois(1,areaTotal*lambda);#Poisson number of points
theta=2*pi*runif(numbPoints);#angular  of Poisson points
rho=r*sqrt(runif(numbPoints));#radial coordinates of Poisson points

#Convert from polar to Cartesian coordinates
xx=rho*cos(theta);
yy=rho*sin(theta);

#Shift centre of disk to (xx0,yy0)
xx=xx+xx0;
yy=yy+yy0;

#Plotting
par(pty="s")
plot(xx,yy,'p',xlab='x',ylab='y',col='blue');

Of course, with the amazing spatial statistics library spatstat, simulating a Poisson point process in R is even easier.

library("spatstat"); #load spatial statistics library
#create Poisson "point pattern" object
ppPoisson=rpoispp(lambda,win=disc(radius=r,centre=c(xx0,yy0))) 
plot(ppPoisson); #Plot point pattern object
#retrieve x/y values from point pattern object
xx=ppPoisson$x; 
yy=ppPoisson$y;

Actually, you can even do it all in two lines: one for loading the spatstat library and one for creating and plotting the point pattern object.

Python

Note: “lambda” is a reserved word in Python (and other languages), so I have used “lambda0” instead.

import numpy as np; #NumPy package for arrays, random number generation, etc
import matplotlib.pyplot as plt #for plotting

#Simulation window parameters
r=1;  #radius of disk
xx0=0; yy0=0; #centre of disk
areaTotal=np.pi*r**2; #area of disk

#Point process parameters
lambda0=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints = np.random.poisson(lambda0*areaTotal);#Poisson number of points
theta=2*np.pi*np.random.uniform(0,1,numbPoints); #angular coordinates 
rho=r*np.sqrt(np.random.uniform(0,1,numbPoints)); #radial coordinates 

#Convert from polar to Cartesian coordinates
xx = rho * np.cos(theta);
yy = rho * np.sin(theta);

#Shift centre of disk to (xx0,yy0) 
xx=xx+xx0; yy=yy+yy0;

#Plotting
plt.scatter(xx,yy, edgecolor='b', facecolor='none', alpha=0.5 );
plt.xlabel("x"); plt.ylabel("y");
plt.axis('equal');
Julia

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.

Results

MATLAB
R 
Python  

Further reading

The third edition of the classic book Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke details on page 54 how to uniformly place points on a disk, which they call the radial way. The same simulation section appears in the previous edition by Chiu, Kendall and Mecke (Chiu didn’t appear as an author until the current edition), though these books in general have little material on simulation methods. There is the book Spatial Point Patterns: Methodology and Applications with R written by spatial statistics experts  Baddeley, Rubak and Turner, which covers the spatial statistics (and point process simulation) R-package spatstat.

Simulating a homogeneous Poisson point process on a rectangle

This is the first of a series of posts about simulating Poisson point processes. I’ll start with arguably the simplest Poisson point process on two-dimensional space, which is the homogeneous one defined on a rectangle. Let’s say that we we want to simulate a Poisson point process with intensity \(\lambda>0\) on a (bounded) rectangular region, for example, the rectangle \([0,w]\times[0,h]\) with dimensions \(w>0\) and \(h>0\) and area \(A=wh\). We assume for now that the bottom left corner of the rectangle is at the origin.

Steps

Number of points

The number of points in the rectangle  \([0,w]\times[0,h]\) is a Poisson random variable with mean \(\lambda A\). In other words, this random variable is distributed according to the Poisson distribution with parameter \(\lambda A\), and not just \(\lambda\), because the number of points depends on the size of the simulation region.

This is the most complicated part of the simulation procedure. As long as your preferred programming language can produce (pseudo-)random numbers according to a Poisson distribution, you can simulate a homogeneous Poisson point process. There’s a couple of different ways used to simulate Poisson random variables, but we will skip the details. In MATLAB, it is done by using the poissrnd function with the argument \(\lambda A\). In R, it is done similarly with the standard  function rpois . In Python, we can use either the scipy.stats.poisson or numpy.random.poisson function from the SciPy or NumPy libraries.

Location of points

The points now need to be positioned randomly, which is done by using Cartesian coordinates. For a homogeneous Poisson point process, the \(x\) and \(y\) coordinates of each point are independent uniform points, which is also the case for the binomial point process, covered in a previous post. For the rectangle \([0,w]\times[0,h]\), the \(x\) coordinates are uniformly sampled on the interval \([0,w]\), and similarly for the \(y\) coordinates. If the bottom left corner of rectangle is located at the point \((x_0,y_0)\), then we just have to shift the random \(x\) and \(y\) coordinates by respectively adding \(x_0\) and \(y_0\).

Every scientific programming language has a random uniform number generator because it is the default random number generator. In MATLAB, R and SciPy, it is respectively rand, runif and scipy.stats.uniform.

Code

Here is some code that I wrote for simulating a homogeneous Poisson point process on a rectangle. You will notice that in all the code samples the part that simulates the Poisson point process requires only three lines of code: one line for the number of points and two lines lines for the \(x\) and \(y\) coordinates  of the points.

MATLAB
%Simulation window parameters
xMin=0;xMax=1;
yMin=0;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; %rectangle dimensions
areaTotal=xDelta*yDelta;

%Point process parameters
lambda=100; %intensity (ie mean density) of the Poisson process

%Simulate Poisson point process
numbPoints=poissrnd(areaTotal*lambda);%Poisson number of points
xx=xDelta*(rand(numbPoints,1))+xMin;%x coordinates of Poisson points
yy=xDelta*(rand(numbPoints,1))+yMin;%y coordinates of Poisson points

%Plotting
scatter(xx,yy);
xlabel('x');ylabel('y');
R

Note: it is a bit tricky to write “<-” in the R code (as it automatically changes to the html equivalent in the HTML editor I am using), so I have usually used “=” instead of the usual “<-”.

#Simulation window parameters
xMin=0;xMax=1;
yMin=0;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;

#Point process parameters
lambda=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints=rpois(1,areaTotal*lambda);#Poisson number of points
xx=xDelta*runif(numbPoints)+xMin;#x coordinates of Poisson points
yy=xDelta*runif(numbPoints)+yMin;#y coordinates of Poisson points

#Plotting
plot(xx,yy,'p',xlab='x',ylab='y',col='blue');
Python

Note: “lambda” is a reserved word in Python (and other languages), so I have used “lambda0” instead.

import numpy as np
import scipy.stats
import matplotlib.pyplot as plt

#Simulation window parameters
xMin=0;xMax=1;
yMin=0;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;

#Point process parameters
lambda0=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints = scipy.stats.poisson( lambda0*areaTotal ).rvs()#Poisson number of points
xx = xDelta*scipy.stats.uniform.rvs(0,1,((numbPoints,1)))+xMin#x coordinates of Poisson points
yy = yDelta*scipy.stats.uniform.rvs(0,1,((numbPoints,1)))+yMin#y coordinates of Poisson points
#Plotting
plt.scatter(xx,yy, edgecolor='b', facecolor='none', alpha=0.5 )
plt.xlabel("x"); plt.ylabel("y")
Julia

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.

Results

MATLAB

PoissonMatlab

R

PoissonR

Python

PoissonPython

Further reading

For simulation of point processes, see, for example, the books Statistical Inference and Simulation for Spatial Point Processes by Møller and Waagepetersen, or Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke. There are books written by spatial statistics experts such as Stochastic Simulation by Ripley and Spatial Point Patterns: Methodology and Applications with R by Baddeley, Rubak and Turner, where the second book covers the spatial statistics R-package spatstat. Kroese and Botev also have a good introduction in the edited collection Stochastic Geometry, Spatial Statistics and Random Fields : Models and Algorithms by Schmidt, where the relevant chapter (number 12) is also freely available online. More general stochastic simulation books that cover relevant material include Uniform Random Variate Generation by Devroye and Stochastic Simulation: Algorithms and Analysis by Asmussen and Glynn.