Simulating a Thomas cluster point process

Sometimes with just a little tweaking of a point process, you can get a new  point process. An example of this is the Thomas point process, which is a type of cluster point process, meaning that its randomly located points tend to form random clusters.  This point process is an example of a family of cluster point processes known as Neyman-Scott point processes, which have been used as models in spatial statistics and telecommunications. If that sounds familiar, that is because this point process is very similar to the Matérn point cluster process, which I covered in the previous post

The only difference between the two point processes is how the points are randomly located. In each cluster of a Thomas point process, each individual point is located according to two independent zero-mean normal variables with variance \(\sigma^2\), describing the \(x\) and \(y\) coordinates relative to the cluster centre, whereas each point of a Matérn point process is located uniformly in a disk.

Working in polar coordinates, an equivalent way to simulate a Thomas point process is to use independent and identically-distributed Rayleigh random variables for the radial (or \(\rho\)) coordinates, instead of using random variables with a triangular distribution, which are used to simulate the Matérn point process. This method works because in polar coordinates a uniform random variable for the angular (or \(\theta\) ) coordinate and a Rayleigh random variable for the angular (or \(\rho\)) is equivalent to in Cartesian coordinates two independent zero-mean normal variables. This is exactly the trick behind the Box-Muller transform for generating normal random variables using just uniform random variables.

If you’re familiar with simulating the Matérn point process, the most difference is what size to make the simulation window for the parents points. I cover that in the next section.  

Overview

Simulating a Thomas cluster point process requires first simulating a homogeneous Poisson point process with intensity \(\lambda>0\) on some simulation window, such as a rectangle, which is the simulation window I will use here. Then for each point of this underlying point process, simulate a Poisson number of points with mean \(\mu>0\), and for each point simulate two independent zero-mean normal variables with variance \(\sigma^2\), corresponding to the (relative) Cartesian coordinates .

The underlying  point process is sometimes called the parent (point) process, and its points are centres of the cluster disks. The subsequent point process on all the disks is called daughter (point) process and it forms the clusters. I have already written about simulating the homogeneous Poisson point processes on a rectangle and a disk, so  those posts are good starting points, and I will not focus  too much on details for these steps steps.

Importantly, like the Matérn point process, it’s possible for daughter points to appear in the simulation window that come from parents points outside the simulation window. To handle these edge effects, the point processes must be first simulated on an extended version of the simulation window. Then only the daughter points within the simulation window are kept and the rest are removed.  

We can add a strip of some width \(d\) all around the simulation window. But what value does \(d\) take? Well, in theory, it is possible that a daughter point comes from a parent point that is very far from the simulation window. But that probability becomes vanishingly small as the distance increases, due to the daughter points being located according to zero-mean normal random variables. 

For example, if a single parent is at a distance \(d=6 \sigma\) from the simulation, then there is about a $1/1 000 000 000$ chance that a single daughter point will land in the simulation window. The probability is simply \(1-\Phi(6 \sigma)\), where \(\Phi\) is the cumulative distribution function of a normal variable with zero mean and standard deviation \(\sigma>0\).  This is what they call a six sigma event.  In my code, I set \(d=6 \sigma\), but \(d=4 \sigma\) is good enough, which is the value that the R library spatstat uses by default.  

Due to this approximation, this simulation cannot be called a perfect simulation, despite the approximation being highly accurate. In practice, it will not have no measurable effect on simulation results, as the number of simulations will rarely be high enough for (hypothetical) daughter points to come from (hypothetical) parent points outside the window. 

Steps

Number of points

Simulate the underlying or parent Poisson point process on the rectangle with \(N_P\) points. Then for each point, simulate a Poisson number of points, where each disk now has \(D_i\) number of points. Then the total number of points is simply \(N=D_1+\dots +D_{P}=\sum_{i=1}^{N_P}D_i \). The random variables \(P\) and \(D_i\) are Poisson random variables with respective means \(\lambda A\) and \(\mu\), where \(A\) is the area of the rectangular simulation window. To simulate these random variables in MATLAB, use the poissrnd function. To do this in R, use the  standard function rpois. In Python, we can use either functions scipy.stats.poisson or numpy.random.poisson from the SciPy or NumPy libraries.

Locations of points

The points of the parent point process are randomly positioned by using Cartesian coordinates. For a homogeneous Poisson point process, the \(x\) and \(y\) coordinates of each point are independent uniform points, which is also the case for the binomial point process, covered in an earlier post.

As mentioned in the introduction of this post, the points of all the daughter point process are randomly positioned by either using polar  coordinates or Cartesian coordinates, due to the Box-Muller transform. But because we ultimately convert back to Cartesian coordinates (for example, to plot the points), we will work entirely in this coordinate system.  Each point is then simply positioned with two independent zero-mean normal random variables, representing the \(x\) and \(y\) coordinates relative to the original parent point.

Shifting all the points in each cluster disk

In practice (that is, in the code), all the daughter points are simulated relative to the origin. Then for each cluster disk, all the points need to be shifted, so the origin coincides with the parent point, which completes the simulation step.

To use vectorization in the code,  the coordinates of each cluster point are repeated by the number of daughters in the corresponding cluster by using the functions repelem in MATLAB, rep in R, and repeat in Python. 

Code

I have implemented the simulation procedure in MATLAB, R and Python, which as usual are all very similar. The code can be downloaded here.

MATLAB
 %Simulation window parameters xMin=-.5; xMax=.5; yMin=-.5; yMax=.5; %Parameters for the parent and daughter point processes lambdaParent=10;%density of parent Poisson point process lambdaDautgher=100;%mean number of points in each cluster sigma=0.05;%sigma for normal variables (ie random locations) of daughters %Extended simulation windows parameters rExt=7*sigma; %extension parameter -- use factor of deviation %for rExt, use factor of deviation sigma eg 6 or 7 xMinExt=xMin-rExt; xMaxExt=xMax+rExt; yMinExt=yMin-rExt; yMaxExt=yMax+rExt; %rectangle dimensions xDeltaExt=xMaxExt-xMinExt; yDeltaExt=yMaxExt-yMinExt; areaTotalExt=xDeltaExt*yDeltaExt; %area of extended rectangle %Simulate Poisson point process for the parents numbPointsParent=poissrnd(areaTotalExt*lambdaParent,1,1);%Poisson number %x and y coordinates of Poisson points for the parent xxParent=xMinExt+xDeltaExt*rand(numbPointsParent,1); yyParent=yMinExt+yDeltaExt*rand(numbPointsParent,1); %Simulate Poisson point process for the daughters (ie final poiint process) numbPointsDaughter=poissrnd(lambdaDautgher,numbPointsParent,1); numbPoints=sum(numbPointsDaughter); %total number of points %Generate the (relative) locations in Cartesian coordinates by %simulating independent normal variables xx0=normrnd(0,sigma,numbPoints,1); yy0=normrnd(0,sigma,numbPoints,1); %replicate parent points (ie centres of disks/clusters) xx=repelem(xxParent,numbPointsDaughter); yy=repelem(yyParent,numbPointsDaughter); %translate points (ie parents points are the centres of cluster disks) xx=xx(:)+xx0; yy=yy(:)+yy0; %thin points if outside the simulation window booleInside=((xx>=xMin)&(xx<=xMax)&(yy>=yMin)&(yy<=yMax)); %retain points inside simulation window xx=xx(booleInside); yy=yy(booleInside); %Plotting scatter(xx,yy); 
R

Note: it is a bit tricky to write “<-” in the R code (as it automatically changes to the html equivalent in the HTML editor I am using), so I have usually used “=” instead of the usual “<-”.

 #Simulation window parameters xMin=-.5; xMax=.5; yMin=-.5; yMax=.5; #Parameters for the parent and daughter point processes lambdaParent=10;#density of parent Poisson point process lambdaDaughter=100;#mean number of points in each cluster sigma=0.05; #sigma for normal variables (ie random locations) of daughters #Extended simulation windows parameters rExt=7*sigma; #extension parameter #for rExt, use factor of deviation sigma eg 6 or 7 xMinExt=xMin-rExt; xMaxExt=xMax+rExt; yMinExt=yMin-rExt; yMaxExt=yMax+rExt; #rectangle dimensions xDeltaExt=xMaxExt-xMinExt; yDeltaExt=yMaxExt-yMinExt; areaTotalExt=xDeltaExt*yDeltaExt; #area of extended rectangle #Simulate Poisson point process for the parents numbPointsParent=rpois(1,areaTotalExt*lambdaParent);#Poisson number of points #x and y coordinates of Poisson points for the parent xxParent=xMinExt+xDeltaExt*runif(numbPointsParent); yyParent=yMinExt+yDeltaExt*runif(numbPointsParent); #Simulate Poisson point process for the daughters (ie final poiint process) numbPointsDaughter=rpois(numbPointsParent,lambdaDaughter); numbPoints=sum(numbPointsDaughter); #total number of points #Generate the (relative) locations in Cartesian coordinates by #simulating independent normal variables xx0=rnorm(numbPoints,0,sigma); yy0=rnorm(numbPoints,0,sigma); #replicate parent points (ie centres of disks/clusters) xx=rep(xxParent,numbPointsDaughter); yy=rep(yyParent,numbPointsDaughter); #translate points (ie parents points are the centres of cluster disks) xx=xx+xx0; yy=yy+yy0; #thin points if outside the simulation window booleInside=((xx&amp;amp;amp;amp;amp;gt;=xMin)&amp;amp;amp;amp;amp;amp;(xx&amp;amp;amp;amp;amp;lt;=xMax)&amp;amp;amp;amp;amp;amp;(yy&amp;amp;amp;amp;amp;gt;=yMin)&amp;amp;amp;amp;amp;amp;(yy&amp;amp;amp;amp;amp;lt;=yMax)); #retain points inside simulation window xx=xx[booleInside]; yy=yy[booleInside]; #Plotting par(pty="s") plot(xx,yy,'p',xlab='x',ylab='y',col='blue'); 

Of course, as I have mentioned before, simulating a spatial point processes in R is even easier with the powerful spatial statistics library spatstat. The Thomas cluster point process is simulated by using the function rThomas, but other cluster point processes, including Neyman-Scott types, are possible.

Python

Note: in previous posts I used the SciPy functions for random number generation, but now use the NumPy ones, but there is little difference, as SciPy builds off NumPy.

 import numpy as np;&amp;amp;amp;amp;amp;nbsp; # NumPy package for arrays, random number generation, etc import matplotlib.pyplot as plt&amp;amp;amp;amp;amp;nbsp; # For plotting # Simulation window parameters xMin = -.5; xMax = .5; yMin = -.5; yMax = .5; # Parameters for the parent and daughter point processes lambdaParent = 10; # density of parent Poisson point process lambdaDaughter = 100; # mean number of points in each cluster sigma = 0.05; # sigma for normal variables (ie random locations) of daughters #Extended simulation windows parameters rExt=7*sigma; #extension parameter #for rExt, use factor of deviation sigma eg 6 or 7 xMinExt = xMin - rExt; xMaxExt = xMax + rExt; yMinExt = yMin - rExt; yMaxExt = yMax + rExt; # rectangle dimensions xDeltaExt = xMaxExt - xMinExt; yDeltaExt = yMaxExt - yMinExt; areaTotalExt = xDeltaExt * yDeltaExt; # area of extended rectangle # Simulate Poisson point process for the parents numbPointsParent = np.random.poisson(areaTotalExt * lambdaParent);# Poisson number of points # x and y coordinates of Poisson points for the parent xxParent = xMinExt + xDeltaExt * np.random.uniform(0, 1, numbPointsParent); yyParent = yMinExt + yDeltaExt * np.random.uniform(0, 1, numbPointsParent); # Simulate Poisson point process for the daughters (ie final poiint process) numbPointsDaughter = np.random.poisson(lambdaDaughter, numbPointsParent); numbPoints = sum(numbPointsDaughter); # total number of points # Generate the (relative) locations in Cartesian coordinates by # simulating independent normal variables xx0 = np.random.normal(0, sigma, numbPoints); # (relative) x coordinaets yy0 = np.random.normal(0, sigma, numbPoints); # (relative) y coordinates # replicate parent points (ie centres of disks/clusters) xx = np.repeat(xxParent, numbPointsDaughter); yy = np.repeat(yyParent, numbPointsDaughter); # translate points (ie parents points are the centres of cluster disks) xx = xx + xx0; yy = yy + yy0; # thin points if outside the simulation window booleInside=((xx&amp;amp;amp;amp;amp;gt;=xMin)&amp;amp;amp;amp;amp;amp;(xx&amp;amp;amp;amp;amp;lt;=xMax)&amp;amp;amp;amp;amp;amp;(yy&amp;amp;amp;amp;amp;gt;=yMin)&amp;amp;amp;amp;amp;amp;(yy&amp;amp;amp;amp;amp;lt;=yMax)); # retain points inside simulation window xx = xx[booleInside]; yy = yy[booleInside]; # Plotting plt.scatter(xx, yy, edgecolor='b', facecolor='none', alpha=0.5); plt.xlabel("x"); plt.ylabel("y"); plt.axis('equal'); 
Julia 

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.

Results

The results show that the clusters of Thomas point process tend to be more blurred than those of Matérn point process, which has cluster edges clearly defined by the disks.  The points of of a Thomas point process can be far away from the centre of each cluster, depending on the variance of the normal random variables used in the simulation. 

MATLAB

R

Python

Further reading

Thomas (and Matérn) cluster point processes and more generally Neyman-Scott point processes are covered in standard books on the related fields of spatial statistics, point processes and stochastic geometry, such as the following: Spatial Point Patterns: Methodology and Applications with R by  Baddeley, Rubak and Turner, on page 461; Statistical Analysis and Modelling of Spatial Point Patterns Statistics by Illian, Penttinen, Stoyan, amd Stoyan, page 370 and Section 6.3.2; Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke,  on page 173; and; Statistical Inference and Simulation for Spatial Point Processes by Møller and Waagepetersen, in Section 5.3. I would probably recommend the first two books for beginners. 

The Thomas point process has also appeared in models of wireless networks, which we covered in the book Stochastic Geometry Analysis of Cellular Networks by Błaszczyszyn, Haenggi, Keeler, and Mukherjee, Section 8.1.8.

More generally, Neyman-Scott point processes belong to a family of point processes called shot noise Cox point processes; see the paper by Møller. 

I mentioned above the book Spatial Point Patterns: Methodology and Applications with R , which is written by spatial statistics experts Baddeley, Rubak and Turner. It covers the spatial statistics (and point process simulation) R-package spatstat.

In my travels on the web, I found this post where the writer also simulates the Thomas and Matérn point processes in Python, independent of my code.  That code is a bit different to mine because I use the repeat function and simulate all the Poisson variables at once, instead of using a for-loop and simulating a Poisson variable for each iteration.  I also think it’s not quite correct because I do not see how they account for edge effects.  

Simulating a Matérn cluster point process

A Matérn cluster point process is a type of cluster point process, meaning that its randomly located points tend to form random clusters. (I skip the details here, but by using techniques from spatial statistics, it is possible to make the definition of clustering more precise.) This point process is an example of a family of cluster point processes known as Neyman-Scott point processes, which have been used in spatial statistics and telecommunications.

I should point out that the Matérn cluster point process should not be confused with the Matérn hard-core point process, which is a completely different type of point process. (For a research article, I have actually written code in MATLAB that simulates this type of point process.) Bertril Matérn proposed at least four types of point processes, and his name also refers to a specific type of covariance function used to define Gaussian processes.

Overview

Simulating a Matérn cluster point process requires first simulating  a homogeneous Poisson point process with an intensity \(\lambda>0\) on some simulation window, such as a rectangle, which is the simulation window I will use here. Then for each point of this underlying point process, simulate a Poisson number of points with mean \(\mu>0\) uniformly on a disk with a constant radius \(r>0\). The underlying  point process is sometimes called the parent (point) process, and its points are centres of the cluster disks.

The subsequent point process on all the disks is called daughter (point) process and it forms the clusters. I have already written about simulating the homogeneous Poisson point processes on a rectangle and a disk, so those posts are good starting points, and I will not focus  too much on details for these steps.

Edge effects

The main challenge behind sampling this point process, which I originally forgot about in an earlier version of this post, is that it’s possible for daughter points to appear in the simulation window that come from parents points outside the simulation window. In other words, parents points outside the simulation window contribute to points inside the window.

To remove these edge effects, the point processes must be simulated on an extended version of the simulation window. Then only the daughter points within the simulation window are kept and the rest are removed.  Consequently, the points are simulated on an extended window, but we only see the points inside the simulation window.

To create the extended simulation window, we can add a strip of width \(r\) all around the simulation window. Why? Well, the distance \(r\) is the maximum distance from the simulation window that a possibly contributing parent point (outside the simulation window) can exist, while still having daughter points inside the simulation window. This means it is impossible for a hypothetical parent point beyond this distance (outside the extended window) to generate a daughter point that can fall inside the simulation window. 

Steps

Number of points

Simulate the underlying or parent Poisson point process on the rectangle with \(N_P\) points. Then for each point, simulate a Poisson number of points, where each disk now has \(D_i\) number of points. Then the total number of points is simply \(N=D_1+\dots +D_{P}=\sum_{i=1}^{N_P}D_i \). The random variables \(P\) and \(D_i\) are Poisson random variables with respective means \(\lambda A\) and \(\mu\), where \(A\) is the area of the rectangular simulation window. To simulate these random variables in MATLAB, use the poissrnd function. To do this in R, use the  standard function rpois. In Python, we can use either functions scipy.stats.poisson or numpy.random.poisson from the SciPy or NumPy libraries.

Locations of points

The points of the parent point process are randomly positioned by using Cartesian coordinates. For a homogeneous Poisson point process, the \(x\) and \(y\) coordinates of each point are independent uniform points, which is also the case for the binomial point process, covered in a previous post. The points of all the daughter point process are randomly positioned by using polar  coordinates. For a homogeneous Poisson point process, the \(\theta\) and \(\rho\) coordinates of each point are independent  variables,  respectively with uniform and triangle distributions, which was covered in a previous post. Then we  convert coordinates back to Cartesian form, which is easily done in MATLAB with the pol2cart function. In languages without such a function: \(x=\rho\cos(\theta)\) and \(y=\rho\sin(\theta)\).

Shifting all the points in each cluster disk

In practice (that is, in the code), all the daughter points are simulated in a disk with its centre at the origin. Then for each cluster disk, all the points have to be shifted to the origin is the center of the cluster, which completes the simulation step.

To use vectorization in the code,  the coordinates of each cluster point are repeated by the number of daughters in the corresponding cluster by using the functions repelem in MATLAB, rep in R, and repeat in Python. 

Code

I’ll now give some code in MATLAB, R and Python, which you can see are all very similar. It’s also located here.

MATLAB

The MATLAB code is located here.

R

The R code is located here.

Of course, as I have mentioned before, simulating a spatial point processes in R is even easier with the powerful spatial statistics library spatstat. The Matérn cluster point process is simulated by using the function rMatClust, but other cluster point processes, including Neyman-Scott types, are possible.

Python

The Pyhon code is located here.

Note: in previous posts I used the SciPy functions for random number generation, but now use the NumPy ones, but there is little difference, as SciPy builds off NumPy.

Julia 

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.

Results

MATLAB

R

Python  

Further reading

Matérn cluster point processes and more generally Neyman-Scott point processes are covered in standard books on the related fields of spatial statistics, point processes and stochastic geometry, such as the following: Spatial Point Patterns: Methodology and Applications with R by  Baddeley, Rubak and Turner, on page 461; Statistical Analysis and Modelling of Spatial Point Patterns Statistics by Illian, Penttinen, Stoyan, amd Stoyan, Section 6.3.2, starting on page 370; Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke,  on page 173; and; Statistical Inference and Simulation for Spatial Point Processes by Moller and Waagepetersen, in Section 5.3. I would probably recommend the first two books for beginners.

The Matérn point process has also appeared in models of wireless networks, which we covered in the book Stochastic Geometry Analysis of Cellular Networks by Błaszczyszyn, Haenggi, Keeler, and Mukherjee, Section 8.1.8.

I mentioned above the book Spatial Point Patterns: Methodology and Applications with R , which is written by spatial statistics experts Baddeley, Rubak and Turner. It covers the spatial statistics (and point process simulation) R-package spatstat.

Determinantal thinning of point processes with network learning applications

My colleague and I uploaded a manuscript:

  • Błaszczyszyn and Keeler, Determinantal thinning of point processes with network learning applications.

https://arxiv.org/abs/1810.08672

Details

The paper uses a (relatively) new model framework in machine learning.  This framework is based on a special type of point process called a determinantal point process, which is also called a fermion point process. (This particle model draw inspiration from the form of the wave function in quantum mechanics.) Kulesza and Taskar introduced and developed the framework for using determinantal point processes for machine learning models.

Code

The MATLAB code for the producing the results in the paper can be found here:

https://github.com/hpaulkeeler/DetPoisson_MATLAB

I also re-wrote (or translated) the MATLAB code into Python:

https://github.com/hpaulkeeler/DetPoisson_Python

Simulating a Poisson point process on a triangle

The title gives it away. But yes, after  two posts about simulating a Poisson point process  on a rectangle and disk, the next shape is a triangle. Useful, right?

Well, I actually had to do this once for a part of something larger. You can divide  polygons, regular or irregular, into triangles, which is often called triangulation, and there is much code that does triangulation.  Using the independence property of the Poisson process, you can then simulate a Poisson point process on each triangle, and you end up with a Poisson point process on a polygon.

But the first step is to do it on a triangle. Consider a general triangle with its three corners labelled \(\textbf{A}\), \(\textbf{B}\) and \(\textbf{C}\). Again, simulating a Poisson point process comes down to the number of points and the locations of points.

Method

Number of points

The number of points of a homogeneous Poisson point process  in any shape with with area \(A\) is simply a Poisson random variable with mean  \(\lambda A\), where \(A\) is the area of the shape. For the triangle’s area, we just uses Herron’s formula, which says that a triangle with sides \(a\), \(b\), and \(c\)  has the area \(A=\sqrt{s(s-a)(s-b)(s-c)}\), where \(s=(a+b+c)/2\), which means you just need to use Pythagoras’ theorem for  the lengths \(a\), \(b\), and \(c\). Interestingly, this standard formula can be prone to numerical error if the triangle is very thin or needle-shaped. A more secure and stable expression is

\(A=\frac{1}{4}\sqrt{ (a+(b+c)) (c-(a-b)) (c+(a-b)) (a+(b-c)) },\)

where the brackets do matter; see Section 2.5 in the book by Higham.

Of course in MATLAB you can just use the function polyarea or the function with the same name in R.

Now just generate or simulate Poisson random variables with mean (or parameter)  \(\lambda A\). In MATLAB,  use the poissrnd function with the argument \(\lambda A\). In R, it is done similarly with the standard  function rpois . In Python, we can use either the scipy.stats.poisson or numpy.random.poisson function from the SciPy or NumPy libraries.

Locations of points

We need to position all the points randomly and uniformly on a triangle.  As with the previous two simulation cases, to position a single point \((x, y)\), you first need to produce two random uniform variables on the unit interval \((0,1)\), say \(U\) and \(V\). I’ll denote the \(x\) and \(y\) coordinates of point by \(x_{\textbf{A}}\) and \(y_{\textbf{A}}\), and similarly for the other two points.  To get the random \(x\) and \(y\) values, you use these two formulas:

\(x=\sqrt{U} x_{\textbf{A}}+\sqrt{U}(1-V x_{\textbf{B}})+\sqrt{U}V x_{\textbf{C}}\)

\(y=\sqrt{U} y_{\textbf{A}}+\sqrt{U}(1-V y_{\textbf{B}})+\sqrt{U}V y_{\textbf{C}}\)

Done. A Poisson point process simulated on a triangle .

Code

I now present some code in MATLAB, R and Python, which you can see are all very similar.  To avoid using a for-loop and employing instead MATLAB’s inbuilt vectorization, I use the dot notation for the product \(\sqrt{U}V\). In R and Python (using SciPy), that’s done automatically.

MATLAB


%Simulation window parameters -- points A,B,C of a triangle
xA=0;xB=1;xC=1; %x values of three points
yA=0;yB=0;yC=1; %y values of three points

%Point process parameters
lambda=100; %intensity (ie mean density) of the Poisson process

%calculate sides of trinagle
a=sqrt((xA-xB)^2+(yA-yB)^2);
b=sqrt((xB-xC)^2+(yB-yC)^2);
c=sqrt((xC-xA)^2+(yC-yA)^2);
s=(a+b+c)/2; %calculate semi-perimeter

%Use Herron's forumula -- or use polyarea
areaTotal=sqrt(s*(s-a)*(s-b)*(s-c)); %area of triangle

%Simulate Poisson point process
numbPoints=poissrnd(areaTotal*lambda);%Poisson number of points
U=(rand(numbPoints,1));%uniform random variables
V=(rand(numbPoints,1));%uniform random variables

xx=sqrt(U)*xA+sqrt(U).*(1-V)*xB+sqrt(U).*V*xC;%x coordinates of points
yy=sqrt(U)*yA+sqrt(U).*(1-V)*yB+sqrt(U).*V*yC;%y coordinates of points

%Plotting
scatter(xx,yy);
xlabel('x');ylabel('y');

R

Note: it is a bit tricky to write “<-” in the R code (as it automatically changes to the html equivalent in the HTML editor I am using), so I have usually used “=” instead of the usual “<-”.


#Simulation window parameters -- points A,B,C of a triangle
xA=0;xB=1;xC=1; #x values of three points
yA=0;yB=0;yC=1; #y values of three points

#Point process parameters
lambda=100; #intensity (ie mean density) of the Poisson process

#calculate sides of trinagle
a=sqrt((xA-xB)^2+(yA-yB)^2);
b=sqrt((xB-xC)^2+(yB-yC)^2);
c=sqrt((xC-xA)^2+(yC-yA)^2);
s=(a+b+c)/2; #calculate semi-perimeter

#Use Herron's forumula to calculate area
areaTotal=sqrt(s*(s-a)*(s-b)*(s-c)); #area of triangle

#Simulate a Poisson point process
numbPoints=rpois(1,areaTotal*lambda);#Poisson number of points
U=runif(numbPoints);#uniform random variables
V=runif(numbPoints);#uniform random variables

xx=sqrt(U)*xA+sqrt(U)*(1-V)*xB+sqrt(U)*V*xC;#x coordinates of points
yy=sqrt(U)*yA+sqrt(U)*(1-V)*yB+sqrt(U)*V*yC;#y coordinates of points

#Plotting
plot(xx,yy,'p',xlab='x',ylab='y',col='blue');

Simulating a Poisson point process in R is even easier, with the amazing spatial statistics library spatstat. You just need to define the triangular window.

#Simulation window parameters -- points A,B,C of a triangle
xA=0;xB=1;xC=1; #x values of three points
yA=0;yB=0;yC=1; #y values of three points

#Point process parameters
lambda=100; #intensity (ie mean density) of the Poisson process

library("spatstat");
#create triangle window object
windowTriangle=owin(poly=list(x=c(xA,xB,xC), y=c(yA,yB,yC)));
#create Poisson "point pattern" object
ppPoisson=rpoispp(lambda,win=windowTriangle)
plot(ppPoisson); #Plot point pattern object
#retrieve x/y values from point pattern object
xx=ppPoisson$x;
yy=ppPoisson$y;

Python

Note: “lambda” is a reserved word in Python (and other languages), so I have used “lambda0” instead.

#import libraries
import numpy as np
import scipy.stats
import matplotlib.pyplot as plt

#Simulation window parameters -- points A,B,C of a triangle
xA=0;xB=1;xC=1; #x values of three points
yA=0;yB=0;yC=1; #y values of three points

#Point process parameters
lambda0=100; #intensity (ie mean density) of the Poisson process

#calculate sides of trinagle
a=np.sqrt((xA-xB)**2+(yA-yB)**2);
b=np.sqrt((xB-xC)**2+(yB-yC)**2);
c=np.sqrt((xC-xA)**2+(yC-yA)**2);
s=(a+b+c)/2; #calculate semi-perimeter

#Use Herron's forumula to calculate area -- or use polyarea
areaTotal=np.sqrt(s*(s-a)*(s-b)*(s-c)); #area of triangle

#Simulate a Poisson point process
numbPoints = scipy.stats.poisson(lambda0*areaTotal).rvs();#Poisson number of points
U = scipy.stats.uniform.rvs(0,1,((numbPoints,1)));#uniform random variables
V= scipy.stats.uniform.rvs(0,1,((numbPoints,1)));#uniform random variables

xx=np.sqrt(U)*xA+np.sqrt(U)*(1-V)*xB+np.sqrt(U)*V*xC;#x coordinates of points
yy=np.sqrt(U)*yA+np.sqrt(U)*(1-V)*yB+np.sqrt(U)*V*yC;#y coordinates of points

#Plotting
plt.scatter(xx,yy, edgecolor='b', facecolor='none', alpha=0.5 );
plt.xlabel("x"); plt.ylabel("y");

Results

MATLAB

R

Python

Further reading

The topic of placing a single point uniformly on a general triangle is discussed in this StackExchange post.  For the formulas, it cites the paper “Shape distributions” by Osada, Funkhouser, Chazelle and Dobkin”, where no proof is given.

I originally looked at placing single points in cells of a Dirichlet or Voronoi tesselation — terms vary. There is a lot of literature on this topic, particularly when the seeds of the cells form a Poisson point process. The references in the articles on Wikipedia and MathWorld are good starting points.

Correction

In a previous version of this blog, there was an error in the two Cartesian formula for randomly simulating a point in a triangle. This has been fixed, but the error never existed in the code.

Simulating a Poisson point process on a disk

Sometimes one needs to simulate a Poisson point process on a disk. I know I often do. A disk or disc, depending on your spelling preference, is isotropic or rotationally-invariant, so a lot of my simulations of a Poisson point process happen in a circular simulation window when I am considering such a setting. For example, maybe you want to consider a single wireless receiver in a Poisson network of wireless transmitters, which only cares about the distance to a transmitter. Alternatively, maybe you want to randomly sprinkle a virtual cake. What to do? A Poisson point process on a disk is the answer.

In this post I will simulate a Poisson point process with intensity \(\lambda>0\) on a disk with radius \(r>0\).

Steps

The simulation steps are very similar to those in the previous post where I simulated a  homogeneous Poisson point process on a rectangle, and I suggest going back to that post if you are not familiar with the material. The main difference between simulation on a rectangle and a disk is the positioning of the points, but first we’ll look at the number of points.

Number of points

The number of points of a Poisson point process falling within a circle of radius \(r>0\) is a Poisson random variable with mean  \(\lambda A\), where \(A=\pi r^2\) is the area of the disk. As in the rectangular case, this is the most complicated part of the simulation procedure. But as long as your preferred programming language can produce (pseudo-)random numbers according to a Poisson distribution, you can simulate a homogeneous Poisson point process on a disk.

To do this in MATLAB,  use the poissrnd function with the argument \(\lambda A\). In R, it is done similarly with the standard  function rpois . In Python, we can use either the scipy.stats.poisson or numpy.random.poisson function from the SciPy or NumPy libraries.

Locations of points

We need to position all the points randomly and uniformly on a disk. In the case of the rectangle, we worked in Cartesian coordinates. It is then natural that we now work in polar coordinates.  I’ll denote the angular and radial coordinate respectively by \(\theta\) and \(\rho\). To generate the random angular (or \(\theta\)) values, we simply produce uniform random variables between zero and one, which is what all standard (pseudo-)random number generators do in programming languages. But we then multiply all these numbers by \(2\pi\), meaning that all the numbers now fall between \(0\) and \(2\pi\).

To generate the random radial (or \(\rho\)) values, a reasonable guess would be to do the same as before and generate uniform random variables between zero and one, and then multiply them by the disk radius \(r\). But that would be wrong.

Before multiplying uniform random variables by the radius, we must take the square root of all the random numbers. We then multiply them by the radius, generating random variables between \(0\) and \(r\). (We must take the square root because the area element of a sector or disk is proportional to the radius squared, and not the radius.) These random numbers do not have a uniform distribution, due to the square root, but in fact their distribution is an example of the triangular distribution, which is defined with three real-valued parameters \(a\), \(b\) and \(c\), and for our case, set \(a=0\) and \(b=c=r\).

In summary, if \(U\) and \(V\) are two independent uniform random variables on \((0,1)\), then random point located uniformly on a disk of radius \(r\) has the polar coordinates \((r\sqrt(U), 2\pi V)\).

From polar to Cartesian coordinates

That’s it. We have generated polar coordinates for points randomly and uniformly located in a disk. But to plot the points, often we have to convert coordinates back to Cartesian form. This is easily done in MATLAB with the pol2cart function. In languages without such a function, trigonometry comes to the rescue: \(x=\rho\cos(\theta)\) and \(y=\rho\sin(\theta)\).

Equal x and y axes

Sometimes the plotted points more resemble points on an ellipse than a disk due to the different scaling of the x and y axes. To fix this in MATLAB, run the command: axis square. In Python, set axis(‘equal’) in your plot; see this page for a demonstration.

Code

I’ll now give some code in MATLAB, R and Python, which, as you can see, are all very similar

MATLAB
%Simulation window parameters
r=1; %radius of disk
xx0=0; yy0=0; %centre of disk

areaTotal=pi*r^2; %area of disk
 
%Point process parameters
lambda=100; %intensity (ie mean density) of the Poisson process
 
%Simulate Poisson point process
numbPoints=poissrnd(areaTotal*lambda);%Poisson number of points
theta=2*pi*(rand(numbPoints,1)); %angular coordinates
rho=r*sqrt(rand(numbPoints,1)); %radial coordinates

%Convert from polar to Cartesian coordinates
[xx,yy]=pol2cart(theta,rho); %x/y coordinates of Poisson points

%Shift centre of disk to (xx0,yy0)
xx=xx+xx0;
yy=yy+yy0;
 
%Plotting
scatter(xx,yy);
xlabel('x');ylabel('y');
axis square;
R

Note: it is a bit tricky to write “<-” in the R code (as it automatically changes to the html equivalent in the HTML editor I am using), so I have usually used “=” instead of the usual “<-”.

#Simulation window parameters
r=1; #radius of disk
xx0=0; yy0=0; #centre of disk

areaTotal=pi*r^2; #area of disk

#Point process parameters
lambda=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints=rpois(1,areaTotal*lambda);#Poisson number of points
theta=2*pi*runif(numbPoints);#angular  of Poisson points
rho=r*sqrt(runif(numbPoints));#radial coordinates of Poisson points

#Convert from polar to Cartesian coordinates
xx=rho*cos(theta);
yy=rho*sin(theta);

#Shift centre of disk to (xx0,yy0)
xx=xx+xx0;
yy=yy+yy0;

#Plotting
par(pty="s")
plot(xx,yy,'p',xlab='x',ylab='y',col='blue');

Of course, with the amazing spatial statistics library spatstat, simulating a Poisson point process in R is even easier.

library("spatstat"); #load spatial statistics library
#create Poisson "point pattern" object
ppPoisson=rpoispp(lambda,win=disc(radius=r,centre=c(xx0,yy0))) 
plot(ppPoisson); #Plot point pattern object
#retrieve x/y values from point pattern object
xx=ppPoisson$x; 
yy=ppPoisson$y;

Actually, you can even do it all in two lines: one for loading the spatstat library and one for creating and plotting the point pattern object.

Python

Note: “lambda” is a reserved word in Python (and other languages), so I have used “lambda0” instead.

import numpy as np; #NumPy package for arrays, random number generation, etc
import matplotlib.pyplot as plt #for plotting

#Simulation window parameters
r=1;  #radius of disk
xx0=0; yy0=0; #centre of disk
areaTotal=np.pi*r**2; #area of disk

#Point process parameters
lambda0=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints = np.random.poisson(lambda0*areaTotal);#Poisson number of points
theta=2*np.pi*np.random.uniform(0,1,numbPoints); #angular coordinates 
rho=r*np.sqrt(np.random.uniform(0,1,numbPoints)); #radial coordinates 

#Convert from polar to Cartesian coordinates
xx = rho * np.cos(theta);
yy = rho * np.sin(theta);

#Shift centre of disk to (xx0,yy0) 
xx=xx+xx0; yy=yy+yy0;

#Plotting
plt.scatter(xx,yy, edgecolor='b', facecolor='none', alpha=0.5 );
plt.xlabel("x"); plt.ylabel("y");
plt.axis('equal');
Julia

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.

Results

MATLAB
R 
Python  

Further reading

The third edition of the classic book Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke details on page 54 how to uniformly place points on a disk, which they call the radial way. The same simulation section appears in the previous edition by Stoyan, Kendall and Mecke (Chiu didn’t appear as an author until the current edition), though these books in general have little material on simulation methods. There is the book Spatial Point Patterns: Methodology and Applications with R written by spatial statistics experts  Baddeley, Rubak and Turner, which covers the spatial statistics (and point process simulation) R-package spatstat.

Simulating a homogeneous Poisson point process on a rectangle

This is the first of a series of posts about simulating Poisson point processes. I’ll start with arguably the simplest Poisson point process on two-dimensional space, which is the homogeneous one defined on a rectangle. Let’s say that we we want to simulate a Poisson point process with intensity \(\lambda>0\) on a (bounded) rectangular region, for example, the rectangle \([0,w]\times[0,h]\) with dimensions \(w>0\) and \(h>0\) and area \(A=wh\). We assume for now that the bottom left corner of the rectangle is at the origin.

Steps

Number of points

The number of points in the rectangle  \([0,w]\times[0,h]\) is a Poisson random variable with mean \(\lambda A\). In other words, this random variable is distributed according to the Poisson distribution with parameter \(\lambda A\), and not just \(\lambda\), because the number of points depends on the size of the simulation region.

This is the most complicated part of the simulation procedure. As long as your preferred programming language can produce (pseudo-)random numbers according to a Poisson distribution, you can simulate a homogeneous Poisson point process. There’s a couple of different ways used to simulate Poisson random variables, but we will skip the details. In MATLAB, it is done by using the poissrnd function with the argument \(\lambda A\). In R, it is done similarly with the standard  function rpois . In Python, we can use either the scipy.stats.poisson or numpy.random.poisson function from the SciPy or NumPy libraries.

Location of points

The points now need to be positioned randomly, which is done by using Cartesian coordinates. For a homogeneous Poisson point process, the \(x\) and \(y\) coordinates of each point are independent uniform points, which is also the case for the binomial point process, covered in a previous post. For the rectangle \([0,w]\times[0,h]\), the \(x\) coordinates are uniformly sampled on the interval \([0,w]\), and similarly for the \(y\) coordinates. If the bottom left corner of rectangle is located at the point \((x_0,y_0)\), then we just have to shift the random \(x\) and \(y\) coordinates by respectively adding \(x_0\) and \(y_0\).

Every scientific programming language has a random uniform number generator because it is the default random number generator. In MATLAB, R and SciPy, it is respectively rand, runif and scipy.stats.uniform.

Code

Here is some code that I wrote for simulating a homogeneous Poisson point process on a rectangle. You will notice that in all the code samples the part that simulates the Poisson point process requires only three lines of code: one line for the number of points and two lines lines for the \(x\) and \(y\) coordinates  of the points.

MATLAB
%Simulation window parameters
xMin=0;xMax=1;
yMin=0;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; %rectangle dimensions
areaTotal=xDelta*yDelta;

%Point process parameters
lambda=100; %intensity (ie mean density) of the Poisson process

%Simulate Poisson point process
numbPoints=poissrnd(areaTotal*lambda);%Poisson number of points
xx=xDelta*(rand(numbPoints,1))+xMin;%x coordinates of Poisson points
yy=yDelta*(rand(numbPoints,1))+yMin;%y coordinates of Poisson points

%Plotting
scatter(xx,yy);
xlabel('x');ylabel('y');
R

Note: it is a bit tricky to write “<-” in the R code (as it automatically changes to the html equivalent in the HTML editor I am using), so I have usually used “=” instead of the usual “<-”.

#Simulation window parameters
xMin=0;xMax=1;
yMin=0;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;

#Point process parameters
lambda=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints=rpois(1,areaTotal*lambda);#Poisson number of points
xx=xDelta*runif(numbPoints)+xMin;#x coordinates of Poisson points
yy=yDelta*runif(numbPoints)+yMin;#y coordinates of Poisson points

#Plotting
plot(xx,yy,'p',xlab='x',ylab='y',col='blue');
Python

Note: “lambda” is a reserved word in Python (and other languages), so I have used “lambda0” instead.

import numpy as np
import scipy.stats
import matplotlib.pyplot as plt

#Simulation window parameters
xMin=0;xMax=1;
yMin=0;yMax=1;
xDelta=xMax-xMin;yDelta=yMax-yMin; #rectangle dimensions
areaTotal=xDelta*yDelta;

#Point process parameters
lambda0=100; #intensity (ie mean density) of the Poisson process

#Simulate Poisson point process
numbPoints = scipy.stats.poisson( lambda0*areaTotal ).rvs()#Poisson number of points
xx = xDelta*scipy.stats.uniform.rvs(0,1,((numbPoints,1)))+xMin#x coordinates of Poisson points
yy = yDelta*scipy.stats.uniform.rvs(0,1,((numbPoints,1)))+yMin#y coordinates of Poisson points
#Plotting
plt.scatter(xx,yy, edgecolor='b', facecolor='none', alpha=0.5 )
plt.xlabel("x"); plt.ylabel("y")
Julia

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.

Results

MATLAB

PoissonMatlab

R

PoissonR

Python

PoissonPython

Higher dimensions

If you want to simulate a Poisson point process in a three-dimensional box (typically called a cuboid or rectangular prism), you just need two modifications.

For a box \([0,w]\times[0,h]\times[0,\ell]\), the number of points now a Poisson random variable with mean \(\lambda V\), where \(V= wh\ell\) is the volume of the box. (For higher dimensions, you need to use $n$-dimensional volume.)

To position the points in the box, you just need an additional uniform variable for the extra coordinate. In other words, the \(x\), \(y\) and \(z\) coordinates are uniformly and independently sampled on the respective intervals \([0,w]\), \([0,h]\), \([0,\ell]\). The more dimensions, the more uniform random variables.

And that’s it.

Further reading

For simulation of point processes, see, for example, the books Statistical Inference and Simulation for Spatial Point Processes by Møller and Waagepetersen, or Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke.

There are books written by spatial statistics experts such as Stochastic Simulation by Ripley and Spatial Point Patterns: Methodology and Applications with R by Baddeley, Rubak and Turner, where the second book covers the spatial statistics R-package spatstat. Kroese and Botev also have a good introduction in the edited collection Stochastic Geometry, Spatial Statistics and Random Fields : Models and Algorithms by Schmidt, where the relevant chapter (number 12) is also freely available online.

More general stochastic simulation books that cover relevant material include Uniform Random Variate Generation by Devroye and Stochastic Simulation: Algorithms and Analysis by Asmussen and Glynn.

Point process simulation

Point processes are mathematical objects that can represent a collection of randomly located points on some underlying space. There is a rich range of these random objects, and, depending on the type of points process, there are various steps and methods used to sample, simulate or generate them on computers. (I use these three terms somewhat interchangeably.) This is the first of a series of posts in which I will describe how to simulate some of the more tractable point processes defined on bounded regions of two-dimensional Euclidean space.

Finite point processes

A general family of point processes are the finite point processes, which are, as the name suggests, simply point processes with the property that the total number of points is finite with probability one. Usually it is assumed that both the number of points and the positions of these points are random. These two properties give a natural way to describe and to study point processes mathematically, while also giving an intuitive way to simulate them on computers.

If the number and locations of points can be simulated sequentially, such that the number of points is naturally generated first followed by the placing of the points, then the simulation process is usually easier. Such a point process is a good place to start for an example.

Simple example — Binomial point process

The binomial point process is arguably the simplest point process. It is created by scattering \(n\) points uniformly and independently located in some bounded region, say, \(R\) with area \(|R|\). The number of points located in a region \(B\subset R\) is a binomial random variable, say, \(N\), where its probability parameter \(p=\frac{|B|}{|R|}\) is simply the ratio of the  two areas. This implies that

$$ P(N=k)={n\choose k} p^k(1-p)^{n-k}, $$

The total number of points is non-random number \(n\), so we do not need to generate it as it is given. The independence of point locations means that all the points can be positioned in parallel.

For example, to simulate \(n\) points of a binomial point process on the unit square \([0,1]\times[0,1]\), we just need to independently sample the \(x\) and \(y\) coordinates of the points from a uniform distribution on the unit interval \([0,1]\). In other words, we randomly generate or sample two sets of \(n\) uniform random variables corresponding to the \(x\) and \(y\) of the \(n\) points.

To sample on a general \(w \times h\) rectangle, we just need need to multiple the random \(x\) and \(y\) values by the respective dimensions of the rectangle \(w\) and \(h\). Of course the rectangle can also be shifted up or down by adding or subtracting \(x\) and \(y\) values appropriately.

Essentially every programming language has a function for generating uniform random numbers because it is the default random number generator, which all the other random number generators build off, by employing various techniques like applying transformations. In MATLAB, R and Python, it is respectively rand, runif and scipy.stats.uniform, which all generate uniform points on the open interval \((0,1)\).

Code

Here are some pieces of code that illustrates how to simulate a binomial point process on the unit square. I suggest downloading the code directly here from my repository. (Sometimes my webpage editor mangles simulation code.)

MATLAB

n=10; %number of points

%Simulate binomial point process
xx=rand(n,1);%x coordinates of binomial points
yy=rand(n,1);%y coordinates of binomial points

%Plotting
scatter(xx,yy);
xlabel('x');ylabel('y');

R

n=10; #number of points
#Simulate Binomial point process
xx=runif(n);#x coordinates of binomial points
yy=runif(n);#y coordinates of binomial points

#Plotting
plot(xx,yy,'p',xlab='x',ylab='y',col='blue');

Python

import numpy as np
import matplotlib.pyplot as plt

numbPoints = 10; # number of points

# Simulate binomial point process
xx = np.random.uniform(0, 1, numbPoints) # x coordinates of binomial points
yy = np.random.uniform(0, 1, numbPoints) # y coordinates of binomial points

# Plotting
plt.scatter(xx, yy, edgecolor='b', facecolor='none', alpha=0.5)
plt.xlabel('x')
plt.ylabel('y')
plt.show()

More complicated point processes

The binomial point process is very simple. One way to have a more complicated point process is to have a random number of points. The binomial point process is obtained from conditioning on the number of points of a homogeneous Poisson point process, implying its simulation builds directly off the binomial point process. The homogeneous Poisson point process is arguably the most used point process, and it is then in turned used to construct even more complicated points processes. This suggests that the homogeneous Poisson point process is the next point process we should try to simulate.

Further reading

For simulation of point processes, see, for example, the books Statistical Inference and Simulation for Spatial Point Processes by Møller and Waagepetersen, or Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke. There are books written by spatial statistics experts such as Stochastic Simulation by Ripley and Spatial Point Patterns: Methodology and Applications with R by Baddeley, Rubak and Turner, where the second book covers the spatial statistics R-package spatstat. Kroese and Botev also have a good introduction in the edited collection Stochastic Geometry, Spatial Statistics and Random Fields : Models and Algorithms by Schmidt, where the relevant chapter (number 12) is also freely available online. More general stochastic simulation books that cover relevant material include Uniform Random Variate Generation by Devroye and Stochastic Simulation: Algorithms and Analysis by Asmussen and Glynn.

For a mathematical introduction to finite point processes, see the standard reference An Introduction to the Theory of Point Processes by Daley and Vere-Jones (Chapter 5 in either the single-volume edition or the first volume of the second two-volume edition). Finite point processes are also covered in a recent tutorial on Palm calculus and Gibbs point processes.

Poisson point process

Balloons in scattered across a morning sky. Sand grains strewn on the ground. Seeds blown over a forest floor. Each of these phenomena can be represented mathematically as an object called a point process or random point field. Although it has a couple of mathematical interpretations, a point process is essentially just a collection of points randomly scattered on some mathematical space*, such as the real line, the Cartesian plane, a sphere, or more abstract spaces.

*The underlying mathematical space is sometimes called the carrier space or the state space, but the second term refers to something different from the state space used in the theory of stochastic processes.

The most important point process is the Poisson point process, which is one of the two most fundamental and studied mathematical objects in probability. (The other is the Wiener process or Brownian motion process, which is a type of random process or stochastic process, and it has been suggested by mathematicians such as John Kingman that the Poisson point process does not attract as much research attention as it should.) This point process can be defined on very general mathematical spaces, but usually the plane gives sufficient intuition. In this setting, each randomly located point can represent, for example, a star, a sand grain or a seed.

The most important defining property of the Poisson point process is that the numbers of points of the point process located in two (or more) non-overlapping (that is, disjoint) regions are two or more independent random variables. This property, sometimes called independent scattering or complete independence, explains the tremendous tractability of this point process, and it is used alongside the property that the random variables have Poisson distributions to define the Poisson point process.

To define a Poisson point process on some mathematical space, only a single mathematical object is needed. This object is applied to a region (or subset) of the underlying space on which the Poisson point process is defined, and returns a non-negative number. This object is a type of measure from measure theory, so it is called the mean measure or intensity measure. The mean measure can be interpreted as the mean or average number of points of a Poisson point process being located in a region of the underlying space.

Definition of a Poisson point process

A point process \(N\) defined on some underlying space \(S\) is a Poisson point process with intensity measure \(\Lambda\) if it has the two following properties:

1 The number of points in a bounded Borel set \(B \subset S\) is a Poisson random variable with mean \(\Lambda(B)\).

2 The number of points in \(n\) disjoint Borel sets forms \(n\) independent random variables.

A simple example of a mean measure of a Poisson point process is when the mean measure is given by the product of a non-negative constant and the area or volume of the region. The constant, often denoted by \(\lambda\), is known as the intensity or rate, which is often can be interpreted as the average density of points. In this setting, the average density does not vary over the underlying space, so the resulting point process is called a homogeneous Poisson point process or uniform Poisson point process, which is the simplest example of a Poisson point process.

If the intensity does change over the underlying space, meaning it is spatially dependent, then the terms inhomogeneous Poisson point process or nonhomogeneous Poisson point process are used. It is usually assumed that the intensity measure \(\Lambda\) has a derivative \(\lambda\), so it can be written as an integral:

$$\Lambda(B)=\int_B \lambda(x) dx, $$

where the set \(B\) is some subregion of the underlying state space \(S\). (As per standard probability assumptions, the set \(B\) has to be Borel measurable, but we do not focus on such details here.)

The Poisson point process is the cornerstone of fields where randomness meets geometry, such as spatial statistics, geometric probability and stochastic geometry. Researchers, scientists, and engineers have proposed using the Poisson point process to model various objects randomly positioned. In recent years, it has been used extensively to mathematically model the locations of transmitters and receivers in wireless communication networks such as cellular or mobile phone networks.

As a mathematical model, the Poisson point process should be used to represent objects or phenomena that have little or, ideally, zero interaction among the points. If that’s not the case, then the Poisson point process can also serve as a null-hypothesis model in statistics, whose rejection implies there is sufficiently strong interaction among the points. For example, the stars influence each other, undoubtedly, through gravity, and trees rely upon absorbing water in their vicinity through root systems, suggesting that non-Poisson models would be more suitable for representing these two examples. Other more sophisticated point processes that incorporate such point interaction have been developed. Many of these point processes build off the Poisson point process.

The Poisson point process is often called simply the Poisson process, where it can be confused with the related stochastic process of the same name. This Poisson process is a continuous-time discrete-valued stochastic process. The points in time where this stochastic process changes (or jumps) form the points of a Poisson point process on the real line. Depending on the literature, interpretation and preference, the Poisson point process is also called the Poisson random field and Poisson random measure.

The Poisson point process is a highly useful and used random object. But we now need to simulate it on a computer, which will be the subject of a future post.

Further reading

The Wikipedia article is a good starting point. The best book on the Poisson point process is the monograph Poisson Processes by Kingman. A recent and more theoretical book is Lecture Notes on the Poisson Process by Last and Penrose (the manuscript is freely available online here). More applied books include the classic Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke, and Statistical Inference and Simulation for Spatial Point Processes by Møller and Waagepetersen.

In recent years, there have been various books applying the Poisson point process to wireless networks, which include the two-volume Stochastic geometry and Wireless Networks by Baccelli and Błaszczyszyn, and Stochastic geometry for Wireless Networks by Haenggi. Much more recently, some colleagues and I contributed to this literature with Stochastic Geometry Analysis of Cellular Networks by Błaszczyszyn, Haenggi, Keeler, and Mukherjee.

An Improbable Start

This web log or blog, to the use parlance of our times, is a place for me to discuss and explains problems or research ideas that I am working on or just find interesting. The emphasis will be on words over equations, with the aim of trying to give an intuitive explanation for mathematical concepts encountered in applied probability.

Much of my work involves the use of random simulations, which are also called stochastic or Monte Carlo simulations, so I will often be posting on ideas illustrated with simulations. Most of my experience is using MATLAB, so that will be my default programming language, but I also have experience in the statistics-focused language R, which arguably has the best spatial statistics package spatstat going around. I also use Python coupled with appropriate libraries such as NumPy, especially for machine learning work.

I am considering learning other languages to do random simulation work. A possible candidate here is the relatively new Julia language, although nothing seems to compete against MATLAB in terms of user friendliness.

Feel free to contact me for questions or to point out mistakes. I do appreciate it. I may even reply.

Block arrivals in the Bitcoin blockchain

My collaborators and I recently uploaded a preprint:

  • Bowden, Keeler, Krzesinski, Taylor, Block arrivals in the Bitcoin blockchain, 2018.

The hint is in the title. In this work we study the arrivals of blocks in the Bitcoin blockchain by using both empirical (statistical) methods and mathematical (stochastic) models. To do this, we set up Bitcoin nodes around the world and collected blockchain mining data.

One of the main observations is that, according to standard tests, Bitcoin blocks do not arrive according to a Poisson stochastic process. The Poisson assumption, which seems very reasonable, appeared in the original white paper by the pseudonymous Satoshi Nakamoto:

  • Nakamoto, Bitcoin: A Peer-to-Peer Electronic Cash System, 2008.

Going back to our newly uploaded work, here’s the abstract giving some more details:

Bitcoin is a electronic payment system where payment transactions are verified and stored in a data structure called the blockchain. Bitcoin miners work individually to solve a computationally intensive problem, and with each solution a Bitcoin block is generated, resulting in a new arrival to the blockchain. The difficulty of the computational problem is updated every 2,016 blocks in order to control the rate at which blocks are generated. In the original Bitcoin paper, it was suggested that the blockchain arrivals occur according to a homogeneous Poisson process. Based on blockchain block arrival data and stochastic analysis of the block arrival process, we demonstrate that this is not the case. We present a refined mathematical model for block arrivals, focusing on both the block arrivals during a period of constant difficulty and how the difficulty level evolves over time.

This new work is a spiritual sequel to a paper that some of us published a couple of years ago:

  • Göbel, Keeler, Krzesinski, Taylor, Bitcoin Blockchain Dynamics: the Selfish-Mine Strategy in the Presence of Propagation Delay, 2016.

A preprint of this paper can be found here.