## Cox point process

In previous posts I have stressed the importance of the Poisson point process, but it can be unsuitable for certain mathematical models.  We can generalize this point process by first considering a non-negative random measure, called a driving or directing measure. Then a Poisson point process, which is independent of the random driving measure, is generated by using the random measure as its intensity or mean measure. This doubly stochastic construction gives what is called a Cox point process.

Typically, we don’t observe the driving measure, so it’s impossible to distinguish a Cox point process from a Poisson point if there’s only one realization available. The properties of a Cox point process are derived by conditioning on the random driving measure, and then using the properties of the Poisson point process.

Cox point processes are also known as doubly stochastic Poisson point processes. Guttorp and Thorarinsdottir argue that it should be called the Quenouille point process, as Maurice Quenouille introduced an example of it before Sir David Cox, but I opt for the more common name.

In this post I’ll cover a couple examples of Cox point processes, but first I’ll give a more precise mathematical definition.

## Definition

We consider a point process defined on some underlying mathematical space $$\mathbb{S}$$, which is sometimes called the carrier space or state space.  For applications, the underlying space is often the real line $$\mathbb{R}$$, the plane $$\mathbb{R}^2$$, or some other familiar mathematical space like a square lattice.

For the first definition, we use the concept of a random measure.

Let $$M$$ be a non-negative random measure on $$\mathbb{S}$$. Then a point process $$\Phi$$ defined on some underlying space $$\mathbb{S}$$ is a Cox point process driven by the intensity measure $$M$$ if the conditional distribution of $$\Phi$$ is a Poisson point process with intensity function $$M$$.

Alternatively, we can give a slightly less general definition of a Cox  point process by using a random intensity function.

Let $$Z=\{Z(x):x\in\mathbb{S} \}$$ be a non-negative random field such that with probability one, $$x\rightarrow Z(x)$$ is a locally integrable function. Then a point process $$\Phi$$ defined on some underlying space $$\mathbb{S}$$ is a Cox point process driven by $$Z$$ if the conditional distribution of $$\Phi$$ is a Poisson point process with intensity function $$Z$$.

The random driving measure $$M$$ is then simply the integral
$$M(B)=\int_B Z(x)\, dx , \quad B\subseteq S.$$

## Over-dispersion

We will soon see that the random driving measures take different forms, giving different Cox point processes. But there is a general observation that can be made for all Cox point processes. For any region $$B \subseteq S$$, it can be shown that the number of points $$\Phi (B)$$ adheres to the inequality
$$\mathbb{Var} [\Phi (B)] \geq \mathbb{E} [\Phi (B)],$$

where $$\mathbb{Var} [\Phi (B)]$$ is the variance of the random variable $$\Phi (B)$$.  As a comparison, for a Poisson point process $$\Phi’$$, the variance of $$\Phi’ (B)$$ is simply $$\mathbb{Var} [\Phi’ (B)] =\mathbb{E} [\Phi’ (B)]$$.  Due to its greater variance, the Cox point process is said to be over-dispersed compared to the Poisson point process.

## Special cases

There is an virtually unlimited number of ways to define a random driving measure, where each one yields a different a Cox point process. But in general mathematicians are and practitioners are restricted by examining only tractable and interesting Cox point processes. I will give some common examples of such Cox point processes, but I stress that the Cox point process family is very large.

### Mixed Poisson point process

For the random driving measure $$M$$, an obvious example is the product form $$M= Y \mu$$,  where $$Y$$ is some independent non-negative random variable and $$\mu$$ is the Lebesgue measure on $$\mathbb{S}$$, meaning $$Y$$ is the only source of randomness. This driving measure gives the mixed Poisson point process.

### Log-Gaussian Cox point process

Instead of a random variable, we can use a non-negative random field to define a random driving measure.  We then have the product form $$M= Y \mu$$, where $$Y$$ is now some independent non-negative random field. (A random field is a collection of random variables indexed by some set, which in this case is the underlying space $$\mathbb{S}$$.)

Arguably the most tractable and used random field is the Gaussian random field. This random field, like Gaussian or normal random variables, takes negative and positive values, but if we define the random field such that its logarithm is a Gaussian field $$Z$$, then we obtain the non-negative random driving measure $$M=\mu e^Z$$, which gives the log-Gaussian Cox point process.

This point process has found applications in spatial statistics.

### Cox-Poisson line-point process

To construct this Cox point process, we first consider a Poisson line process, which I discussed previously.  Given a Poisson line process, we then place an independent one-dimensional Poisson point process on each line, then we obtain an example of a Cox point process, which we could call a Cox line-point process orCox-Poisson line-point process, but I am not sure of the name.

Researchers have recently used this point process to study wireless communication networks in cities, where the streets correspond to Poisson lines. For example, see these two preprints:

### Shot-noise Cox point process

We construct the next Cox point process by first considering a Poisson point process on the space $$\mathbb{S}$$ to create a shot noise term, which we then use as the driving measure of the Cox point process.

More specifically, we first introduce a kernel function $$k(\cdot,\cdot)$$ on $$\mathbb{S}$$, where $$k(x,\cdot)$$ is a probability density function for all points $$x\in \mathbb{S}$$. We then consider a Poisson point process $$\Phi’$$ on $$\mathbb{S}\times (0,\infty)$$, which we assume has a locally integrable intensity function $$\mu$$. (We can interpret  the point process $$\Phi’$$ as a spatially-dependent marked Poisson point process,  where the unmarked Poisson point process is defined on $$\mathbb{S}$$, and each point $$X$$ of this unmarked point process has a mark  $$T \in (0,\infty)$$ with probability density $$\mu(X,t)$$.) The resulting shot noise

$$Z(x)= \sum_{(Y,T)\in \Phi’} T \, k(Y,x)\,.$$

gives the random field, which we then use as the random intensity function to drive the shot-noise Cox point process.

In previous posts, I have detailed how to simulate non-Poisson point processes such as the Matérn and Thomas cluster point processes. These are, more specifically, examples of a Neyman-Scott point process, which is a special case of a shot noise Cox point process. All these point processes find applications in spatial statistics.

## Simulation

Unfortunately, there is no universal way to simulate all Cox point processes — and even if there were one, it would not be the most optimal way for every Cox point process. The simulation method depends on how the Cox point process is constructed, which usually means how its directing or driving measure is defined.

In previous posts I have presented ways (with code) to simulate the following Cox point processes:

In addition to the Matérn and Thomas point processes, there are ways to simulate to more general shot noise Cox point processes, which I will cover in another post.

For general Cox point processes, I suggest the following: Chapter 6 in the monograph Poisson processes by Kingman; Chapter 5 in Statistical Inference and Simulation for Spatial Point Processes by Møller and Waagepetersen; and Section 5.2 in Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke. For a much more mathematical treatment, see Chapter 13 in Lectures on the Poisson Process by Last and Penrose. Grandell wrote two detailed monographs titled Mixed Poisson Process and Doubly Stochastic Poisson Processes.

Motivated by applications in spatial statistics, Jesper Møller has (co)-written papers on specific Cox point processes. For example:

• 2001, Møller, Syversveen, and Waagepetersen, Log Gaussian Cox Processes;
• 2003, Møller, Shot noise Cox Processes;
• 2005, Møller and Torrisi,Generalised shot noise Cox processes.

I also suggest the survey article:

• 2003, Møller and Waagepetersen, Modern statistics for spatial point processes.

## Shot noise

Given a mathematical model based on a point process, a quantity of possible interest is the sum of some function applied to each point of the point process. This random sum is called shot noise, where the name comes from developing mathematical models of the noise measured in old electronic devices, which was likened to shot (used in guns) hitting a surface.

Researchers have long studied shot noise induced by a point process. One particularly application is wireless network models, in which the interference term is an example of shot noise. It is also possible to construct new point processes, called shot noise Cox point processes, by using based on the shot noise of some initial point process.

For such applications, we need a more formal definition of shot noise.

## Definition

### Shot noise of a point process

We consider a point processes $$\Phi=\{X_i\}_i$$ defined on some space $$\mathbb{S}$$, which is often $$\mathbb{R}^n$$, and a non-negative function $$f$$ with the domain $$\mathbb{S}$$, so $$f:\mathbb{S} \rightarrow [0,\infty)$$. This function $$f$$ is called the response function.

Then the shot noise is defined as
$$I= \sum_{X_i\in \Phi} f(X_i)\,.$$

### Shot noise of a marked point process

The previous definition of shot noise can be generalized by considering a marked point process $$\Phi’=\{(X_i, M_i)\}_i$$, where each point $$X_i$$ now has a random mark $$M_i$$, which can be a random variable some other random object taking values in some space $$\mathbb{M}$$. Then for a response function $$g:\mathbb{S}\times \mathbb{M} \rightarrow [0,\infty)$$ , the shot noise is defined as
$$I’= \sum_{(X_i, M_i)\in \Phi’} g(X_i,M_i)\,.$$

## Properties

Given a point process on a space, like the plane, at any point the shot noise is simply a random variable. If we consider a subset of the space, then shot noise forms a random field, where we recall that a random field is simply a collection of random variables indexed by some set. (By convention, the set tends to be Euclidean space or a manifold). The shot noise can also be considered as a random measure, for example
$$I(B)= \sum_{X_i\in \Phi\cap B} f(X_i)\,,$$
where $$B\subseteq \mathbb{S}$$. This makes sense as the point process $$\Phi$$ is an example of a random (counting) measure.

For Poisson point processes, researchers have studied resulting shot noise random variable or field. For example, given a homogeneous Poisson point process on $$\mathbb{R}^d$$, if the response function is a simple power-law $$f(x)=|x|^{-\beta}$$, where $$\beta> d$$ and $$|x|$$ denotes the Euclidean distance from the origin, then the resulting shot noise is alpha stable random variable with parameter $$\alpha=d/\beta$$.

For a general point process $$\Phi$$ with intensity measure $$\Lambda$$, the first moment of the shot noise is simply
$$\mathbb{E}(I)= \int_{\mathbb{S}} f(x) \Lambda (dx) \,.$$

This is a result of Campbell’s theorem or formula. A similar expression exists for the shot noise of a marked point process.

## Some history

Shot noise has been studied for over a century in science. In physics, Walter Schottky did research on shot noise in Germany at the beginning of the 20th century. In the same era, Norman R. Campbell studied shot noise in Britain and wrote two key papers, where one of them contains a result now called Campbell’s theorem or Campbell’s formula, among other names, which is a fundamental result in point process theory. Campbell was a physicist, but his work contains this mathematical result for which he credited the famed pure mathematician G. H. Hardy.

(It’s interesting to note that Hardy claimed years later that, given he did pure mathematics, none of his work would lead to applications, but that claim is simply not true for this and other reasons.)

The work on the physical process of shot noise motivated more probability-oriented papers on shot noise, including:

• 1944, S. O. Rice, Mathematical Analysis of Random Noise;
• 1960, Gilbert and Pollak. Amplitude distribution of shot noise;
• 1971, Daley, The definition of a multi-dimensional generalization of shot noise;
• 1977, J. Rice, On generalized shot noise;
• 1990 Lowen and Teich, Power-law shot noise.

As a model for interference in wireless networks, shot noise is covered in books such as the two-volume textbooks Stochastic Geometry and Wireless Networks by François Baccelli and Bartek Błaszczyszyn, where the first volume is on theory and the second volume is on applications. Martin Haenggi wrote a very readable introductory book called Stochastic Geometry for Wireless networks.

## Simulating a Cox point process based on a Poisson line process

In the previous post, I described how to simulate a Poisson line process, which in turn was done by using insight from an earlier post on the Bertrand paradox.

Now, given a Poisson line process, for each line, if we generate an independent one-dimensional Poisson point point process on each line, then we obtain an example of a Cox point process. Cox point processes are also known as doubly stochastic Poisson point processes. On the topic of names, Guttorp and Thorarinsdottir argue that it should be called the Quenouille point process, as Maurice Quenouille introduced an example of it before Sir David Cox, but I opt for the more common name.

## Cox point proceesses

A Cox point process is a generalization of a Poisson point process. It is created by first considering a non-negative random measure, sometimes called a driving measure. Then a Poisson point process, which is independent of the random driving measure, is generated by using the random measure as its intensity or mean measure.

The driving measure of a Cox point process can be, for example, a non-negative random variable or field multiplied by a Lebesgue measure. In our case, the random measure is the underlying Poisson line process coupled with the Lebesgue measure on the line (that is, length).

Cox processes form a very large and general family of point processes, which exhibit clustering. In previous posts, I have covered two special cases of Cox point processes: the Matérn and Thomas cluster point processes. These are, more specifically, examples of a Neyman-Scott point process, which is a special case of a shot noise Cox point process. These two point processes are fairly easy to simulate, but that’s not the case for Cox point processes in general. Some are considerably easier than others.

## Motivation

I will focus on simulating the Cox point process formed from a Poisson line process with homogeneous Poisson point processes. I do this for two reasons. First, it’s easy to simulate, given we can simulate a Poisson line process. Second, it has been used and studied recently in the mathematics and engineering literature for investigating wireless communication networks in cities, where the streets correspond to Poisson lines; for example, see these two preprints:

Incidentally, I don’t know what to call this particular Cox point process. A Cox line-point process? A Cox-Poisson line-point process? But it doesn’t matter for simulation purposes.

## Method

We will simulate the Cox (-Poisson line-) point process on a disk. Why a disk? I suggest reading the previous posts on the Poisson line process the Bertrand paradox for why the disk is a natural simulation window for line processes.

Provided we can simulate a Poisson line process, the simulation method is quite straightforward, as I have essentially already described it.

##### Line process

First simulate a Poisson line process on a disk. We recall that for each line of the line process, we need to generate two independent random variables $$\Theta$$ and $$P$$ describing the position of the line. The first random variable $$\Theta$$ gives the line orientation, and it is a uniform random variable on the interval $$(0,2\pi)$$.

The second random variables $$P$$ gives the distance from the origin to the disk edge, and it is a uniform random variable on the interval $$(0,r)$$, where $$r$$ is the radius of the disk. The distance from the point $$(\Theta, P)$$ to the disk edge (that is, the circle) along the chord is:

$$Q=\sqrt{r^2-P^2}.$$

The endpoints of the chord (that is, the points on the disk edge) are then:

Point 1: $$X_1=P \cos \Theta+ Q\sin \Theta$$, $$Y_1= P \sin \Theta- Q\cos \Theta$$,

Point 2: $$X_2=P \cos \Theta- Q\sin \Theta$$, $$Y_2= P \sin \Theta+Q \cos \Theta$$.

The length of the line segment is $$2 Q$$. We can say this random line is described by the point $$(\Theta,P)$$.

##### One-dimensional Poisson point process

For each line (segment) in the line process, simulate a one-dimensional Poisson point process on it. Although I have never discussed how to simulate a one-dimensional (homogeneous) Poisson point process, it’s essentially one dimension less than simulating a homogeneous Poisson point process on a rectangle.

More specifically, given a line segment $$(\Theta,P)=(\theta,p)$$, you simulate a homogeneous Poisson point process with intensity $$\mu$$ on a line segment with length $$2 q$$, where $$q=\sqrt{r^2-p^2}$$. (I am now using lowercase letters to stress that the line is no longer random.) To simulate the homogeneous Poisson point process, you generate a Poisson random variable with parameter $$2 \mu q$$.

Now you need to place the points uniformly on the line segment. To do this, consider a single point on a single line. For this point, generate a single uniform variable $$U$$ on the interval $$(-1,1)$$. The tricky part is now getting the Cartesian coordinates right. But the above expressions for the endpoints suggest that the single random point has the Cartesian coordinates:

$$x=p \cos \theta+ U q\sin \theta$$, $$y=p \sin \theta- U q\cos \theta$$.

The two extreme cases of the uniform random variable $$U$$ (that is, $$U=-1$$ and $$U=1$$) correspond to the two endpoints of the line segment. We recall that $$Q$$ is the distance from the midpoint of the line segment to the disk edge along the line segment, so it makes sense that we want to vary this distance uniformly in order to uniformly place a point on the line segment. This uniform placement step is done for all the points of the homogeneous Point process on that line segment.

You repeat this procedure for every line segment. And that’s it: a Cox point process built upon a Poisson line process.

## Code

As always, the code from all my posts is online. For this post, I have written the code in MATLAB, R and Python.

For the first step, the reading material is basically the same as that for the Poisson line process, which overlaps with that of the Bertrand paradox. For the one-dimensional Poisson point process, we can use the reading material on the homogeneous Poisson point process on a rectangle.

For general Cox point processes, I recommend starting with the following: Chapter 6 in the monograph Poisson processes by Kingman; Chapter 5 in Statistical Inference and Simulation for Spatial Point Processes by Møller and Waagepetersen; and Section 5.2 in Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke. For a much more mathematical treatment, see Chapter 13 in Lectures on the Poisson Process by Last and Penrose.

For this particularly Cox point process, see the two aforementioned preprints, located here and here.

## Simulating a Thomas cluster point process

Sometimes with just a little tweaking of a point process, you can get a new  point process. An example of this is the Thomas point process, which is a type of cluster point process, meaning that its randomly located points tend to form random clusters.  This point process is an example of a family of cluster point processes known as Neyman-Scott point processes, which have been used as models in spatial statistics and telecommunications. If that sounds familiar, that is because this point process is very similar to the Matérn point cluster process, which I covered in the previous post

The only difference between the two point processes is how the points are randomly located. In each cluster of a Thomas point process, each individual point is located according to two independent zero-mean normal variables with variance $$\sigma^2$$, describing the $$x$$ and $$y$$ coordinates relative to the cluster centre, whereas each point of a Matérn point process is located uniformly in a disk.

Working in polar coordinates, an equivalent way to simulate a Thomas point process is to use independent and identically-distributed Rayleigh random variables for the radial (or $$\rho$$) coordinates, instead of using random variables with a triangular distribution, which are used to simulate the Matérn point process. This method works because in polar coordinates a uniform random variable for the angular (or $$\theta$$ ) coordinate and a Rayleigh random variable for the angular (or $$\rho$$) is equivalent to in Cartesian coordinates two independent zero-mean normal variables. This is exactly the trick behind the Box-Muller transform for generating normal random variables using just uniform random variables.

If you’re familiar with simulating the Matérn point process, you can skip the next couple sections, and just read how the points are located.

## Overview

Simulating a Thomas cluster point process requires first simulating a homogeneous Poisson point process with intensity $$\lambda>0$$ on some simulation window, such as a rectangle, which is the simulation window I will use here. Then for each point of this underlying point process, simulate a Poisson number of points with mean $$\mu>0$$, and for each point simulate two independent zero-mean normal variables with variance $$\sigma^2$$, corresponding to the (relative) Cartesian coordinates .

The underlying  point process is sometimes called the parent (point) process, and its points are centres of the cluster disks. The subsequent point process on all the disks is called daughter (point) process and it forms the clusters. I have already written about simulating the homogeneous Poisson point processes on a rectangle and a disk, so  those posts are good starting points, and I will not focus  too much on details for these steps steps.

Importantly, like the Matérn point process, it’s possible for daughter points to appear in the simulation window that come from parents points outside the simulation window. To handle these edge effects, the point processes must be first simulated on an extended version of the simulation window. Then only the daughter points within the simulation window are kept and the rest are removed.

We can add a strip of some width $$d$$ all around the simulation window. But what value does $$d$$ take? Well, in theory, it is possible that a daughter point comes from a parent point that is very far from the simulation window. But that probability becomes vanishingly small as the distance increases, due to the daughter points being located according to zero-mean normal random variables.

For example, if a single parent is at a distance $$d=6 \sigma$$ from the simulation, then there is about a $1/1 000 000 000$ chance that a single daughter point will land in the simulation window. The probability is simply $$1-\Phi(6 \sigma)$$, where $$\Phi$$ is the cumulative distribution function of a normal variable with zero mean and standard deviation $$\sigma>0$$.  This is what they call a six sigma event.  In my code, I set $$d=6 \sigma$$, but $$d=4 \sigma$$ is good enough, which is the value that the R library spatstat uses by default.

Due to this approximation, this simulation cannot be called a perfect simulation, despite the approximation being highly accurate. In practice, it will not have no measurable effect on simulation results, as the number of simulations will rarely be high enough for (hypothetical) daughter points to come from (hypothetical) parent points outside the window.

## Steps

##### Number of points

Simulate the underlying or parent Poisson point process on the rectangle with $$N_P$$ points. Then for each point, simulate a Poisson number of points, where each disk now has $$D_i$$ number of points. Then the total number of points is simply $$N=D_1+\dots +D_{P}=\sum_{i=1}^{N_P}D_i$$. The random variables $$P$$ and $$D_i$$ are Poisson random variables with respective means $$\lambda A$$ and $$\mu$$, where $$A$$ is the area of the rectangular simulation window. To simulate these random variables in MATLAB, use the poissrnd function. To do this in R, use the  standard function rpois. In Python, we can use either functions scipy.stats.poisson or numpy.random.poisson from the SciPy or NumPy libraries.

##### Locations of points

The points of the parent point process are randomly positioned by using Cartesian coordinates. For a homogeneous Poisson point process, the $$x$$ and $$y$$ coordinates of each point are independent uniform points, which is also the case for the binomial point process, covered in an earlier post.

As mentioned in the introduction of this post, the points of all the daughter point process are randomly positioned by either using polar  coordinates or Cartesian coordinates, due to the Box-Muller transform. But because we ultimately convert back to Cartesian coordinates (for example, to plot the points), we will work entirely in this coordinate system.  Each point is then simply positioned with two independent zero-mean normal random variables, representing the $$x$$ and $$y$$ coordinates relative to the original parent point.

##### Shifting all the points in each cluster disk

In practice (that is, in the code), all the daughter points are simulated relative to the origin. Then for each cluster disk, all the points need to be shifted, so the origin coincides with the parent point, which completes the simulation step.

To use vectorization in the code,  the coordinates of each cluster point are repeated by the number of daughters in the corresponding cluster by using the functions repelem in MATLAB, rep in R, and repeat in Python.

## Code

I have implemented the simulation procedure in MATLAB, R and Python, which as usual are all very similar. The code can be downloaded here.

##### MATLAB
 %Simulation window parameters xMin=-.5; xMax=.5; yMin=-.5; yMax=.5; %Parameters for the parent and daughter point processes lambdaParent=10;%density of parent Poisson point process lambdaDautgher=100;%mean number of points in each cluster sigma=0.05;%sigma for normal variables (ie random locations) of daughters %Extended simulation windows parameters rExt=7*sigma; %extension parameter -- use factor of deviation %for rExt, use factor of deviation sigma eg 6 or 7 xMinExt=xMin-rExt; xMaxExt=xMax+rExt; yMinExt=yMin-rExt; yMaxExt=yMax+rExt; %rectangle dimensions xDeltaExt=xMaxExt-xMinExt; yDeltaExt=yMaxExt-yMinExt; areaTotalExt=xDeltaExt*yDeltaExt; %area of extended rectangle %Simulate Poisson point process for the parents numbPointsParent=poissrnd(areaTotalExt*lambdaParent,1,1);%Poisson number %x and y coordinates of Poisson points for the parent xxParent=xMinExt+xDeltaExt*rand(numbPointsParent,1); yyParent=yMinExt+yDeltaExt*rand(numbPointsParent,1); %Simulate Poisson point process for the daughters (ie final poiint process) numbPointsDaughter=poissrnd(lambdaDautgher,numbPointsParent,1); numbPoints=sum(numbPointsDaughter); %total number of points %Generate the (relative) locations in Cartesian coordinates by %simulating independent normal variables xx0=normrnd(0,sigma,numbPoints,1); yy0=normrnd(0,sigma,numbPoints,1); %replicate parent points (ie centres of disks/clusters) xx=repelem(xxParent,numbPointsDaughter); yy=repelem(yyParent,numbPointsDaughter); %translate points (ie parents points are the centres of cluster disks) xx=xx(:)+xx0; yy=yy(:)+yy0; %thin points if outside the simulation window booleInside=((xx&gt;=xMin)&amp;(xx&lt;=xMax)&amp;(yy&gt;=yMin)&amp;(yy&lt;=yMax)); %retain points inside simulation window xx=xx(booleInside); yy=yy(booleInside); %Plotting scatter(xx,yy);
##### R

Note: it is a bit tricky to write “<-” in the R code (as it automatically changes to the html equivalent in the HTML editor I am using), so I have usually used “=” instead of the usual “<-”.

 #Simulation window parameters xMin=-.5; xMax=.5; yMin=-.5; yMax=.5; #Parameters for the parent and daughter point processes lambdaParent=10;#density of parent Poisson point process lambdaDaughter=100;#mean number of points in each cluster sigma=0.05; #sigma for normal variables (ie random locations) of daughters #Extended simulation windows parameters rExt=7*sigma; #extension parameter #for rExt, use factor of deviation sigma eg 6 or 7 xMinExt=xMin-rExt; xMaxExt=xMax+rExt; yMinExt=yMin-rExt; yMaxExt=yMax+rExt; #rectangle dimensions xDeltaExt=xMaxExt-xMinExt; yDeltaExt=yMaxExt-yMinExt; areaTotalExt=xDeltaExt*yDeltaExt; #area of extended rectangle #Simulate Poisson point process for the parents numbPointsParent=rpois(1,areaTotalExt*lambdaParent);#Poisson number of points #x and y coordinates of Poisson points for the parent xxParent=xMinExt+xDeltaExt*runif(numbPointsParent); yyParent=yMinExt+yDeltaExt*runif(numbPointsParent); #Simulate Poisson point process for the daughters (ie final poiint process) numbPointsDaughter=rpois(numbPointsParent,lambdaDaughter); numbPoints=sum(numbPointsDaughter); #total number of points #Generate the (relative) locations in Cartesian coordinates by #simulating independent normal variables xx0=rnorm(numbPoints,0,sigma); yy0=rnorm(numbPoints,0,sigma); #replicate parent points (ie centres of disks/clusters) xx=rep(xxParent,numbPointsDaughter); yy=rep(yyParent,numbPointsDaughter); #translate points (ie parents points are the centres of cluster disks) xx=xx+xx0; yy=yy+yy0; #thin points if outside the simulation window booleInside=((xx&gt;=xMin)&amp;(xx&lt;=xMax)&amp;(yy&gt;=yMin)&amp;(yy&lt;=yMax)); #retain points inside simulation window xx=xx[booleInside]; yy=yy[booleInside]; #Plotting par(pty="s") plot(xx,yy,'p',xlab='x',ylab='y',col='blue');

Of course, as I have mentioned before, simulating a spatial point processes in R is even easier with the powerful spatial statistics library spatstat. The Thomas cluster point process is simulated by using the function rThomas, but other cluster point processes, including Neyman-Scott types, are possible.

##### Python

Note: in previous posts I used the SciPy functions for random number generation, but now use the NumPy ones, but there is little difference, as SciPy builds off NumPy.

 import numpy as np;&nbsp; # NumPy package for arrays, random number generation, etc import matplotlib.pyplot as plt&nbsp; # For plotting # Simulation window parameters xMin = -.5; xMax = .5; yMin = -.5; yMax = .5; # Parameters for the parent and daughter point processes lambdaParent = 10; # density of parent Poisson point process lambdaDaughter = 100; # mean number of points in each cluster sigma = 0.05; # sigma for normal variables (ie random locations) of daughters #Extended simulation windows parameters rExt=7*sigma; #extension parameter #for rExt, use factor of deviation sigma eg 6 or 7 xMinExt = xMin - rExt; xMaxExt = xMax + rExt; yMinExt = yMin - rExt; yMaxExt = yMax + rExt; # rectangle dimensions xDeltaExt = xMaxExt - xMinExt; yDeltaExt = yMaxExt - yMinExt; areaTotalExt = xDeltaExt * yDeltaExt; # area of extended rectangle # Simulate Poisson point process for the parents numbPointsParent = np.random.poisson(areaTotalExt * lambdaParent);# Poisson number of points # x and y coordinates of Poisson points for the parent xxParent = xMinExt + xDeltaExt * np.random.uniform(0, 1, numbPointsParent); yyParent = yMinExt + yDeltaExt * np.random.uniform(0, 1, numbPointsParent); # Simulate Poisson point process for the daughters (ie final poiint process) numbPointsDaughter = np.random.poisson(lambdaDaughter, numbPointsParent); numbPoints = sum(numbPointsDaughter); # total number of points # Generate the (relative) locations in Cartesian coordinates by # simulating independent normal variables xx0 = np.random.normal(0, sigma, numbPoints); # (relative) x coordinaets yy0 = np.random.normal(0, sigma, numbPoints); # (relative) y coordinates # replicate parent points (ie centres of disks/clusters) xx = np.repeat(xxParent, numbPointsDaughter); yy = np.repeat(yyParent, numbPointsDaughter); # translate points (ie parents points are the centres of cluster disks) xx = xx + xx0; yy = yy + yy0; # thin points if outside the simulation window booleInside=((xx&gt;=xMin)&amp;(xx&lt;=xMax)&amp;(yy&gt;=yMin)&amp;(yy&lt;=yMax)); # retain points inside simulation window xx = xx[booleInside]; yy = yy[booleInside]; # Plotting plt.scatter(xx, yy, edgecolor='b', facecolor='none', alpha=0.5); plt.xlabel("x"); plt.ylabel("y"); plt.axis('equal');
##### Julia

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.

## Results

The results show that the clusters of Thomas point process tend to be more blurred than those of Matérn point process, which has cluster edges clearly defined by the disks.  The points of of a Thomas point process can be far away from the centre of each cluster, depending on the variance of the normal random variables used in the simulation.

##### Python

Thomas (and Matérn) cluster point processes and more generally Neyman-Scott point processes are covered in standard books on the related fields of spatial statistics, point processes and stochastic geometry, such as the following: Spatial Point Patterns: Methodology and Applications with R by  Baddeley, Rubak and Turner, on page 461; Statistical Analysis and Modelling of Spatial Point Patterns Statistics by Illian, Penttinen, Stoyan, amd Stoyan, page 370 and Section 6.3.2; Stochastic Geometry and its Applications by Chiu, Stoyan, Kendall and Mecke,  on page 173; and; Statistical Inference and Simulation for Spatial Point Processes by Møller and Waagepetersen, in Section 5.3. I would probably recommend the first two books for beginners.

The Thomas point process has also appeared in models of wireless networks, which we covered in the book Stochastic Geometry Analysis of Cellular Networks by Błaszczyszyn, Haenggi, Keeler, and Mukherjee, Section 8.1.8.

More generally, Neyman-Scott point processes belong to a family of point processes called shot noise Cox point processes; see the paper by Møller.

There is the book Spatial Point Patterns: Methodology and Applications with R written by spatial statistics experts Baddeley, Rubak and Turner, which covers the spatial statistics (and point process simulation) R-package spatstat.

In my travels on the web, I found this post where the writer also simulates the Thomas and Matérn point processes in Python, independent of my code.  That code is a bit different to mine because I use the repeat function and simulate all the Poisson variables at once, instead of using a for-loop and simulating a Poisson variable for each iteration.  I also think it’s not quite correct because I do not see how they account for edge effects.

## Simulating a Matérn cluster point process

A Matérn cluster point process is a type of cluster point process, meaning that its randomly located points tend to form random clusters. (I skip the details here, but by using techniques from spatial statistics, it is possible to make the definition of clustering more precise.) This point process is an example of a family of cluster point processes known as Neyman-Scott point processes, which have been used in spatial statistics and telecommunications.

I should point out that the Matérn cluster point process should not be confused with the Matérn hard-core point process, which is a completely different type of point process. (For a research article, I have actually written code in MATLAB that simulates this type of point process.) Bertril Matérn proposed at least four types of point processes, and his name also refers to a specific type of covariance function used to define Gaussian processes.

## Overview

Simulating a Matérn cluster point process requires first simulating  a homogeneous Poisson point process with an intensity $$\lambda>0$$ on some simulation window, such as a rectangle, which is the simulation window I will use here. Then for each point of this underlying point process, simulate a Poisson number of points with mean $$\mu>0$$ uniformly on a disk with a constant radius $$r>0$$. The underlying  point process is sometimes called the parent (point) process, and its points are centres of the cluster disks.

The subsequent point process on all the disks is called daughter (point) process and it forms the clusters. I have already written about simulating the homogeneous Poisson point processes on a rectangle and a disk, so those posts are good starting points, and I will not focus  too much on details for these steps.

### Edge effects

The main challenge behind sampling this point process, which I originally forgot about in an earlier version of this post, is that it’s possible for daughter points to appear in the simulation window that come from parents points outside the simulation window. In other words, parents points outside the simulation window contribute to points inside the window.

To remove these edge effects, the point processes must be simulated on an extended version of the simulation window. Then only the daughter points within the simulation window are kept and the rest are removed.  Consequently, the points are simulated on an extended window, but we only see the points inside the simulation window.

To create the extended simulation window, we can add a strip of width $$r$$ all around the simulation window. Why? Well, the distance $$r$$ is the maximum distance from the simulation window that a possibly contributing parent point (outside the simulation window) can exist, while still having daughter points inside the simulation window. This means it is impossible for a hypothetical parent point beyond this distance (outside the extended window) to generate a daughter point that can fall inside the simulation window.

## Steps

##### Number of points

Simulate the underlying or parent Poisson point process on the rectangle with $$N_P$$ points. Then for each point, simulate a Poisson number of points, where each disk now has $$D_i$$ number of points. Then the total number of points is simply $$N=D_1+\dots +D_{P}=\sum_{i=1}^{N_P}D_i$$. The random variables $$P$$ and $$D_i$$ are Poisson random variables with respective means $$\lambda A$$ and $$\mu$$, where $$A$$ is the area of the rectangular simulation window. To simulate these random variables in MATLAB, use the poissrnd function. To do this in R, use the  standard function rpois. In Python, we can use either functions scipy.stats.poisson or numpy.random.poisson from the SciPy or NumPy libraries.

##### Locations of points

The points of the parent point process are randomly positioned by using Cartesian coordinates. For a homogeneous Poisson point process, the $$x$$ and $$y$$ coordinates of each point are independent uniform points, which is also the case for the binomial point process, covered in a previous post. The points of all the daughter point process are randomly positioned by using polar  coordinates. For a homogeneous Poisson point process, the $$\theta$$ and $$\rho$$ coordinates of each point are independent  variables,  respectively with uniform and triangle distributions, which was covered in a previous post. Then we  convert coordinates back to Cartesian form, which is easily done in MATLAB with the pol2cart function. In languages without such a function: $$x=\rho\cos(\theta)$$ and $$y=\rho\sin(\theta)$$.

##### Shifting all the points in each cluster disk

In practice (that is, in the code), all the daughter points are simulated in a disk with its centre at the origin. Then for each cluster disk, all the points have to be shifted to the origin is the center of the cluster, which completes the simulation step.

To use vectorization in the code,  the coordinates of each cluster point are repeated by the number of daughters in the corresponding cluster by using the functions repelem in MATLAB, rep in R, and repeat in Python.

## Code

I’ll now give some code in MATLAB, R and Python, which you can see are all very similar. It’s also located here.

##### MATLAB

The MATLAB code is located here.

#### R

The R code is located here.

Of course, as I have mentioned before, simulating a spatial point processes in R is even easier with the powerful spatial statistics library spatstat. The Matérn cluster point process is simulated by using the function rMatClust, but other cluster point processes, including Neyman-Scott types, are possible.

##### Python

The Pyyhon code is located here.

Note: in previous posts I used the SciPy functions for random number generation, but now use the NumPy ones, but there is little difference, as SciPy builds off NumPy.

##### Julia

After writing this post, I later wrote the code in Julia. The code is here and my thoughts about Julia are here.