Beyond the Poisson point process

As great as the Poisson point process is (and it is pretty great), it is sadly not always suitable for mathematical models. The tractability of this point process is due to the independence of the locations of its points. Informally, this means that the locations of the points of a Poisson point process in one region do not affect the probability of finding points in any other, disjoint region. But such independence may not hold, even approximately, when trying to develop a mathematical model for certain phenomena.

Clustering and Repulsion

One can quickly think of examples where the Poisson point process is not a suitable model. For example, if a star is part of a galaxy, then it is more likely that another star will be located nearby. Conversely, given the location of a tree in a forest, it is usually less likely that another tree will be found very close by, because each tree needs a certain amount of land to draw water from the earth. In the language of point processes, we say that the stars tend to show clustering, while the trees tend to show repulsion.

To better model phenomena like trees and stars, we can use point processes that exhibit clustering or repulsion. In fact, a good part of spatial statistics has been dedicated to developing statistical tools for testing whether repulsion or clustering exists in observed point patterns, which is the spatial statistics term for samples of objects that can be represented as points in space. (A point process is a random object, so a single realization or outcome of a point process is an example of a point pattern.)

The Poisson point process sits between these two categories: because its points are located independently of each other, they show neither clustering nor repulsion. Mathematically, this can be made more formal by, for example, using factorial moment measures, which are mathematical objects used to study point processes.
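As a rough illustration of how factorial moment measures capture this, consider the pair correlation function. The sketch assumes a stationary point process with intensity \(\lambda\) whose second factorial moment measure has a density \(\rho^{(2)}\). These symbols are introduced only for this example. The pair correlation function is then
\[
g(x,y)=\frac{\rho^{(2)}(x,y)}{\lambda^2}.
\]
For a Poisson point process \(\rho^{(2)}(x,y)=\lambda^2\), so \(g(x,y)=1\) everywhere. Values \(g(x,y)>1\) suggest clustering, while values \(g(x,y)<1\) suggest repulsion, at that pair of locations.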

For probability applications, Błaszczyszyn and Yogeshwaran developed a framework using factorial moment measures, which allowed them to classify point processes into what they called super-Poisson and sub-Poisson, referring respectively to point processes whose points tend to cluster more and to repel more than those of a Poisson point process.

Point Process Operations

If a Poisson point process is not suitable for certain models, then we need to develop and use other point processes that exhibit clustering or repulsion. Fortunately, one way to develop such point processes is to apply certain point process operations to Poisson point processes, and to point processes in general. For developing new point processes, researchers have largely studied three types of point process operations: thinning, superposition, and clustering. (But there are other operations one can apply to a point process, such as randomly moving the points.)

Thinning

To apply the thinning operation means to use some rule for selectively removing points from a point process \(\Phi\) to form a new point process \(\Phi_p\). A rule may be purely random such as the rule known as \(p\)-thinning. For this rule, each point of \(\Phi\) is independently removed (or kept) with some probability \(p\) (or \(1-p\)). This thinning method can be likened to looking at each point, flipping a biased coin with probability \(p\) for heads, and removing the point if a head occurs.

This rule may be generalized by introducing a non-negative function \(p(x)\leq 1\), where \(x\) is a point in the space on which the point process is defined. This allows us to define a location-dependent \(p(x)\)-thinning, where now the probability of a point being removed is \(p(x)\) and is dependent on where the point \(x\) of \(\Phi\) is located on the underlying space.
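To make the two rules concrete, here is a minimal sketch of both \(p\)-thinning and \(p(x)\)-thinning, assuming a homogeneous Poisson point process simulated on the unit square with NumPy. The helper name `thin`, the intensity, the seed, and the particular choice \(p(x)\) equal to the first coordinate of \(x\) are all illustrative assumptions, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, only for reproducibility

# Simulate a homogeneous Poisson point process on the unit square.
lam = 500  # intensity (mean number of points per unit area); illustrative value
n_points = rng.poisson(lam)
points = rng.uniform(0, 1, size=(n_points, 2))


def thin(points, removal_prob, rng):
    """Independently remove each point x with probability removal_prob(x).

    removal_prob maps an (n, 2) array of locations to n values in [0, 1].
    Returns the points that survive the thinning.
    """
    p = removal_prob(points)
    # A point is kept when its uniform variable is at least p(x),
    # which happens with probability 1 - p(x).
    keep = rng.uniform(size=len(points)) >= p
    return points[keep]


# p-thinning: the same removal probability everywhere.
p_const = 0.3
thinned_const = thin(points, lambda xy: np.full(len(xy), p_const), rng)

# p(x)-thinning: here the removal probability equals the first coordinate,
# so points near the right-hand edge are removed more often.
thinned_loc = thin(points, lambda xy: xy[:, 0], rng)

print(len(points), len(thinned_const), len(thinned_loc))
```

Plotting `thinned_loc` should show a pattern that becomes sparser from left to right, whereas `thinned_const` should look like a sparser version of the original pattern.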

The thinning operation is very useful, and I will write more about it in another post, including some examples implemented in code.

Superposition

The superposition of two or more point processes simply means taking the union of two or more point processes. (Point processes can be considered as random sets, which is why point process notation consists of notation from set theory, as well as other mathematical branches.)

More formally, if there is a countable collection of point processes \(\Phi_1,\Phi_2,\dots\), then their superposition
\[
\Phi=\bigcup_{i=1}^{\infty}\Phi_i,
\]
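also forms a point process, provided the union is still locally finite, meaning that only finitely many points fall in any bounded region. If the point processes are all independent and Poisson, then under the same condition the superposition will be another Poisson point process, whose intensity measure is the sum of the individual intensity measures, meaning we have not produced a new type of point process.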
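A quick numerical check of this fact, as a sketch only: superimpose two independent homogeneous Poisson point processes on the unit square many times and compare the mean and variance of the total number of points, which should both be close to the sum of the two intensities if the counts are Poisson. The intensities, the seed, the number of simulations, and the helper `poisson_unit_square` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=2)  # arbitrary seed, only for reproducibility


def poisson_unit_square(lam, rng):
    """Simulate a homogeneous Poisson point process on the unit square."""
    n = rng.poisson(lam)
    return rng.uniform(0, 1, size=(n, 2))


lam_1, lam_2 = 100, 250  # illustrative intensities
n_sim = 2000             # number of simulated superpositions

counts = np.empty(n_sim)
for i in range(n_sim):
    phi_1 = poisson_unit_square(lam_1, rng)
    phi_2 = poisson_unit_square(lam_2, rng)
    phi = np.vstack((phi_1, phi_2))  # superposition: union of the two point sets
    counts[i] = len(phi)

# For a Poisson random variable the mean equals the variance,
# so both should be close to lam_1 + lam_2.
print(counts.mean(), counts.var())
```

Matching mean and variance is of course only a necessary condition for the counts to be Poisson, but it is a cheap sanity check.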

Clustering

Related to superposition is a point process operation known as clustering, which entails replacing every point \(x\) in a given point process \(\Phi\) with a cluster of points \(N^x\). Each cluster is also a point process, but with a finite number of points. The union of all the clusters forms a cluster point process, that is
\[
\Phi_c=\bigcup_{x\in \Phi}N^x.
\]

In two previous posts I have already used this point process operation to construct the Matérn and Thomas (cluster) point processes, which both involve an underlying Poisson point process. Each point of this underlying point process was assigned a cluster with a Poisson random number of points, and these points were then either scattered uniformly on a disk centered at that point (for Matérn) or scattered around it according to a two-dimensional normal distribution (for Thomas). Both are members of a family of point processes called Neyman-Scott point processes.
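The clustering operation just described can be sketched as follows for the Matérn case, assuming a unit-square observation window and NumPy. The parameter values and variable names are illustrative assumptions. The parent points are simulated on a window extended by the cluster radius, which is one common way to reduce edge effects.

```python
import numpy as np

rng = np.random.default_rng(seed=3)  # arbitrary seed, only for reproducibility

lam_parent = 10      # intensity of the underlying (parent) Poisson point process
mean_daughters = 20  # mean number of points in each cluster
radius = 0.05        # radius of the disk on which each cluster is scattered

# Simulate parents on a window extended by the radius, so that clusters
# centered just outside the unit square can still contribute points.
x_min, x_max = -radius, 1 + radius
area_ext = (x_max - x_min) ** 2
n_parents = rng.poisson(lam_parent * area_ext)
parents = rng.uniform(x_min, x_max, size=(n_parents, 2))

# Each parent gets a Poisson number of daughter points.
n_daughters = rng.poisson(mean_daughters, size=n_parents)
total = n_daughters.sum()

# Uniform points on a disk via polar coordinates; the square root
# on the radial part is what makes the points uniform in area.
r = radius * np.sqrt(rng.uniform(size=total))
theta = 2 * np.pi * rng.uniform(size=total)
offsets = np.column_stack((r * np.cos(theta), r * np.sin(theta)))

# Replace every parent by its cluster: repeat each parent location
# once per daughter and add the daughter's offset.
clustered = np.repeat(parents, n_daughters, axis=0) + offsets

# Keep only the points inside the unit-square observation window.
inside = np.all((clustered >= 0) & (clustered <= 1), axis=1)
clustered = clustered[inside]

print(len(clustered))
```

For a Thomas-type variant, the `offsets` would instead be drawn from a two-dimensional normal distribution, for example with `rng.normal(scale=sigma, size=(total, 2))` for some assumed standard deviation `sigma`.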

Clustering or repulsion?

I mentioned earlier that in spatial statistics there are statistical tools for testing whether clustering or repulsion exists in observed point patterns, usually by comparing them to the Poisson point process, which often serves as a benchmark. For example, in spatial statistics the second factorial moment measure is used for the descriptive statistic called Ripley’s \(K\)-function and its rescaled version, Ripley’s \(L\)-function. Keeping with the alphabetical theme, another example of such a statistic is the \(J\)-function, which was introduced by Van Lieshout and Baddeley.
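To give a feel for how such a comparison works, here is a naive sketch of estimating Ripley’s \(K\)-function for a point pattern on the unit square. It is not how spatial statistics packages do it: instead of a proper edge correction, it assumes periodic (torus) distances, and the function name `ripley_k_torus` is hypothetical. For a homogeneous Poisson point process in two dimensions the benchmark is \(K(r)=\pi r^2\), which corresponds to \(L(r)=\sqrt{K(r)/\pi}=r\).

```python
import numpy as np


def ripley_k_torus(points, r_values):
    """Naive estimate of Ripley's K on the unit square.

    Uses periodic (torus) distances in place of an edge correction,
    which is a simplifying assumption. Radii should not exceed 0.5.
    """
    n = len(points)
    # Coordinate-wise absolute differences between all pairs of points.
    diff = np.abs(points[:, None, :] - points[None, :, :])
    diff = np.minimum(diff, 1 - diff)        # wrap around the unit square
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    dist = dist[~np.eye(n, dtype=bool)]      # drop the i == j pairs
    area = 1.0                               # area of the unit square
    # Fraction of ordered pairs within distance r, scaled by the window area.
    return np.array([area * np.sum(dist <= r) / (n * (n - 1)) for r in r_values])


rng = np.random.default_rng(seed=4)  # arbitrary seed, only for reproducibility

# A Poisson point pattern on the unit square (illustrative intensity).
n_points = rng.poisson(400)
points = rng.uniform(0, 1, size=(n_points, 2))

r_values = np.array([0.05, 0.1, 0.15])
k_hat = ripley_k_torus(points, r_values)
k_poisson = np.pi * r_values ** 2      # benchmark for a Poisson point process
l_hat = np.sqrt(k_hat / np.pi)         # L-function; roughly r for Poisson

print(k_hat, k_poisson, l_hat)
```

Estimates well above \(\pi r^2\) would suggest clustering at that scale, and estimates well below it would suggest repulsion.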