Lecture slides based on Baccelli, Błaszczyszyn, and Karray

For those interested in the theory of point processes, Mohamed Karray has uploaded some lecture slides based on the book manuscript:

  • 2024 – Baccelli, Błaszczyszyn, and Karray – Random Measures, Point Processes, and Stochastic Geometry.

I have written about this manuscript in a previous post.

Although both the slides and the manuscript lean heavily on the side of mathematical rigour, applications are the driving motivation. For example, the third set of slides covers Palm calculus, which hinges upon the important concept of conditioning on a point (of the random point process) existing at a specific location. This idea is frequently used in mathematical models of wireless telecommunication networks based on stochastic geometry. (Unfortunately the Wikipedia article on Palm calculus is sorely lacking, as it only looks at the temporal, not spatial, setting.)

Karray, a researcher at Orange Labs in Paris, also has slides and (mostly forthcoming) manuscripts on machine learning and statistics, which take a rather rigorous route. There’s a French mathematical touch, as measure theory makes an appearance.

Quantum-enhanced Markov chain Monte Carlo

The not-so-mathematical journal Nature recently published a paper proposing a new Markov chain Monte Carlo method:

  • 2023 – Layden, Mazzola, Mishmash, Motta, Wocjan, Kim, and Sheldon – Quantum-enhanced Markov chain Monte Carlo.

Appearing earlier as this preprint, the paper’s publication in such a journal is a rare event indeed. This post notes this, as well as the fact that we can already simulate the paper’s test model, the Ising (or Potts) model, perfectly: for small instances of the model we can do this directly, while for large instances we can use the coupling-from-the-past method proposed by Propp and Wilson. (Wilhelm Lenz asked his PhD student Ernst Ising to study the one-dimensional version of the model; Renfrey Potts studied the generalization and presented it in his PhD thesis.) But this is a quantum algorithm, which is exciting and explains how it can end up in that journal.

The algorithm

The paper’s proposed algorithm adds a quantum mechanical edge, or enhancement, to the classic Metropolis-Hastings algorithm. (More accurately, it should be called the Metropolis-Rosenbluth-Rosenbluth-Teller-Teller-Hastings algorithm.) As I covered in a recent post, the original algorithm uses a Markov chain defined on some mathematical space. Running on a traditional (or classical) computer, at each time step the algorithm proposes a random jump and then accepts or rejects the proposed jump. Owing to the magic of Markov chains, in the long run the algorithm simulates a desired probability distribution.
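
To make the classical baseline concrete, here’s a minimal sketch (mine, not the paper’s code) of the Metropolis algorithm targeting a small one-dimensional Ising model, with arbitrary illustrative values for the inverse temperature and chain length:

import numpy as np

rng = np.random.default_rng(seed=1)
n, beta, num_steps = 20, 0.5, 10000  # number of spins, inverse temperature, chain length

def energy(spins):
    # 1D Ising energy with periodic boundary: minus the sum of neighbouring spin products
    return -np.sum(spins * np.roll(spins, 1))

spins = rng.choice([-1, 1], size=n)  # random initial configuration
for _ in range(num_steps):
    i = rng.integers(n)              # propose flipping one uniformly chosen spin
    proposal = spins.copy()
    proposal[i] = -proposal[i]
    delta = energy(proposal) - energy(spins)
    if rng.random() < np.exp(-beta * delta):  # Metropolis acceptance step
        spins = proposal             # accept the proposed jump
# in the long run, configurations follow the Boltzmann distribution, proportional to exp(-beta * energy)
print(spins, energy(spins))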

The new quantum version of the algorithm uses a quantum computer to propose the jump, while still using a classical computer to accept the proposal or not. (In the language of my Metropolis-Hastings post, the classical jumper process, a discrete-time Markov chain, is replaced with a quantum mechanical variant.) The quantum jump proposals are driven by a time-independent Hamiltonian, which is a central object in quantum physics and, in fact, all of physics. This leads to a Boltzmann (or Gibbs) probability distribution for the jumping process.

Running the quantum part on a quantum computer, the algorithm will hopefully outperform its classical counterpart. The paper nurtures this hope by giving empirical evidence of the algorithm’s convergence speed. The researchers performed the numerical experiments on a 27-qubit IBM quantum processor using the Qiskit platform.

Quantum is so hot right now

In recent years researchers have been focusing on algorithms that exploit the strangeness and spookiness of quantum mechanics. You will see more and more quantum versions of algorithms that appear in statistics, machine learning, and related fields, as suggested by this survey paper, which also appeared in Nature.

Quantum lite

Sometimes quantum mechanics only loosely inspires algorithms and models. In this vein, some of my machine learning work uses determinantal point processes. This kernel-based random model draws direct inspiration from the wave function, a standard object in quantum mechanics. Under suitable simplifying conditions, the model describes the locations of particles known as fermions, such as electrons and protons. Still, it’s fascinating that a quantum physics model inspired an interesting random object that has found applications in spatial statistics and machine learning.
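
To give a taste of the kernel-based definition (a toy numerical sketch of my own, with an arbitrary Gaussian kernel), the joint intensity of a determinantal point process at locations x_1, ..., x_n is the determinant det[K(x_i, x_j)], which vanishes as points approach each other, giving the characteristic repulsion:

import numpy as np

def K(x, y, sigma=1.0):
    # Gaussian kernel, an arbitrary illustrative choice
    return np.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def joint_intensity(points):
    # for a determinantal point process, the joint intensity of
    # locations x_1, ..., x_n is the determinant det[K(x_i, x_j)]
    mat = np.array([[K(x, y) for y in points] for x in points])
    return np.linalg.det(mat)

print(joint_intensity([0.0, 3.0]))  # well-separated points: close to 1
print(joint_intensity([0.0, 0.1]))  # nearby points: close to 0 (repulsion)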

Connectivity in device-to-device networks in Poisson-Voronoi cities

Here’s a recently uploaded manuscript:

  • 2023 – Keeler, Błaszczyszyn, and Cali – Connectivity and interference in device-to-device networks in Poisson-Voronoi cities.

https://arxiv.org/abs/2309.02137

This work presents numerical results complementing our earlier mathematical work. The work concerns (continuum) percolation results for a special network model based on Poisson-Voronoi tessellations.

The most relevant works are these two papers (the first being somewhat seminal):

  1. Dousse, Franceschetti, Macris, Meester, and Thiran, Percolation in the signal to interference ratio graph, 2006.
  2. Le Gall, Błaszczyszyn, Cali, and En-Najjary, Continuum line-of-sight percolation on Poisson-Voronoi tessellations, 2021.

Our work effectively seeks to combine these two papers. We obtain the equivalent results from the first paper by coupling its connectivity model with the connectivity and network models (based on a Cox point process) presented in the second paper.

If you want a more detailed version, here’s the abstract:

To study the overall connectivity in device-to-device networks in cities, we incorporate a signal-to-interference-plus-noise connectivity model into a Poisson-Voronoi tessellation model representing the streets of a city. Relays are located at crossroads (or street intersections), whereas (user) devices are scattered along streets. Between any two adjacent relays, we assume data can be transmitted either directly between the relays or through users, given they share a common street. Our simulation results reveal that the network connectivity is ensured when the density of users (on the streets) exceeds a certain critical value. But then the network connectivity disappears when the user density exceeds a second critical value. The intuition is that for longer streets, where direct relay-to-relay communication is not possible, users are needed to transmit data between relays, but with too many users the interference becomes too strong, eventually reducing the overall network connectivity. This observation on the user density evokes previous results based on another wireless network model, where transmitter-receivers were scattered across the plane. This effect disappears when interference is removed from the model, giving a variation of the classic Gilbert model and recalling the lesson that neglecting interference in such network models can give overly optimistic results. For physically reasonable model parameters, we show that crowded streets (with more than six users on a typical street) lead to a sudden drop in connectivity. We also give numerical results outlining a relationship between the user density and the strength of any interference reduction techniques.
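
To give a rough flavour of the connectivity rule involved (a simplified sketch of my own, not the paper’s exact model), here is the standard signal-to-interference-plus-noise ratio condition with power-law path loss; the power, path-loss exponent, noise, and threshold values are arbitrary:

import numpy as np

# illustrative parameters, not taken from the paper
P, beta, noise, tau = 1.0, 3.0, 1e-4, 0.5  # power, path-loss exponent, noise, SINR threshold

def path_loss(r):
    # standard power-law path loss
    return r ** (-beta)

def sinr(i, j, points):
    # SINR at point j for the signal from point i; all other points interfere
    signal = P * path_loss(abs(points[i] - points[j]))
    interference = sum(P * path_loss(abs(points[k] - points[j]))
                       for k in range(len(points)) if k not in (i, j))
    return signal / (noise + interference)

rng = np.random.default_rng(seed=1)
points = np.sort(rng.uniform(0, 10, size=8))  # users scattered along a street of length 10
print(sinr(0, 1, points) > tau)               # are the first two users connected?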

In future posts I’ll detail the above work as well as our more mathematical work on this type of percolation model.

Book manuscript: Random Measures, Point Processes, and Stochastic Geometry

Interested in the theory of point processes? There is the freely available online book manuscript:

  • Baccelli, Błaszczyszyn, and Karray – Random Measures, Point Processes, and Stochastic Geometry.

The authors are established names in the field, having written many papers on using point processes to model wireless networks.

(Disclaimer: I have worked and published with them. I have even co-authored a book with one of them.)

Of course there are already books on point processes, including the two-volume classic by Daryl Daley and David Vere-Jones. Although that work still serves as a good reference, since its publication researchers have produced many new results. This explains the publication of two more recent books on point processes and their generalization, random measures:

  • Last and Penrose – Lectures on the Poisson Process, Cambridge University Press.
  • Kallenberg – Random Measures, Theory and Applications, Springer.

There’s a free online version of the book by Last and Penrose.

(Disclaimer: Günter Last graciously mentioned my name in the acknowledgements for my very minor contribution to the historical section.)

The manuscript by Baccelli, Błaszczyszyn, and Karray covers newer topics and results, presented with mathematically rigorous proofs. For example, there are interesting results on determinantal point processes, also called fermion point processes, which started off as a model for subatomic particles in seminal work by Odile Macchi but have since found applications in wireless network models and machine learning. (I mentioned these later applications in this and this post.)

Despite applications being a strong motivation, the book is not for the mathematically faint-hearted. In the tradition of French probability, there is plenty of rigour, which means mathematical abstraction, analysis, and measure theory.

Update: Mohamed Karray has uploaded some lecture slides based on this manuscript. Karray is a researcher at Orange Labs in Paris with research interests in telecommunications (naturally), as well as machine (or statistical) learning.

Further reading

Classic books

Anyone who wants to learn anything about the Poisson point process must consult this gem of mathematical writing:

  • Kingman – Poisson Processes, Oxford University Press.

For a long time, the next two volumes were the standard reference on point processes:

  • Daley and Vere-Jones – An Introduction to the Theory of Point Processes: Volume I: Elementary Theory and Methods, Springer.
  • Daley and Vere-Jones – An Introduction to the Theory of Point Processes: Volume II: General Theory and Structure, Springer.

Unfortunately, some of the methods are dated and, I am told, there are some flaws in some of the mathematical proofs. Still, the work is packed full of results and references, serving as a good starting point.

Recent books

As I mentioned above, these two recent books cover the modern theory of point processes:

  • Last and Penrose – Lectures on the Poisson process, Cambridge University Press.
  • Kallenberg – Random Measures, Theory and Applications, Springer.

Determinantal point processes

The applications of determinantal point processes stem from their tractable mathematical properties, some of which were proved by researchers Shirai and Takahashi in two papers:

  • 2003 – Shirai and Takahashi – Random point fields associated with certain Fredholm determinants I: fermion, Poisson and boson point processes, Journal of Functional Analysis.
  • 2003 – Shirai and Takahashi – Random point fields associated with certain Fredholm determinants II: fermion shifts and their ergodic and Gibbs properties, Annals of Probability.

Another important paper on these point processes is

  • 2006 – Hough, Krishnapur, Peres, Virág – Determinantal processes and independence.


A Fields Medal goes to another percolation researcher

The Fields Medal is a prize in mathematics awarded every four years to two to four outstanding researchers (under forty years of age) working in mathematics. One of this year’s medals was awarded to the French mathematician Hugo Duminil-Copin, who has solved problems and obtained new results in percolation theory, which lies at the intersection of probability and statistical physics. Here’s a good Quanta article on Duminil-Copin and some of his work.

(The other winners are June Huh, James Maynard, and Maryna Viazovska.)

The Fields Medal committee has been kind to probability researchers in recent years. Previous winners working in probability include Wendelin Werner (2006), Stanislav Smirnov (2010), and Martin Hairer (2014), while other recent winners have also made contributions to probability.

All in all, that’s not too shabby for a discipline that for a long, long time wasn’t considered part of mathematics.  (That story deserves a post on its own.)

My research is nowhere near Duminil-Copin’s, but I have used some percolation theory in my work. I will write a couple of posts on percolation theory. Eventually, I may even mention some recent results that my collaborators and I have been working on.

News: Accelerated PyTorch learning on Mac

The PyTorch website reveals that I can now do accelerated model training with PyTorch on my Mac (which has one of the Apple Silicon M-series chips):

In collaboration with the Metal engineering team at Apple, we are excited to announce support for GPU-accelerated PyTorch training on Mac. Until now, PyTorch training on Mac only leveraged the CPU, but with the upcoming PyTorch v1.12 release, developers and researchers can take advantage of Apple silicon GPUs for significantly faster model training. This unlocks the ability to perform machine learning workflows like prototyping and fine-tuning locally, right on Mac.

As the website says, this new PyTorch acceleration capability is due to Metal, which is Apple’s graphics processing (or GPU) technology.

This means I no longer need to use remote machines when training models. I can do it all on my trusty home computer. Let’s check.

The easiest way to check is to first install PyTorch (if you haven’t already). I would use Anaconda:

conda install pytorch torchvision -c pytorch

With PyTorch installed, launch Python and run these commands:

import torch
print(torch.backends.mps.is_available())  # True if the MPS backend can be used
print(torch.backends.mps.is_built())      # True if this PyTorch build includes MPS support

If these commands run without a hitch, you have PyTorch installed. The two print statements should output True, meaning that PyTorch can use your graphics card for (accelerated) calculations.
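
To actually run computations on the GPU, you create tensors on (or move models to) the mps device. A quick sanity check:

import torch

device = torch.device("mps")         # Apple's Metal Performance Shaders backend
x = torch.rand(3, 3, device=device)  # create a tensor directly on the GPU
y = x @ x                            # this matrix product runs on the GPU
print(y.device)                      # should print: mps:0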

Frozen code

My simulation code has been frozen and buried in Norway. Well, some of my code that I keep on a GitHub repository has become part of a code preservation project. Consequently, beneath my profile it reads:

Arctic Code Vault Contributor

This is part of what is called the GitHub Archive Program. The people behind it aim to preserve large amounts of (open source) code for future generations, thousands and thousands of years from now. But how do they do that?

Well, basically, the good people at GitHub chose and converted many, many, many lines of code into certain error-resistant formats, such as QR codes. They then printed it all out and buried it deep in an abandoned mine shaft in frozen Norway. (Frozen, stable Norway is also home to a famous seed bank.)

My code in this project includes most of the code that has appeared in these posts. Of course my contribution is just a drop in the vast code ocean of this project. In fact, at least two or three of my colleagues have also had their code put into deep freeze.

Still, it’s a nice thought that stuff I wrote, including code for these very posts, will potentially be around for a very long time.

Coverage probability in wireless networks with determinantal scheduling

My collaborators and I uploaded a manuscript:

  • Błaszczyszyn, Brochard, and Keeler, Coverage probability in wireless networks with determinantal scheduling.

https://arxiv.org/abs/2006.05038

Details

The paper builds on some previous work of ours that uses a (relatively) new model in machine learning:

  • Błaszczyszyn and Keeler, Determinantal thinning of point processes with network learning applications.

https://arxiv.org/abs/1810.08672

The new machine learning model is based on a special type of point process called a determinantal point process, originally known as a fermion point process. These point processes are useful because they exhibit closure properties under certain operations, such as independent thinning.
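
For example (a numerical sketch of my own with an arbitrary kernel matrix), independently thinning a determinantal point process with retention probability p gives another determinantal point process with kernel pK, so determinants simply pick up a factor p per point:

import numpy as np

rng = np.random.default_rng(seed=1)
M = rng.normal(size=(3, 3))
K = M @ M.T / np.trace(M @ M.T)  # a valid kernel matrix (eigenvalues in [0, 1])
p = 0.4                          # retention probability of the independent thinning

# thinning a determinantal point process with kernel K gives one with kernel p*K,
# so the kernel determinant picks up a factor p for each of the n points
n = K.shape[0]
print(np.linalg.det(p * K))       # kernel determinant of the thinned process
print(p ** n * np.linalg.det(K))  # identical: p^n times the original determinant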

Kulesza and Taskar introduced and developed the framework for using determinantal point processes for machine learning models.

Code

The MATLAB code for producing the results in the paper can be found here:

https://github.com/hpaulkeeler/detcov_matlab

I also re-wrote the MATLAB code into Python:

https://github.com/hpaulkeeler/detcov_python

Some remarks regarding “On the Laplace Transform of the Aggregate Discounted Claims with Markovian arrivals”

Google Scholar has requested that I make freely available a copy of a published comment that my former PhD supervisor and I wrote a few years ago:

  • Keeler and Taylor, Some remarks regarding “On the Laplace Transform of the Aggregate Discounted Claims with Markovian arrivals”

OK, Google Scholar, here’s the manuscript that we submitted, which is basically the same as the published version.

The comment and the original paper cover an insurance model (for aggregate claims) that uses a Markovian arrival process. The analysis of such stochastic processes relies on matrix theory. But there was a small error in the original paper, where commutativity had been assumed, which is clearly not the case in general for matrices. Despite this error, the incorrect solution gave surprisingly accurate answers, so we investigated why that was.
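
As a generic numerical reminder (these are not the matrices from the paper), matrix products typically do not commute:

import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[1, 0], [1, 1]])
print(A @ B)  # [[2 1], [1 1]]
print(B @ A)  # [[1 1], [1 2]] -- so AB != BA in general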

Nothing groundbreaking here by us. We were just curious.

Determinantal thinning of point processes with network learning applications

My colleague and I uploaded a manuscript:

  • Błaszczyszyn and Keeler, Determinantal thinning of point processes with network learning applications.

https://arxiv.org/abs/1810.08672

Details

The paper uses a (relatively) new modelling framework in machine learning. This framework is based on a special type of point process called a determinantal point process, which is also called a fermion point process. (This particle model draws inspiration from the form of the wave function in quantum mechanics.) Kulesza and Taskar introduced and developed the framework for using determinantal point processes in machine learning models.
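
To give a flavour of the framework (a brute-force sketch of my own, feasible only for tiny ground sets), a discrete determinantal model known as an L-ensemble assigns a subset A of a finite ground set the probability det(L_A)/det(L + I), which we can sample exactly by enumerating all subsets:

import numpy as np
from itertools import combinations

rng = np.random.default_rng(seed=1)
n = 6                                # a tiny ground set, so all subsets can be enumerated
M = rng.normal(size=(n, n))
L = M @ M.T                          # an arbitrary positive semi-definite L-kernel

# P(A) = det(L_A) / det(L + I), where det of the empty matrix is 1 by convention
subsets = [s for k in range(n + 1) for s in combinations(range(n), k)]
probs = np.array([np.linalg.det(L[np.ix_(s, s)]) if s else 1.0 for s in subsets])
probs = probs / np.linalg.det(L + np.eye(n))
print(probs.sum())                   # sanity check: the probabilities sum to 1

index = rng.choice(len(subsets), p=probs)
print(subsets[index])                # one exact sample from the determinantal model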

Code

The MATLAB code for producing the results in the paper can be found here:

https://github.com/hpaulkeeler/DetPoisson_MATLAB

I also re-wrote (or translated) the MATLAB code into Python:

https://github.com/hpaulkeeler/DetPoisson_Python