The Hammersley-Clifford theorem

In probabilistic modeling, graphs are super popular because we humans think a lot in terms of pairwise interactions. If X, Y, Z are random variables, then, for example, the small graph X - Y - Z expresses that X and Z are “more independent” than Y and Z, because the latter are connected by an edge.

From each graph one can make a so-called graphical model in two ways:

  1. Disconnectedness gives independence: Whenever two variables (nodes) are not connected by an edge, they are required to be conditionally independent given the remaining nodes. So X is independent of Z given Y in the above example. These are called the (pairwise) Markov conditions of the graph. A model can be defined as “all distributions that satisfy all pairwise Markov conditions of the graph”.
  2. Connectedness gives dependence: Consider all distributions that arise as products of functions, one for each clique of the graph and depending only on the variables in that clique. In the above example, consider all distributions that are products of two functions (one for each of the two edges), one depending only on X, Y and the other depending only on Y, Z.

This looks like statistical modeling, but this dual description is deeply rooted in geometry, where we can represent objects either as images of maps or as solutions of equations. For example, linear subspaces are both images of matrices and kernels of matrices. Representation 1 describes the model as the solution set of (polynomial) equations, while representation 2 is a parametrization, i.e., the image of a map.
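To make the two representations concrete, here is how they could be written for the graph X - Y - Z when all three variables are binary. The binary state space and the notation p_{xyz} for P(X=x, Y=y, Z=z), f_{xy} and g_{yz} are assumptions introduced here for illustration only.

```latex
% Representation 1: implicit equations.
% X independent of Z given Y means that, for each fixed y,
% the 2x2 matrix (p_{xyz})_{x,z} has rank at most 1.
p_{0y0}\, p_{1y1} - p_{0y1}\, p_{1y0} = 0, \qquad y \in \{0,1\}

% Representation 2: parametrization, one function per edge.
p_{xyz} = f_{xy}\, g_{yz}, \qquad x, y, z \in \{0,1\}
```

Substituting the second line into the first gives f_{0y} g_{y0} f_{1y} g_{y1} - f_{0y} g_{y1} f_{1y} g_{y0} = 0, so in this small case every parametrized distribution solves the equations.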

From general theory it follows that a distribution given by parameters (i.e., description 2 above) always satisfies all the Markov conditions. Now, the Hammersley-Clifford theorem states that if a distribution is strictly positive, the converse also holds: if it satisfies the Markov conditions, then it factorizes. This is useful not only in statistics, but also in statistical physics, e.g., for models of magnets.
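Both directions can be checked numerically on the small graph X - Y - Z. The following is only a sketch under assumptions that are not part of the original post: binary variables, randomly drawn positive edge factors `f` and `g`, and a hypothetical helper `marginal`. On this particular graph the factors can be read off from marginals directly; the general theorem for arbitrary graphs needs more than this.

```python
import itertools
import random

random.seed(0)
states = (0, 1)  # binary variables: an assumption made for illustration

# Description 2: one strictly positive factor per edge of X - Y - Z.
f = {(x, y): random.uniform(0.1, 1.0) for x in states for y in states}
g = {(y, z): random.uniform(0.1, 1.0) for y in states for z in states}

# Multiply the factors and normalize to obtain a probability distribution.
weights = {(x, y, z): f[x, y] * g[y, z]
           for x, y, z in itertools.product(states, repeat=3)}
total = sum(weights.values())
p = {xyz: w / total for xyz, w in weights.items()}


def marginal(p, keep):
    """Sum out every coordinate whose index is not listed in `keep`."""
    out = {}
    for xyz, prob in p.items():
        key = tuple(xyz[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out


p_xy = marginal(p, (0, 1))
p_yz = marginal(p, (1, 2))
p_y = marginal(p, (1,))

# Description 1: X is independent of Z given Y,
# written without division as p(x,y,z) * p(y) = p(x,y) * p(y,z).
markov_gap = max(abs(p[x, y, z] * p_y[(y,)] - p_xy[x, y] * p_yz[y, z])
                 for x, y, z in p)

# Converse on this graph: since p(y) > 0, the factors p(x,y) and
# p(y,z) / p(y) reproduce the joint distribution.
refactor_gap = max(abs(p[x, y, z] - p_xy[x, y] * p_yz[y, z] / p_y[(y,)])
                   for x, y, z in p)

print(markov_gap, refactor_gap)
```

Both printed gaps should be zero up to floating-point rounding. The division by p(y) in the second check is where positivity enters in this toy case.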

Last week, Seth Sullivant and I posted a new preprint in which we discuss a specific class of distributions which are not strictly positive but still satisfy the Hammersley-Clifford theorem. Their supports (the events with non-zero probability) need to form a natural distributive lattice, a notion from order theory that at first seems entirely unrelated. But who knows what is coincidence, and what has a deeper reason? We will see.

First posted in German on Eigenraum channels: Mastodon, Bluesky