Introduction

“Coarse graining the data” is an intuition that floats around a lot in the AI safety literature, as we saw in last week's paper, “A Bayesian Interpretation of the Internal Model Principle”. The authors discuss the mathematical conditions required for an agent to effectively regulate its environment, and argue that for a controller to manage a complex system, it must possess an internal model of the dynamics it seeks to influence. They then mention that this modelling process inherently involves coarse graining. The agent cannot track the full, granular state of the world; instead, it must rely on a compressed representation that partitions the environment into manageable categories.

So, where does this intuition come from? And how is coarse graining (and renormalisation) done in physics? What can that teach us about AI?

Coarse graining, as it is used in physics, is a fundamental procedure that links detailed, microscopic descriptions of a physical system to simplified, macroscopic ones. We smooth over or neglect some of the fine details or degrees of freedom (DOFs) of a system in order to model its behaviour on larger spatial or temporal scales.

Coarse graining is often used in the reduction of one physical theory to another (e.g., thermodynamics to statistical mechanics), or for understanding how new, emergent properties arise in complex systems.

Coarse Graining and Renormalisation in Physics

Coarse graining is a fundamental procedure in physics used to link detailed microscopic descriptions of a system to simplified macroscopic ones. By smoothing over or neglecting fine details and degrees of freedom, we can model physical behaviour on larger spatial or temporal scales. This section outlines how these concepts are applied across different domains. We begin with statistical mechanics, looking at how the Gibbsian and Boltzmannian frameworks partition phase space to generate irreversibility and the arrow of time. We then examine the renormalisation group method, which investigates how system parameters change as the scale of observation varies, particularly near critical points. Following this, we look at quantum field theory, where renormalisation allows us to handle divergent integrals and infer the presence of unknown particles. Finally, we discuss molecular and condensed matter physics, where coarse graining is used to construct effective theories and potentials that make complex simulations computationally tractable.

Entropy

Entropy is extremely important in our conceptual understanding of the world. Our first definitions of entropy stem from thermodynamics, which was developed around the time we were getting used to steam engines. Only later did we try to understand how these macroscopic variables emerge from the microphysics. Nowadays, entropy is derived from statistical mechanics via a coarse graining procedure. This procedure gives us back our thermodynamic definition of entropy, a quantity that does not decrease in an isolated system and therefore encodes irreversibility. There are, however, two different ways to obtain the thermodynamic entropy from the microphysics, namely the Gibbsian and the Boltzmannian frameworks. I will treat them both below.

Gibbsian Coarse Graining

In the Gibbsian framework, the state of the system is a probability density function $\rho(x,t)$ defined over the full phase space $\Gamma$. The coarse-graining procedure is formally a projection operator that maps this detailed probability distribution onto a simplified, "smoothed" distribution.

We partition the continuous phase space $\Gamma$ into a set of discrete, non-overlapping cells (or bins) $\{\omega_i\}$ such that $\bigcup_i \omega_i = \Gamma$. These cells represent the finite resolution of our macroscopic instruments.

The coarse-graining operation transforms the fine-grained density $\rho(x,t)$ into the coarse-grained density $\bar{\rho}(x,t)$. This is mathematically defined as averaging the fine-grained probability over the volume of each cell.

If $V_i$ is the volume of cell $\omega_i$, the coarse-grained density is a piecewise constant function:

$\bar{\rho}(x,t) = \sum_{i} P_i(t) \cdot \mathbb{I}_{\omega_i}(x)$

Where $\mathbb{I}_{\omega_i}(x)$ is the indicator function (1 if $x \in \omega_i$, 0 otherwise), and $P_i(t)$ is the average density within the cell:

$P_i(t) = \frac{1}{V_i} \int_{\omega_i} \rho(x',t) \, dx'$
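The averaging step above can be sketched numerically. The following is a minimal illustration, not a definitive implementation: it assumes a one-dimensional "phase space" $[0, 1)$ sampled on a uniform grid, cells $\omega_i$ of equal volume, and a made-up finely oscillating density. The function name `coarse_grain` and all parameter values are hypothetical choices for this example.

```python
import numpy as np

def coarse_grain(rho, cell_size):
    """Average a gridded density over consecutive cells of `cell_size` points.

    Returns the cell averages P_i and the piecewise-constant density rho_bar.
    Assumes a uniform grid, so every cell has the same volume V_i and the
    cell integral divided by V_i reduces to a plain mean.
    """
    n_cells, remainder = divmod(rho.size, cell_size)
    if remainder != 0:
        raise ValueError("grid size must be a multiple of cell_size")
    P = rho.reshape(n_cells, cell_size).mean(axis=1)   # P_i
    rho_bar = np.repeat(P, cell_size)                  # sum_i P_i * indicator_i(x)
    return P, rho_bar

# Toy fine-grained density on [0, 1): positive, with structure on small scales.
n = 1000
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx
rho = 1.0 + 0.9 * np.sin(40 * np.pi * x)
rho /= rho.sum() * dx                                  # normalise to total probability 1

P, rho_bar = coarse_grain(rho, cell_size=100)          # 10 cells of volume 0.1
```

Two properties are worth checking in this sketch: `rho_bar.sum() * dx` stays equal to 1, so the averaging preserves total probability, and calling `coarse_grain` again on `rho_bar` returns `rho_bar` unchanged, which is what it means for the operation to be a projection.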

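The fine-grained entropy is $S_{FG} = -k_B \int \rho \ln \rho \, dx$. Due to Liouville's theorem, this value is constant in time under Hamiltonian evolution: the probability density flows through phase space like an incompressible fluid, so it can be stretched and folded but never diluted.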
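A short sketch of why this holds, under the assumption that $\rho$ falls off fast enough at the boundary of $\Gamma$ for surface terms to vanish. Write $v$ for the phase-space velocity field generated by the Hamiltonian (a symbol introduced here for illustration). Liouville's theorem then amounts to:

$\frac{\partial \rho}{\partial t} + v \cdot \nabla \rho = 0, \qquad \nabla \cdot v = 0$

Differentiating the entropy and substituting the first relation gives:

$\frac{dS_{FG}}{dt} = -k_B \int \frac{\partial \rho}{\partial t} (\ln \rho + 1) \, dx = k_B \int v \cdot \nabla(\rho \ln \rho) \, dx = k_B \int \nabla \cdot (v \, \rho \ln \rho) \, dx = 0$

The second equality uses $\nabla(\rho \ln \rho) = (\ln \rho + 1) \nabla \rho$, the third uses $\nabla \cdot v = 0$, and the final integral is a pure boundary term.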

The coarse-grained entropy, however, is calculated using the smoothed $\bar{\rho}$: