1. Introduction

Published: J. Stat. Mech. (2024) 093209

The development of quantum theory began with the discovery that energy radiating from a body at thermal equilibrium is not distributed among frequencies ($f$) as expected from (classical) statistical mechanics [Planck, 1901][Mehra, 2000][Gorroochurn, 2018]. The only ways found to derive the experimentally-observed distribution involved assuming that either radiation itself, or the energy of an emitter of thermal radiation, was quantized into indivisible amounts $h f$, where $h\approx 6.6\times 10^{-34}\,\mathrm{m^2\,kg/s}$ became known as Planck's constant [Planck, 1901]. The distribution of energy among frequencies that this quantization implies became known as the Bose-Einstein distribution, in recognition of the refinement and extension of Planck's work by Bose and Einstein [Bose, 1924][Einstein, 1924][Einstein, 1925].

The discrepancy between the observed spectrum of a hot object and the expected one implied that the expectation was wrong. Planck's recognition that it could be resolved by assuming that light emitters have quantized energies [Planck, 1901] led Einstein to the conclusion that the energy of light itself is quantized [Einstein, 1905]. Light quanta later became known as photons [Lewis, 1926]. Here I show that the discrepancy can be resolved without concluding that either light itself, or emitters of light, have quantized energies. It can be resolved by assuming the existence of a universal lower bound on the precision to which the instantaneous microstate of any classically-evolving degree of freedom can be measured or known.

The Bose-Einstein distribution is generally regarded as among the most significant deviations of quantum physics from classical physics, and among the characteristics by which bosons differ from fermions. However, the derivation of it presented here implies that any sufficiently-cold continuously-evolving classical dynamical system would be observed to obey Bose-Einstein statistics if the information provided by observations and measurements of it was limited by an uncertainty principle of the form $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}>{h_?}>0$, where $\Delta_{\mathrm{Q}}$ and $\Delta_{\mathrm{P}}$ are the uncertainties in the values of the canonically-conjugate variables that specify a microstate of a single degree of freedom.

1.1 Assumptions

Consider an arbitrary isolated continuously-evolving deterministic dynamical system, and let the instantaneous microstate of one of its degrees of freedom (DOF) be specified by a coordinate $Q_t\in\mathbb{Q}$ and the coordinate's conjugate momentum, $P_t\in\mathbb{P}$, where $\mathbb{Q}\cong\mathbb{R}$ and $\mathbb{P}\cong\mathbb{R}$ are the spaces of all possible coordinates and momenta, respectively.

Since the precision of any measurement is finite, the location of the point $\Gamma_t\equiv(Q_t,P_t)$ in the DOF's phase space, $\mathbb{G}\equiv\mathbb{Q}\times\mathbb{P}\cong\mathbb{R}\times\mathbb{R}$, cannot be known to infinite precision. There is always some degree of uncertainty in the values of $Q_t$ and $P_t$. Therefore the only state of certain knowledge, as distinct from probabilistic knowledge, that an observer could possess about the point $\Gamma_t$ is that it is somewhere in a specified finite-area subset of $\mathbb{G}$.

For simplicity, and because only the most accurate and precise measurements of the microstate are relevant to this work, let us assume that all measurements of $Q_t$ and $P_t$ result in the identifications of interval subsets of $\mathbb{Q}$ and $\mathbb{P}$, respectively, which are certain to contain them. Then all states of sufficiently-high certain knowledge about the location of $\Gamma_t$ in $\mathbb{G}$ can be communicated as four values, $Q$, $P$, $\Delta_{\mathrm{Q}}$, and $\Delta_{\mathrm{P}}$, which specify a rectangular subset of $\mathbb{G}$ with vertices $(Q\pm\Delta_{\mathrm{Q}}/2,P\pm\Delta_{\mathrm{P}}/2)$ that is known to contain $(Q_t,P_t)$.

The precisions, $\Delta_{\mathrm{Q}}$ and $\Delta_{\mathrm{P}}$, to which $Q_t$ and $P_t$ can be determined depend in part on the microstate of the dynamical system at the time of measurement, and in part on how the measurement is performed.

1.1.1 Uncertainty principle

The unavoidably perturbative nature of the act of observation, and the fact that it is impossible for an observer to possess an infinite amount of information, imply that $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}} >0$, but do not necessarily imply that the finite value of $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}$ cannot be arbitrarily small.

However a finite universe contains a finite amount of information. Therefore a lower bound on the value of $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}$ must exist if the universe is finite. For example, it is safe to say that the values of $\Delta_{\mathrm{Q}}$ and $\Delta_{\mathrm{P}}$ cannot be smaller than $10^{-N_p}$ in SI units, where $N_p$ is the number of particles in the universe. Therefore it is safe to say that there never has been, and never will be, a measurement of a DOF's microstate which determined the location of the microstate in its phase space to within an area of less than $10^{-2N_p}$ in SI units.

This extreme example demonstrates that $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}$ would be bounded from below in a finite universe, and its extremeness illustrates that larger universal lower bounds on microstate precision must also exist (e.g., $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}> 10^{-N_p}$ in SI units). Only the largest of all universal lower bounds would be relevant to this work.

In an infinite universe, it is not immediately obvious that $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}$ cannot be arbitrarily small. However, uncertainty principles of the form $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}>{h_?}$ arise in many contexts, and these uncertainty principles do not need to be universal (applicable in every possible context) for the derivation presented in this work to imply that, at low $T$, the distribution of a classical system's energy would appear to have the Bose-Einstein form in the context in which a particular uncertainty principle applies and is inviolable.

For example, Ref. [Tangney, 2024] examines the relationship between macrostructure and microstructure, where macrostructure is the homogenized form that a microscopically-fluctuating classical field (the microstructure) is observed to have on a much larger time and/or length scale (the macroscale). It is shown that an uncertainty principle of the form $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}>{h_?}$ applies at the macroscale when the probe used for all measurements is a macroscopic field (i.e., a homogenized microscopic field). Therefore if the direct or indirect source of all empirical knowledge was measurements with a macroscopic field, the energies of all sufficiently-cold classical systems would appear to be Bose-Einstein distributed.

Another example would be a dynamical system that was immersed in a bounded elastic medium. The wavelengths and frequencies ($f$) of classical waves in a bounded uniform medium are quantized, and the energy of a wave of amplitude $A$ can be expressed as $\frac{1}{2}\gamma A^2 f^2$, for some medium-dependent constant $\gamma$. If $\Delta f$ was the frequency quantum, the smallest energy difference between two waves, one of whose frequencies was $f$, would be

\[\begin{aligned}\Delta\mathscr{E} &= \frac{1}{2}\gamma A^2\left(f+\Delta f\right)^2-\frac{1}{2}\gamma A^2f^2 = \left(\gamma A^2\Delta f\right) f + \mathcal{O}\left(\Delta f^2\right).\end{aligned}\]

Therefore if all of an observer's knowledge about the immersed object had been communicated to them via the medium's waves, the smallest change in the energy of the object that could be communicated to them by observing the change in energy of a wave of frequency $f$ would be $h_m f$, where $h_m=\gamma A^2 \Delta f$.
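As an illustrative check of the first-order estimate $\Delta\mathscr{E}\approx h_m f$, the following minimal Python sketch compares the exact energy difference with $h_m f$ when $\Delta f\ll f$. The function names and all numerical values of $\gamma$, $A$, $f$, and $\Delta f$ are assumptions chosen for illustration only; they do not come from the text.

```python
def wave_energy(gamma, amplitude, freq):
    """Energy of a classical wave, (1/2) * gamma * A^2 * f^2, as in the text."""
    return 0.5 * gamma * amplitude**2 * freq**2


def energy_step(gamma, amplitude, freq, delta_f):
    """Exact energy difference between waves of frequency f + delta_f and f."""
    return (wave_energy(gamma, amplitude, freq + delta_f)
            - wave_energy(gamma, amplitude, freq))


def effective_h(gamma, amplitude, delta_f):
    """Effective constant h_m = gamma * A^2 * delta_f."""
    return gamma * amplitude**2 * delta_f


# Illustrative (assumed) values, chosen so that delta_f << f.
gamma, amplitude, freq, delta_f = 2.0, 0.5, 1000.0, 1.0e-3

exact = energy_step(gamma, amplitude, freq, delta_f)
linear = effective_h(gamma, amplitude, delta_f) * freq

# The neglected term is (1/2) * gamma * A^2 * delta_f^2, so the relative
# error of the linear estimate should be close to delta_f / (2 * f).
relative_error = abs(exact - linear) / exact
```

With these values the relative error is of order $\Delta f/(2f)$, which is why the second-order term can be dropped when the frequency quantum is small compared with $f$.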

I base the otherwise-classical derivation presented in this work on the following nonstandard and strong assumption: There exists a finite lower bound ${h_?}$ on the value of $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}$, and the same lower bound on microstate measurement precision applies to every observer and to every DOF of every classical dynamical system. In other words, I assume the existence of an uncertainty principle, $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}>{h_?}>0$, that is universal, meaning valid in every possible context. However, as discussed, if an uncertainty principle has a restricted validity, the derivation shares the uncertainty principle's domain of validity.

For the purposes of this work I will assume that all measurements of a DOF's microstate are performed at the lower bound on microstate precision, $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}={h_?}$. Therefore each measurement of $\Gamma_t$ reveals that it is in a rectangle of area ${h_?}$, centered at a point $\Gamma$, whose sides are parallel to the $\mathbb{Q}$ and $\mathbb{P}$ axes. I will denote such a rectangle by $\mathfrak{R}(\Gamma,\mathfrak{r})$, where $\mathfrak{r}\equiv\Delta_{\mathrm{Q}}/\Delta_{\mathrm{P}}=\Delta_{\mathrm{Q}}^2/{h_?}={h_?}/\Delta_{\mathrm{P}}^2$.
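The relations $\mathfrak{r}=\Delta_{\mathrm{Q}}^2/{h_?}={h_?}/\Delta_{\mathrm{P}}^2$ can be inverted to recover the side lengths of $\mathfrak{R}(\Gamma,\mathfrak{r})$. The sketch below is one possible way to represent such a rectangle in Python; the class name `Rectangle`, its field names, and the numerical values are hypothetical and serve only to illustrate the definitions above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    """Axis-aligned rectangle R(Gamma, r) of area h in a DOF's phase space."""
    q: float  # centre coordinate Q
    p: float  # centre momentum P
    r: float  # aspect ratio, Delta_Q / Delta_P
    h: float  # assumed lower bound h_? on Delta_Q * Delta_P

    @property
    def delta_q(self):
        # Inverts r = Delta_Q^2 / h.
        return math.sqrt(self.r * self.h)

    @property
    def delta_p(self):
        # Inverts r = h / Delta_P^2.
        return math.sqrt(self.h / self.r)

    def contains(self, q_t, p_t):
        """True if the microstate (q_t, p_t) lies inside the rectangle."""
        return (abs(q_t - self.q) <= self.delta_q / 2
                and abs(p_t - self.p) <= self.delta_p / 2)


# Illustrative values in arbitrary units (assumed, not from the text).
rect = Rectangle(q=0.0, p=0.0, r=4.0, h=1.0)
```

For this example `rect.delta_q` is 2 and `rect.delta_p` is 0.5, so the area is ${h_?}=1$ and the aspect ratio is $\mathfrak{r}=4$, as required.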

I will use the assumption $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}>{h_?}>0$ to prove that, at thermal equilibrium, the distribution of any classical dynamical system's energy among its DOFs is a Bose-Einstein distribution in the low temperature ($T$) limit, albeit with ${h_?}$ in place of Planck's constant, $h$.

1.1.2 Low temperature limit

The low $T$ limit is the limit in which Bose-Einstein statistics apply within quantum mechanics. Both classically and quantum mechanically, the low $T$ limit is the weakly-interacting limit, and the Bose-Einstein distribution cannot be derived without assuming that interactions are weak enough to be approximated as absent for some purposes.

However it is important to clarify that assuming that an isolated physical system is in the low $T$ limit means assuming that interactions are arbitrarily weak, but finite. It does not mean assuming that interactions are absent. This distinction is important because there would not be any energy exchange between DOFs if interactions were absent. Therefore a state of thermal equilibrium could never be reached, and it would not be meaningful to speak of the physical system having a temperature.

Let us assume that each DOF's pair of canonically-conjugate phase space coordinates, $(Q_\eta,P_\eta)$, has been chosen such that, at any temperature, the Hamiltonian can be expressed exactly as

\[\begin{aligned}\mathcal{H}\left(\{(Q_\eta,P_\eta)\}\right) = U\left(\{Q_\eta\}\right) + \sum_\eta \mathcal{K}_\eta\left(P_\eta\right),\end{aligned}\]

where $U$ is the potential energy and $\mathcal{K}_\eta$ is the kinetic energy of DOF $\eta$.

Cooling the system brings its set of coordinates, $\{Q_\eta\}$, closer to a local minimum of $U$. Therefore, by reducing $T$, $\{Q_\eta\}$ can be brought arbitrarily close to a set, $\{Q^{\text{min}}_\eta\}$, at which the partial derivative $\frac{\partial U}{\partial Q_\eta}$ vanishes for every $\eta$, and the second partial derivatives $\frac{\partial^{2} U}{\partial Q_\eta\,\partial Q_\mu}$ are all either zero or positive. Furthermore, it is always possible to choose the set $\{Q_\eta\}$ such that the mixed derivatives, $\frac{\partial^{2} U}{\partial Q_\eta\,\partial Q_{\mu\neq \eta}}$, vanish at $\{Q_\eta\}=\{Q_\eta^{\text{min}}\}$. Therefore it is always possible to express the potential energy as

\[\begin{aligned}U= U^{\text{min}} + \frac{1}{2}\sum_{\eta}\frac{\partial^{2} U}{\partial Q_\eta^{2}}\Big|_{\{Q_\eta^{\text{min}}\}}\Delta Q_\eta^2 + \mathcal{O}\left(\Delta Q^3\right),\end{aligned}\]

where $U^{\text{min}}\equiv U(\{Q^{\text{min}}_\eta\})$ and $\Delta Q_\eta\equiv Q_\eta-Q_\eta^{\text{min}}$.

Reducing $T$ reduces the thermal averages of the $\Delta Q_\eta$'s and the standard deviation of their fluctuations. Therefore, by cooling to a sufficiently low $T$, the terms of orders $\Delta Q^3$ and higher can be made negligible. This means that, by reducing $T$, the potential energy can be approximated arbitrarily closely as $U\approx U^{\text{min}}+\sum_\eta U_\eta$, where $U_\eta=U_\eta(Q_\eta)\propto\Delta Q_\eta^2$. Therefore assuming that a physical system is in the low $T$ limit allows the Hamiltonian to be approximated as

\[\begin{aligned}\mathcal{H}\approx U^{\text{min}}+\sum_\eta \mathcal{H}_\eta(Q_\eta,P_\eta), \end{aligned}\]

where $\mathcal{H}_\eta(Q_\eta,P_\eta)\equiv U_\eta(Q_\eta)+\mathcal{K}_\eta(P_\eta)$.
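The claim that the neglected anharmonic terms become negligible on cooling can be illustrated numerically for a single DOF. The sketch below uses a Morse potential purely as an assumed example of an anharmonic $U$; the function names, the parameters `depth` and `a`, and the estimate of a typical thermal displacement from $\frac{1}{2}kx^2=\frac{1}{2}k_BT$ are illustrative assumptions rather than part of the derivation.

```python
import math


def morse(x, depth=1.0, a=1.0):
    """Morse potential: an assumed anharmonic U with its minimum at x = 0."""
    return depth * (1.0 - math.exp(-a * x)) ** 2


def harmonic(x, depth=1.0, a=1.0):
    """Second-order Taylor expansion of the Morse potential about x = 0."""
    return depth * a**2 * x**2


def relative_anharmonicity(kT, depth=1.0, a=1.0):
    """Relative size of the neglected terms at a typical thermal displacement.

    The displacement x is estimated from (1/2) k x^2 = (1/2) kT,
    with the harmonic force constant k = 2 * depth * a^2.
    """
    k = 2.0 * depth * a**2
    x = math.sqrt(kT / k)
    return abs(morse(x, depth, a) - harmonic(x, depth, a)) / harmonic(x, depth, a)


# Thermal energies kT in units of the well depth (assumed values).
errors = [relative_anharmonicity(kT) for kT in (1e-1, 1e-2, 1e-3, 1e-4)]
```

In this example the relative error of the harmonic approximation shrinks roughly in proportion to $\sqrt{T}$, so it can be made as small as desired by lowering the temperature.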

Since none of the terms on the right-hand side of the approximate Hamiltonian above depend on the phase space coordinates of more than one DOF, if the derivatives ${\frac{\partial^{2} U}{\partial Q_\eta^{2}}}\Big|_{\{Q_\eta^{\text{min}}\}}$ are positive the DOFs only exchange energy through the neglected terms in the potential energy. These terms can be made arbitrarily small by reducing $T$, so interactions between DOFs can be made arbitrarily weak by reducing $T$.

If the derivatives $\frac{\partial^{2} U}{\partial Q_\eta^{2}}$ are all zero, then each $\mathcal{H}_\eta$ is independent of $Q_\eta$, meaning that the system is gaseous. Therefore the DOFs only exchange energy during rare and brief "collisions," i.e., when the constant rates of change of two or more coordinates bring the set $\{Q_\eta\}$ into a region of the configuration space $\mathbb{Q}$ where $U$ is not independent of the $Q_\eta$'s. When that happens, the coordinates either condense into a set of weakly-interacting oscillators, or cease interacting again. If they cease interacting, their kinetic energies after the collision differ, in general, from their kinetic energies before the collision. If the duration of each collision is comparable to the time between collisions, $T$ can be reduced until the former is a negligible fraction of the latter.

Regardless of whether a DOF becomes part of a set of weakly-interacting oscillators in the $T\to 0$ limit, or becomes an independent entity with constant potential energy in that limit, the assumption of a state of thermal equilibrium implies that energy is exchanged, either slowly or rarely. Therefore the equipartition theorem applies, which means that the time average of each DOF's energy is $\frac{1}{2}k_B T$, where $k_B$ is Boltzmann's constant.

1.2 Outline of the derivation

The uncertainty principle is the only non-standard assumption that I make to show that the energy of every classical dynamical system is Bose-Einstein distributed in the $T\to 0$ limit.

To derive this result, I take an information theoretical approach to statistical mechanics that is very similar to the one introduced, or championed, by Jaynes [Jaynes, 1957a][Jaynes, 1957b]. Jaynes' approach leans heavily on the work of Shannon [Shannon, 1948].

There are three important steps in the derivation. The first step, which I discuss in detail in Sec. 2, is to recognize that, in the presence of uncertainty, the only empirically-unfalsifiable theories are statistical theories, and that the only empirically-unfalsifiable statistical theories are those in which uncertainty is maximised subject to the constraint that everything that is known about the system is true. I refer to the set of all known information pertaining to a physical system as the system's macrostate.

The second step, which I discuss in detail in Sec. 3, is to recognize that when an uncertainty principle applies, the domains of empirically testable probability distributions are quantized.

The third step is to transform the coordinates $(Q_\eta,P_\eta)$ canonically, such that $\mathcal{H}_\eta$ is transformed to a Hamiltonian with a particular form.

I will now outline the third step and explain why the derivation applies to every sufficiently cold classical dynamical system.

1.2.1 Transforming the Hamiltonian of each degree of freedom to an affine form

In Sec. 4 I will show that when the uncertainty principle applies there is no inconsistency between the Bose-Einstein distribution and the Maxwell-Boltzmann distribution for a classical dynamical system: If the system is cold enough, the latter becomes the former under a canonical transformation of the set $\{(Q_\eta,P_\eta)\}$ of all phase space coordinates to a new set $\{(X_\eta,Y_\eta)\}$, which transforms the Hamiltonian of each DOF from $\mathcal{H}_\eta(Q_\eta,P_\eta) = U_\eta(Q_\eta)+\mathcal{K}_\eta(P_\eta)$ to one of the form

\[\begin{aligned}\tilde{\mathcal{H}}=\tilde{\mathcal{H}}_0+\sum_\eta\tilde{\mathcal{H}}_\eta= \tilde{\mathcal{H}}_0+\sum_\eta \left[B_\eta + C_\eta X_\eta\right],\end{aligned}\]

where $\tilde{\mathcal{H}}_0$ is constant, and $B_\eta$, $C_\eta$, and $X_\eta$ are (approximately) constants of the motion of DOF $\eta$. Such a transformation is possible for every sufficiently-cold classical dynamical system at thermal equilibrium because, as discussed in Sec. 1.1.2, each DOF is either a harmonic oscillator in that limit, or has a constant potential energy almost all of the time in that limit.

If the potential energy is constant, the only energy that the DOF can exchange with other DOFs is its kinetic energy, $\mathcal{H}_\eta=\mathcal{K}_\eta\propto P_\eta^2$. If $Q_\eta$ oscillates harmonically, the energy of DOF $\eta$ is proportional to the square of its oscillation amplitude, i.e., $\mathcal{H}_\eta\propto A_\eta^2$. Therefore $\mathcal{H}_\eta$ has the same mathematical form in each case, and this quadratic function of a single variable can be transformed canonically into an affine function of a single variable $X_\eta$ whose form is $\tilde{\mathcal{H}}_\eta=B_\eta+C_\eta X_\eta$ [Glass, 1977].

For example, by transforming to action-angle coordinates $(Q_\eta,P_\eta)\mapsto(\mathcal{I}_\eta,\theta_\eta)$, the Hamiltonian of a set of harmonic oscillators is transformed from $\mathcal{H}=\frac{1}{2}\sum_\eta \left[P_\eta^2 + \omega_\eta^2 Q_\eta^2\right]$ to $\tilde{\mathcal{H}}=\sum_{\eta}\mathcal{I}_\eta\omega_\eta$, where $\omega_\eta=\dot{\theta}_\eta$ and the action $\mathcal{I}_\eta$ is a constant in the $T\to 0$ limit of arbitrarily weak interactions [Landau, 1976][Arnold, 1989][Lanczos, 1949].
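The action-angle example can be checked numerically for a single non-interacting oscillator. The sketch below assumes the standard parametrization $Q=\sqrt{2\mathcal{I}/\omega}\,\sin\theta$, $P=\sqrt{2\mathcal{I}\omega}\,\cos\theta$, which is one common convention and is not specified in the text; the function names and numerical values are likewise illustrative.

```python
import math


def to_action_angle(q, p, omega):
    """Map (Q, P) to (I, theta) for H = (P^2 + omega^2 Q^2) / 2."""
    action = (p**2 + omega**2 * q**2) / (2.0 * omega)
    theta = math.atan2(omega * q, p)
    return action, theta


def from_action_angle(action, theta, omega):
    """Inverse map: Q = sqrt(2I/omega) sin(theta), P = sqrt(2 I omega) cos(theta)."""
    q = math.sqrt(2.0 * action / omega) * math.sin(theta)
    p = math.sqrt(2.0 * action * omega) * math.cos(theta)
    return q, p


def hamiltonian(q, p, omega):
    """Harmonic-oscillator energy in the original coordinates."""
    return 0.5 * (p**2 + omega**2 * q**2)


# Illustrative initial condition and frequency (assumed values).
omega = 3.0
action0, theta0 = to_action_angle(q=0.4, p=-1.2, omega=omega)

# Exact harmonic motion: theta advances at the constant rate omega
# while the action stays fixed, so the energy should equal I * omega
# at every sampled time.
energies = []
for step in range(5):
    t = 0.1 * step
    q_t, p_t = from_action_angle(action0, theta0 + omega * t, omega)
    energies.append(hamiltonian(q_t, p_t, omega))
```

Every entry of `energies` equals `action0 * omega` to floating-point precision, which is the single-oscillator instance of $\tilde{\mathcal{H}}_\eta=\mathcal{I}_\eta\omega_\eta$: an affine function of the new variable with $B_\eta=0$ and $C_\eta=\omega_\eta$.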