5. Bose-Einstein statistics

Published: J. Stat. Mech. (2024) 093209

I will now derive the Bose-Einstein distribution for a classical system of non-interacting oscillators or standing waves. Then I will briefly discuss how the derivation can be generalized to other kinds of physical systems.

Figure 1
Figure 1: (a) A portion of the phase space $\mathbb{G}_{\eta}$ of mode $\eta$. The continuous blue ellipse, $\partial\mathfrak{R}_{\eta}$, is a particular constant-energy path that the oscillation follows when it is decoupled from other modes. The set of pale blue and green spots is a maximal set of mutually-distinguishable microstates, $\Gamma_{\eta}$, of mode $\eta$. In statistical models of the mode's microstates, each spot represents all points within its rectangular neighbourhood. (b) A portion of the microstructure space of modes $\eta$ and $\nu$. The spots belong to a maximal set of mutually distinguishable points and represent the rectangular regions they inhabit. The pale blue spots mark regions visited during the motion of the modes, assuming that their energies, $\mathscr{E}_{\eta}$ and $\mathscr{E}_{\nu}$, are constant and that neither of their frequencies, $\omega_{\eta}$ and $\omega_{\nu}$, is an integer multiple of the other. Each of the $15$ pale blue spots represents the four points $(Q_{\eta},Q_{\nu},P_\eta,P_\nu)= \left(Q_\eta,Q_\nu, \pm\sqrt{2\mathscr{E}_{\eta}-\omega_{\eta}^2Q_{\eta}^2}, \pm\sqrt{2\mathscr{E}_{\nu}-\omega_{\nu}^2Q_{\nu}^2}\right)$ in their joint phase space $\mathbb{G}_{\eta}\times\mathbb{G}_{\nu}$. (c) The pale blue spot is the energy of the trajectory represented by pale blue spots in panels (a) and (b). We cannot calculate the partition function of modes $\eta$ and $\nu$ as $\mathcal{Z}_{\eta}\mathcal{Z}_{\nu} =\sum_{\mathscr{E}_{\eta}}\sum_{\mathscr{E}_{\nu}} e^{-\beta\left(\mathscr{E}_{\eta}+\mathscr{E}_{\nu}\right)}$ if the double summation is over a square grid in $(\mathscr{E}_{\eta},\mathscr{E}_{\nu})$-space. The numbers of energies sampled along each axis are only in the same ratio as the numbers of mutually-distinguishable mode coordinates along each axis in $(Q_{\eta},Q_{\nu})$-space, and the numbers of mutually-distinguishable points in $\mathbb{G}_{\eta}$ and $\mathbb{G}_{\nu}$, if the spacings of sampled values along the mode's energy axes are their frequencies times the same constant.

5.1 Oscillators and standing waves

As discussed in Sec. 1.1.2, if the potential energy of a classical dynamical system is a smooth function $U(\mathbf{Q})$ of its microstructure $\mathbf{Q}$, the system can be brought arbitrarily close to a minimum of its potential energy, $\mathbf{Q}^{\text{min}}$, by cooling it slowly. Once $\norm{\mathbf{Q}-\mathbf{Q}^{\text{min}}}$ is small enough, lowering its temperature further brings its dynamics closer to a superposition of harmonic oscillations of the normal modes of its stable equilibrium structure, $\mathbf{Q}^{\text{min}}$. For example, a set of mutually-attractive particles would condense into a stable vibrating cluster when cooled. The normal modes of a finite crystal or a continuous bounded medium are standing waves, so their dynamics become superpositions of standing waves when they are cold enough.

If we specify the microstructure by the set of displacements from mechanical equilibrium along the normal mode eigenvectors, each DOF $\eta$ is an oscillator or standing wave with a different angular frequency $\omega_\eta$, in general, whose energy can be expressed as

\[\begin{aligned}\mathscr{E}_\eta\equiv\frac{1}{2}(\dot{Q}_\eta^2 + \omega_\eta^2 Q_\eta^2), \end{aligned}\]
where the mode coordinate $Q_\eta$ has the dimensions of $\text{distance}\times\sqrt{\text{mass}}$. In the limit $T\to 0$ the behaviour of the system is described perfectly by a Hamiltonian of the form
\[\begin{aligned}\mathcal{H}(\mathbf{Q},\mathbf{P}) & = U(\mathbf{Q}^{\text{min}}) + \frac{1}{2}\sum_{\eta}\left[P_\eta^2 + \omega_\eta^2 Q_\eta^2\right], \end{aligned}\]
where $U(\mathbf{Q}^{\text{min}})$ is a constant that is irrelevant to the dynamics, and $P_\eta\equiv\dot{Q}_\eta$ is the momentum conjugate to $Q_\eta$.

As illustrated in Fig. 1, the true path $\partial\mathfrak{R}_\eta$ of mode $\eta$ in its phase space $\mathbb{G}_{\eta}$ is continuous. It is only the accessible information about the path that is quantized. As discussed in Sec. 4.2 and at the beginning of Sec. 5, each point $\Gamma_\eta\in\partial\mathfrak{R}_\eta$ is indistinguishable from all points within a neighbourhood of it, whose area is ${h_?}$.

Uncertainty manifests differently in the microstate probability distribution depending on which set of coordinates is used to specify the microstate. Having found that $p$ is a Maxwell-Boltzmann distribution when standard position and momentum coordinates $(P_{\eta},Q_{\eta})$ are used, let us now perform the canonical transformation $(Q_{\eta},P_{\eta})\mapsto(\mathcal{I}_\eta,\vartheta_{\eta})$, where $(\mathcal{I}_\eta,\vartheta_{\eta})$ are the action-angle variables [Landau, 1976][Arnold, 1989][Lanczos, 1949]. Then we will deduce the form of $p$ when the microstate is specfied as $\mathbf{\Gamma}=(\mathbf{I},\boldsymbol{\vartheta})\equiv\left(\mathcal{I}_{1},\mathcal{I}_{2},\cdots,\vartheta_{1},\vartheta_{2},\cdots\right)$.

The action variable is defined as

\[\begin{aligned}\mathcal{I}_{\eta} \equiv \frac{1}{2\pi}\oint_{\partial\mathfrak{R}_\eta}P_{\eta}\dd{Q_{\eta}} = \frac{1}{2\pi}\int\int_{\mathfrak{R}_\eta} \dd{P_{\eta}}\wedge\dd{Q_{\eta}},\end{aligned}\]
where the first integral is performed around the closed continuous trajectory $\partial\mathfrak{R}_\eta$ defined by the equation $\mathcal{H}_{\eta}(Q_{\eta},P_{\eta})=\mathscr{E}_{\eta}$ and depicted in Fig. 1(a). The second expression, which involves an integral over the region $\mathfrak{R}_\eta$ enclosed by the elliptical path $\partial\mathfrak{R}_\eta$, follows from the generalized Stokes theorem.

It follows from the definition of $\mathcal{I}_{\eta}$ that $2\pi\mathcal{I}_{\eta}$ is the area enclosed by $\partial\mathfrak{R}_\eta$. From Eq., it is easy to see that the semi-axes of $\partial\mathfrak{R}_\eta$ are $\sqrt{2\mathscr{E}_{\eta}}/\omega_{\eta}$ and $\sqrt{2\mathscr{E}_{\eta}}$. Therefore, equating two expressions for the area enclosed gives

\[\begin{aligned}2\pi \mathcal{I}_{\eta}= \frac{2\pi }{\omega_{\eta}}\mathscr{E}_{\eta} \implies \mathcal{I}_{\eta} = \frac{\mathscr{E}_{\eta}}{\omega_{\eta}}.\end{aligned}\]
The reason for choosing $\mathcal{I}_{\eta}$ as one of our variables should now be apparent: It allows us to express the new mode Hamiltonian as
\[\begin{aligned}\mathcal{H}_{\eta}(\vartheta_{\eta},\mathcal{I}_{\eta}) = \mathcal{H}_{\eta}(\mathcal{I}_{\eta})=\mathcal{I}_{\eta}\omega_{\eta}.\end{aligned}\]

If we now followed precisely the same procedure with the new coordinates as we used to derive the Maxwell-Boltzmann distribution in Sec. 4.2, we would reach Eq., with $\mathcal{Z}$ and $\mathcal{Z}_{\eta}$ expressed as sums over all $\mathbf{\Gamma}\equiv (\mathbf{I},\boldsymbol{\vartheta})\in\mathcal{G}$ and over all $\Gamma_{\eta}\in\mathcal{G}_{\eta}$, respectively. That is,

\[\begin{aligned}\mathcal{Z} = \sum_{\mathbf{\Gamma}} e^{-\beta\mathcal{H}(\mathbf{\Gamma})}=\sum_{\mathbf{I}}e^{-\beta\mathcal{H}(\mathbf{I})} = \prod_{\eta} \mathcal{Z}_{\eta},\end{aligned}\]
where
\[\begin{aligned}\mathcal{Z}_{\eta}\equiv \sum_{\mathcal{I}_{\eta}}e^{-\beta\mathcal{H}_{\eta}(\mathcal{I}_{\eta})} = \sum_{\mathcal{I}_{\eta}} e^{-\beta \mathcal{I}_{\eta}\omega_{\eta}}, \end{aligned}\]
and the sum over $\mathcal{I}_{\eta}$ is a sum over a maximal set, $\mathcal{G}_{\eta}\equiv \Delta\mathcal{I}_{\eta}\left(\mathbb{Z}^+_0+\frac{1}{2}\right)$, of mutually-distinguishable values of $\mathcal{I}_{\eta}$. The reason for the factor $\frac{1}{2}$ is that the lower bound, $\frac{1}{2}\Delta\mathcal{I}_{\eta}$, on the difference between mutually-distinguishable values of $\mathcal{I}_{\eta}$ makes all points within the interval $[0,\frac{1}{2}\Delta\mathcal{I}_{\eta})$ indistinguishable from zero, and makes zero indistinguishable from all points in this interval. Therefore, the sum in Eq. can be viewed as $1/\Delta\mathcal{I}_{\eta}$ times a Riemann sum over $\mathbb{R}^+$, which samples intervals of width $\Delta\mathcal{I}_{\eta}$ centered at $\frac{1}{2}\Delta\mathcal{I}_{\eta}$, $\frac{3}{2}\Delta\mathcal{I}_{\eta}$, $\frac{5}{2}\Delta\mathcal{I}_{\eta}$, etc..

Now, since ${h_?}$ is a phase space area, the unavoidable uncertainty in the value of $2\pi\mathcal{I}_\eta$ must be ${h_?}$, and the unavoidable uncertainty in the value of $\mathcal{I}_\eta$ must be ${\hbar_?}\equiv {h_?}/(2\pi)$. Therefore the partition function can be expressed as

\[\begin{aligned}\mathcal{Z}_{\eta} & = \sum_{n_{\eta}\in\mathbb{Z}^+_0} e^{-\beta \left(n_{\eta}+\frac{1}{2}\right) {\hbar_?} \omega_{\eta}} \nonumber\\ & = \frac{e^{-\frac{1}{2}\beta{\hbar_?}\omega_{\eta}}}{1-e^{-\beta{\hbar_?}\omega_{\eta}}} = \frac{e^{\frac{1}{2}\beta{\hbar_?}\omega_{\eta}}}{e^{\beta{\hbar_?}\omega_{\eta}}-1},\end{aligned}\]
where the second line has been reached by using the fact that the right hand side of the first line is an infinite geometric series. We can now express the free energy as
\[\begin{aligned}\mathcal{F} & = -\beta^{-1}\log \mathcal{Z} = -\beta^{-1}\sum_{\eta}\log \mathcal{Z}_{\eta} \nonumber\\ & = \sum_{\eta}\left[\frac{1}{2}{\hbar_?}\omega_{\eta} + k_B T\log\left(1-e^{-\beta{\hbar_?}\omega_{\eta}}\right)\right].\end{aligned}\]
The term $\frac{1}{2}{\hbar_?}\omega_{\eta}$ is commonly known as the zero point energy of mode $\eta$.

We can also calculate the expectation value,

\[\begin{aligned}\bar{n}_{\eta} \equiv \mathcal{Z}_{\eta}^{-1}\sum_{n_{\eta}\in\mathbb{Z}^+_0} n_{\eta}e^{-\beta\left(n_{\eta}+\frac{1}{2}\right){\hbar_?}\omega_{\eta}},\end{aligned}\]
of $n_{\eta}$ using Eq. as follows:
\[\begin{aligned}\frac{\partial}{\partial \beta}\left(\sum_{n_{\eta}\in\mathbb{Z}^+_0}e^{-\beta\left(n_{\eta}+\frac{1}{2}\right){\hbar_?}\omega_{\eta}}\right) = \frac{\partial}{\partial \beta}\left(\frac{e^{\frac{1}{2}\beta{\hbar_?}\omega_{\eta}}}{e^{\beta{\hbar_?}\omega_{\eta}}-1}\right).\end{aligned}\]
After taking the derivatives and simplifying, this can be expressed as
\[\begin{aligned}\bar{n}_{\eta} = \frac{1}{e^{\beta{\hbar_?}\omega_{\eta}}-1}.\end{aligned}\]
The integer $n_{\eta}$ is commonly referred to as the occupation number of mode $\eta$ and $\bar{n}_{\eta}$ is its thermal average.

When the modes' amplitudes are large enough that they do interact, their energies and frequencies vary, their paths in their phase spaces are no longer elliptical, and matters become more complicated. Nevertheless, simplifying assumptions are often justified, which allow a Bose-Einstein distribution to be used as the basis for a statistical description of the system's microstates and observables. For example, if the energy of mode $\eta$ is modulated by a mode $\eta'$ whose frequency is sufficiently low ($\omega_{\eta'}\ll\omega_{\eta}$), then $\mathcal{I}_{\eta}$ is approximately adiabatically invariant under this modulation [Landau, 1976][Arnold, 1989][Lanczos, 1949], and the dominant effect of the interaction on mode $\eta$ is to modulate its frequency.

As another example, when the interactions between modes are weak, the distribution of each mode's energy among frequencies is broadened and shifted relative to its T\to 0 limit. Therefore, it still has a well defined mean frequency and mean energy, which allows the Bose-Einstein distribution to be used effectively in many cases.

5.2 Generalizations to non-oscillatory systems

I have now derived the Bose-Einstein distribution for a classical system whose dynamics is a superposition of independent harmonic oscillations. My derivation made use of two properties of the system's Hamiltonian: The first was that it could be expressed as a sum $\mathcal{H}=\sum_\eta\mathcal{H}_\eta$ of the Hamiltonians $\mathcal{H}_\eta$ of independent DOFs. The second was that each $\mathcal{H}_\eta$ could be expressed as an affine function of only one variable. For oscillations, this was achieved by transforming to action-angle variables, so that each $\mathcal{H}_\eta$ took the form $\mathcal{H}_\eta(\mathcal{I}_\eta)=\mathcal{I}_\eta\omega_\eta$. Since variations of $\omega_\eta$ are negligible when interactions are weak, $\mathcal{I}_\eta$ is effectively the only variable that appears in $\mathcal{H}_\eta$.

The Hamiltonians of many other kinds of physical systems, composed of mutually-noninteracting DOFs, can be transformed canonically into forms that allow the Bose-Einstein distribution to be derived. In principle it can be derived whenever there exists a curve $\gamma_\eta:\mathbb{R}^+\to\mathbb{G}_\eta; t\mapsto\gamma_\eta(t)$ in the phase space $\mathbb{G}_\eta$ of each DOF such that energies of DOF $\eta$ are represented on $\gamma_\eta$ in the same proportions as they are represented in $\mathbb{G}_\eta$. To be more precise, energies should be represented on maximal samplings of $\gamma_\eta$ in the same proportions as they are represented on maximal samplings of $\mathbb{G}_\eta$.

Once each $\mathcal{H}_\eta$ has been transformed canonically into the form $\mathcal{H}_\eta(X_\eta)=B_\eta+ C_\eta X_\eta$, where $X_\eta\in\mathbb{R}$ is a continuously-varying generalized coordinate or momentum, and $B_\eta$ and $C_\eta$ are constants, the full Hamiltonian becomes

\[\begin{aligned}\mathcal{H}(\mathbf{X}) &\equiv U(\mathbf{Q}^{\text{min}}) + \sum_{\eta} B_\eta + \sum_{\eta} C_\eta X_\eta \end{aligned}\]
where $\mathbf{X}\equiv(X_1, X_2,\cdots)$. Let $\frac{1}{2}\Delta X_\eta$ denote the smallest difference between mutually-distinguishable values of $X_\eta$; let $\epsilon_\eta\equiv C_\eta\Delta X_\eta$; and let $D \equiv e^{-\beta\left(U(\mathbf{Q}^{\text{min}})+\sum_\eta B_\eta\right)}$. Then the partition function can be expressed as
\[\begin{aligned}\mathcal{Z} &= D\prod_\eta \sum_{n_\eta\in\mathbb{Z}^+_0} e^{-\beta (n_\eta +\frac{1}{2})\epsilon_\eta},\end{aligned}\]
and it is straightforward to show that $\bar{n}_\eta = 1/\left(e^{\beta\epsilon_\eta}-1\right)$.

One example of a system whose Hamiltonian can be transformed canonically into the form of Eq. is an ideal gas. At any given point in time, its Hamiltonian has the form, $\mathcal{H}(\mathbf{P})\equiv\sum_\eta \mathcal{H}_\eta(P_\eta)=\frac{1}{2}\sum_\eta P_\eta^2$, which is the Hamiltonian of a set of independent free particles. A free particle Hamiltonian can be transformed canonically into a harmonic oscillator Hamiltonian [Glass, 1977]; therefore, it can also be transformed into action-angle coordinates.

As discussed in Sec. 1.1.2, in the limit $T\to 0$ the Hamiltonian of every classical dynamical system either takes the same form as a set of weakly interacting harmonic oscillators or as an ideal gas, or as a combination of both. Therefore, all classical dynamical systems that are subject to an uncertainty principle, $\Delta_{\mathrm{Q}}\Delta_{\mathrm{P}}>{h_?}>0$, obey Bose-Einstein statistics in the T\to 0 limit as a consequence of the probability domain quantization discussed in Sec. 3.

As the derivation of the Bose-Einstein distribution presented in Sec. 5.1 makes clear, when an uncertainty principle applies, the Maxwell-Boltzmann and Bose-Einstein distributions are perfectly compatible with one another in the T\to 0 limit. In that limit, a classical system's energy distribution can be expressed either as a Maxwell-Boltzmann distribution or as a Bose-Einstein distribution, depending on which choice of coordinates and their conjugate momenta are used to specify the microstate.

Comments