14. Single particle states

https://doi.org/10.48550/arXiv.2403.13981

As mentioned above, one can change the minimal (${\Nelec}$-fold) basis of ${\hilbert_\zeta}$ by rotating it. As shown in Sec. 7, one can also change from a position ($x$) representation, or basis, of each basis vector of ${\hilbert_\zeta}$ to a wavevector ($k$) representation of each state, i.e.,

\begin{align*} \ket{\varphi_i(\zeta)} &= \intone \dd{x}\varphi_i(x;\zeta)\ket{x} \end{align*}

where ${\varphi_i(x;\zeta)\equiv \braket{x}{\varphi_i(\zeta)}}$ and ${\braket{x}{x'}=\delta(x-x')}$, or

\begin{align*} \ket{\varphi_i(\zeta)} &= \intone \dd{k}\ftsvarphi_i(k;\zeta)\ket{k}, \end{align*}

where ${\ftsvarphi_i(k;\zeta)\equiv \braket{k}{\varphi_i(\zeta)}}$, ${\braket{k}{k'}=\delta(k-k')}$, and ${\braket{x}{k}\equiv e^{ikx}/\sqrt{2\pi}}$. One can also use a mixed position/wavevector representation, as we have already seen in Sec. 7.5.1, and as is shown in Appendix G. This is a common way to exploit the translational symmetry of a crystal: Roughly speaking, real space is used to describe the electronic structure in a single primitive unit cell $\Omega$ and reciprocal space is used to describe variations of the structure between different primitive cells.

The polarization current can be calculated in any minimal basis ${\left\{\ket{\varphi_i(\zeta)}\right\}_{i=1}^{\Nelec}}$ of ${\hilbert_\zeta}$ and for any choice of the basis in which each vector ${\ket{\varphi_i(\zeta)}}$ is represented as a function. All that is needed is the position operator for the chosen representation, which is simply $x$ when working with ${\varphi_i(x;\zeta)\equiv \braket{x}{\varphi_i(\zeta)}}$ and is ${i\partial/\partial k}$ when working with ${{\ftsvarphi}_i(k;\zeta)\equiv \braket{k}{\varphi_i(\zeta)}}$.
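The equivalence of the two forms of the position operator can be checked numerically. The following is a minimal sketch, not taken from the original text: the Gaussian wavepacket, its parameters (`x0`, `k0`, `sigma`), the grids, and all function names are assumptions chosen for illustration. It evaluates the mean position once as the integral of $x\abs{\varphi(x)}^2$ and once by applying ${i\partial/\partial k}$ (as a central difference) to the wavevector representation obtained with ${\braket{x}{k}= e^{ikx}/\sqrt{2\pi}}$.

```python
import cmath
import math

def gaussian_packet(x, x0=0.7, k0=1.5, sigma=1.0):
    """Normalized Gaussian wavepacket centered at x0 with mean wavevector k0."""
    norm = (math.pi * sigma**2) ** -0.25
    return norm * math.exp(-(x - x0) ** 2 / (2 * sigma**2)) * cmath.exp(1j * k0 * x)

def mean_position_x(xs, dx, phi):
    """<x> in the position representation: integral of x |phi(x)|^2 dx."""
    return sum(x * abs(p) ** 2 for x, p in zip(xs, phi)) * dx

def to_k_representation(xs, dx, phi, ks):
    """phi~(k) = integral of <k|x> phi(x) dx, with <x|k> = exp(ikx)/sqrt(2 pi)."""
    pref = dx / math.sqrt(2 * math.pi)
    return [pref * sum(cmath.exp(-1j * k * x) * p for x, p in zip(xs, phi)) for k in ks]

def mean_position_k(dk, phik):
    """<x> in the wavevector representation, with x -> i d/dk (central differences)."""
    total = 0j
    for j in range(1, len(phik) - 1):
        dphi = (phik[j + 1] - phik[j - 1]) / (2 * dk)
        total += phik[j].conjugate() * 1j * dphi
    return (total * dk).real

dx = dk = 0.05
xs = [-9.0 + dx * i for i in range(401)]   # wide enough that the tails are negligible
ks = [-5.0 + dk * j for j in range(261)]   # covers k0 - 6.5 to k0 + 6.5
phi = [gaussian_packet(x) for x in xs]
phik = to_k_representation(xs, dx, phi, ks)

x_from_x = mean_position_x(xs, dx, phi)
x_from_k = mean_position_k(dk, phik)
print(x_from_x, x_from_k)
```

The two numbers should agree with the packet center `x0` up to the discretization error of the finite-difference derivative.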

14.1 Bloch and Wannier functions

As discussed in Sec. 7.5.1 and Appendix G, when describing the bulk of a material theoretically, or when simulating the bulk of a material, it is common to use Born-von Kármán periodic boundary conditions [Born and von Kármán, 1912]. This is equivalent to representing the material’s bulk in a torus $\onetorus$ (or $\onetorus^m$ in $m$ dimensions), which obviates the need to deal with surfaces.

If the material is a crystal, the absence of any surfaces means that the distributions of electrons and nuclei have the exact $\volume$-periodicity of the crystal at mechanical equilibrium; and at thermal equilibrium their time-averaged distributions are $\volume$-periodic.

In solid state physics it is common to use $\onetorus$ to study the electronic subsystem in the presence of a $\volume$-periodic distribution of static nuclei. In the limit of heavy nuclei, the energy of interaction between electrons and nuclei is

\begin{align*} (n,\vext)\equiv \int_{\onetorus}n(x)\vext(x)\dd{x}, \end{align*}

where $\vext$ is the $\volume$-periodic external potential from the positively charged nuclei. It is also common to simplify the electronic structure of the crystal by expressing the electron density $n(x)$ as a sum of contributions from the eigenfunctions of an effective one-electron Hamiltonian $\hamsmallx$, which inherits $\volume$-periodicity from ${\vext}$. These eigenfunctions are often interpreted physically as real states that single electrons, or pairs of electrons with opposite spins, occupy. However it is well known that this interpretation is not justified by rigorous theory.

As discussed in Appendix G, the elementary eigenfunctions of a $\volume$-periodic operator, ${\hamsmallx:\lebesgue(\onetorus)\to\lebesgue(\onetorus)}$, are known as Bloch functions. They have the form

\begin{align*} \bloch_{\alpha k}(x)\equiv\braket{x}{\bloch_{\alpha k}}=e^{ik x}u_{\alpha k}(x), \end{align*}

where the Bloch state ${\ket{\bloch_{\alpha k}}}$ is an eigenstate of

\begin{align*} \hamsmall\equiv\int_\onetorus\dd{x}\hamsmallx\dyad{x} \end{align*}

with eigenvalue ${\epsilon_{\alpha k}}$ and ${u_{\alpha k}(x)=\braket{x}{\pbloch_{\alpha k}}}$ has the crystal’s $\volume$-periodicity. The definition of $\hamsmall$ in terms of $\hamsmallx$ implies the inverse relation ${\hamsmallx\equiv\expvaltwo{\hamsmall}{x}}$.

Each Bloch function ${\bloch_{\alpha k}}$ can be chosen to be periodic in reciprocal space, and such that the corresponding periodic Bloch function ${u_{\alpha k}}$ is real-valued for all wavevectors ${k}$ in the first Brillouin zone.

Both ${\bloch_{\alpha k}(x)}$ and ${u_{\alpha k}(x)}$ are delocalized over the entirety of $\onetorus$, and it follows from the eigenvalue equation,

\begin{align*} \hamsmall\ket{\bloch_{\alpha k}}=\epsilon_{\alpha k}\ket{\bloch_{\alpha k}}\iff \hamsmallx\bloch_{\alpha k}=\epsilon_{\alpha k}\bloch_{\alpha k}, \end{align*}

that ${u_{\alpha k}}$ is an eigenfunction of the $k$-dependent Hamiltonian ${\hamsmallx_k\equiv e^{-ikx}\hamsmallx e^{ikx}}$.
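As a concrete instance, suppose that the effective Hamiltonian is a kinetic term plus a $\volume$-periodic potential $v(x)$. This form, and the symbols $m$ and $v$, are assumptions made only for illustration, since the text does not specify $\hamsmallx$. Because ${e^{-ikx}\frac{\partial}{\partial x}\left(e^{ikx}f\right)=\left(\frac{\partial}{\partial x}+ik\right)f}$ for any differentiable function $f$,

\begin{align*} \hamsmallx = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}+v(x) \implies \hamsmallx_k = \frac{\hbar^2}{2m}\left(-i\frac{\partial}{\partial x}+k\right)^2+v(x). \end{align*}

The eigenvalue relation itself does not depend on this assumed form: substituting ${\bloch_{\alpha k}=e^{ikx}u_{\alpha k}}$ gives

\begin{align*} \hamsmallx_k u_{\alpha k} = e^{-ikx}\hamsmallx\bloch_{\alpha k} = \epsilon_{\alpha k}e^{-ikx}\bloch_{\alpha k} = \epsilon_{\alpha k}u_{\alpha k}. \end{align*}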

In the large-torus limit, $\hamsmallx_k$ varies continuously with $k$. Therefore the eigenstates at different values of $k$ can be labelled such that ${u_{\alpha k}}$ and its eigenvalue ${\epsilon_{\alpha k}}$ vary continuously with $k$. When the eigenvalues are plotted as functions of $k$, the set of points ${\{(k,\epsilon_{\alpha k})\}_k}$ forms a surface. Note that I use a subscript $k$ on the braces to indicate that different elements of the set correspond to different values of $k$; and the absence of a subscript $\alpha$ means that $\alpha$ is the same for all elements of the set. Therefore ${\{(k,\epsilon_{\alpha k})\}_k}$ and ${\{(k,\epsilon_{\beta k})\}_k}$ are different surfaces if ${\alpha\neq\beta}$. Each surface, or set of intersecting surfaces, is usually referred to as a band because its projection onto the energy (eigenvalue) axis is an interval, or ‘band’, of energies.
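To make the notion of a band as an interval of energies concrete, here is a small sketch based on a hypothetical two-site chain, which is not the model used in the text. Its $k$-dependent Hamiltonian is the ${2\times 2}$ matrix with off-diagonal element ${f(k)=t_1+t_2e^{-ika}}$, whose eigenvalues are ${\pm\abs{f(k)}}$. The parameters `t1`, `t2`, `a` and the function names are assumptions. The code samples the first Brillouin zone and projects each band onto the energy axis.

```python
import cmath
import math

def two_band_energies(k, t1=1.0, t2=0.6, a=1.0):
    """Eigenvalues of [[0, f], [conj(f), 0]] with f(k) = t1 + t2*exp(-i k a).
    This two-site chain is a hypothetical stand-in for the k-dependent Hamiltonian."""
    f = t1 + t2 * cmath.exp(-1j * k * a)
    return -abs(f), abs(f)

def band_intervals(n_k=201, **params):
    """Sample the first Brillouin zone and project each band onto the energy axis."""
    a = params.get("a", 1.0)
    ks = [-math.pi / a + 2 * math.pi / a * j / (n_k - 1) for j in range(n_k)]
    lower = [two_band_energies(k, **params)[0] for k in ks]
    upper = [two_band_energies(k, **params)[1] for k in ks]
    return (min(lower), max(lower)), (min(upper), max(upper))

lower_band, upper_band = band_intervals()
gap = upper_band[0] - lower_band[1]
print("lower band:", lower_band)
print("upper band:", upper_band)
print("gap:", gap)
```

When ${t_1\neq t_2}$ the two intervals do not overlap, so the lower band lies below the upper band at every wavevector.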

Let us assume that the set of all Bloch functions, ${\{\bloch_{\alpha k}\}_{\alpha k}}$, has been chosen to be orthonormal. Then, in an insulator, the electron density can be expressed as

\begin{align*} n(x) &= \sum_{\alpha k}\abs{\bloch_{\alpha k}(x)}^2 = \expvaltwo{\left(\sum_{\alpha k}\dyad{\bloch_{\alpha k}}\right)}{x}, \end{align*}

where the sum over ${\alpha k}$ is a sum over a finite number of ‘occupied’ states. For our purposes, the meaning of a state being occupied is simply that it contributes to $n$; and I will assume that each state is either vacant or occupied by one electron.

In Sec. 7.5.1 it was shown how $\Jconv$ could be calculated directly from the set of PBFs using Eq. (28). Now let us try to calculate it using Eq. (66); and let us also try to relate it more directly to the depiction, in Fig. 13, of the process by which a polarization current arises mathematically.

The Bloch states are an orthonormal basis for the $\hilbert$-representation of $n$ and, by symmetry, there are the same number of Bloch state centers in each primitive unit cell of the crystal. Therefore we could use Eq. (66) to calculate $\Jconv$ from the velocities of their centers in one particular primitive cell. Although this would yield the correct value of $\Jconv$, because the Bloch functions are delocalized, this mapping of the density onto a set of point charges is conceptually inconsistent with the mapping envisaged in Sec. 12.6. There we assumed a mapping of the charge density, ${\rhom=-e n}$, onto charge packets of microscopic widths, rather than delocalized distributions.

We will now look for a microscopically-localized basis for the $\hilbert(t)$-representation of ${n(x;\zeta(t))}$. Let us begin by expressing the density as ${n(x;\zeta) = \sum_\alpha n_{\alpha}(x;\zeta)}$, where ${n_\alpha(x;\zeta)\equiv \sum_k \abs{\bloch_{\alpha k}(x;\zeta)}^2}$ is the density from all Bloch states that contribute to band $\alpha$. We know that each ${n_\alpha}$ is ${\hilbert(t)}$-representable because

\begin{align*} \P_\alpha(\zeta)\equiv\sum_k\dyad{\bloch_{\alpha k}(\zeta)} \end{align*}

is the projector onto its $\hilbert$-representation. Therefore the polarization current can be calculated as the sum ${\Jconv\equiv \sum_\alpha \Jconv_\alpha}$, where ${\Jconv_\alpha}$ is the polarization current from the variation of ${n_{\alpha}(x;\zeta)}$ with $\zeta$. Matters become more complicated when bands cross one another [Souza et al., 2001], but I will assume that, given any two bands, the band that is lower in energy at any given wavevector is also lower in energy at every other wavevector.

Let us focus on the contribution $\Jconv_\alpha$ of band $\alpha$ to $\Jconv$. We can transform the Bloch functions to a more localized set with the generalized Fourier transform,

\begin{align*} w_{\alpha X}(x;\zeta) \equiv \frac{1}{\sqrt{\Nunitcell}}\sum_{k} e^{-ikX}e^{i\vartheta_\alpha(k)}\bloch_{\alpha k}(x;\zeta), \tag{68} \end{align*}

where $\Nunitcell$ is the number of primitive unit cells in $\onetorus$; $X$ identifies a particular position ($x$) within the torus; and $\vartheta_\alpha(k)$ is any $x$-independent constant or function of $k$. The function ${w_{\alpha X}}$, which is localized in real space, is known as a Wannier function [Wannier, 1937; Blount, 1962; Ferreira and Parada, 1970; Kohn, 1973].

Let ${\wannierxset(X_0)}$ denote the set of all images of a point ${X_0\in\onetorus}$ under translations by the crystal’s lattice vectors, i.e.,

\begin{align*} \wannierxset(X_0)\equiv\{X_0+m\volume: 0\leq m \leq \Nunitcell-1\}. \end{align*}

Then it can be shown that, for any choice of ${X_0\in\onetorus}$, the set

\begin{align*} \wannierset(X_0)\equiv\{w_{\alpha X}: X\in\wannierxset(X_0)\}, \end{align*}

is orthonormal and satisfies

\begin{align*} \sum_k \abs{\bloch_{\alpha k}(x;\zeta)}^2 = \sum_X\abs{w_{\alpha X}(x;\zeta)}^2 = n_\alpha(x;\zeta). \end{align*}

Therefore, the set ${\{\ket{w_{\alpha X}(\zeta)}:X\in\wannierxset(X_0)\}}$ of $\Nunitcell$ Wannier states,

\begin{align*} \ket{w_{\alpha X}(\zeta)}\equiv\int_\onetorus \dd{x} w_{\alpha X}(x;\zeta) \ket{x}, \end{align*}

is a minimal orthonormal basis of the ${\hilbert}$-representation of $n_\alpha$, and ${\P_{\alpha}(\zeta)}$ can be expressed as

\begin{align*} \P_{\alpha}(\zeta) = \sum_{X\in\wannierxset(X_0)} \dyad{w_{\alpha X}(\zeta)}. \end{align*}
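These properties can be verified numerically in the simplest possible setting. The sketch below is my own illustration under stated assumptions: it uses the lowest band of an ‘empty lattice’, for which ${u_{\alpha k}}$ is a constant, and the torus size, the point `X0`, the phase function `theta`, and all names are hypothetical choices. It builds the functions of Eq. (68), checks that the set generated from `X0` is orthonormal, and checks that the Wannier and Bloch decompositions give the same density.

```python
import cmath
import math

N_CELLS = 7            # number of primitive cells (odd, so the k-points are symmetric)
A = 1.0                # length of one primitive cell
L = N_CELLS * A        # circumference of the torus
M = 140                # grid points; the rectangle rule is exact for these integrands
DX = L / M
GRID = [DX * i for i in range(M)]
KS = [2 * math.pi * j / L for j in range(-(N_CELLS // 2), N_CELLS // 2 + 1)]

def bloch(k, x):
    """Lowest-band Bloch function of the empty lattice: exp(ikx) u, with constant u."""
    return cmath.exp(1j * k * x) / math.sqrt(L)

def wannier(X, x, theta=lambda k: 0.0):
    """Eq. (68): phase-weighted sum of the Bloch functions of one band."""
    s = sum(cmath.exp(-1j * k * X + 1j * theta(k)) * bloch(k, x) for k in KS)
    return s / math.sqrt(N_CELLS)

def overlap(X1, X2, theta):
    """Inner product of two Wannier functions, integrated over the torus."""
    return sum(wannier(X1, x, theta).conjugate() * wannier(X2, x, theta) for x in GRID) * DX

theta = lambda k: 0.3 * math.sin(k * A)    # arbitrary smooth phase (hypothetical)
X0 = 0.37                                   # arbitrary point in the torus (hypothetical)
centers = [X0 + m * A for m in range(N_CELLS)]

gram = [[overlap(Xa, Xb, theta) for Xb in centers] for Xa in centers]
max_gram_error = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
                     for i in range(N_CELLS) for j in range(N_CELLS))

density_bloch = [sum(abs(bloch(k, x)) ** 2 for k in KS) for x in GRID]
density_wannier = [sum(abs(wannier(X, x, theta)) ** 2 for X in centers) for x in GRID]
max_density_error = max(abs(b - w) for b, w in zip(density_bloch, density_wannier))
print(max_gram_error, max_density_error)
```

Both errors should be at the level of floating-point rounding for any choice of `X0` and `theta`.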

Because a finite integer multiple of $\volume$ separates any two of the points in ${\wannierxset(X_0)}$, each primitive unit cell contains exactly one of them. Furthermore, by substituting ${\bloch_{\alpha k}= e^{ikx} u_{\alpha k}}$ into Eq. (68) and using the periodicity of ${u_{\alpha k}}$, it can be shown that any given Wannier function in set ${\wannierset(X_0)}$ transforms into any other under a translation by an integer multiple of $\volume$. Therefore each primitive cell contains the center of exactly one element of ${\wannierset(X_0)}$, where the center of ${w_{\alpha X}}$ is

\begin{align*} \bar{x}_{\alpha X}(\zeta)\equiv \int_\onetorus x \abs{w_{\alpha X}(x;\zeta)}^2\dd{x}. \end{align*}

This means that we have decomposed ${n_\alpha}$ into a periodic array of identical localized packets of electron density, ${n_{\alpha X}(x)\equiv\abs{w_{\alpha X}(x)}^2}$. It follows from Eq. (66) that

\begin{align*} \Jconv_\alpha = \frac{e}{\volume} \sum_{X}\dv{\bar{x}_{\alpha X}}{t} =\frac{e\dot{\zeta}}{\volume} \sum_{X}\dv{\bar{x}_{\alpha X}}{\zeta}. \end{align*}

The degree to which the Wannier functions are localized depends on the choice of $X_0$ and on the choice of the function ${\vartheta_\alpha(k)}$ in Eq. (68), but the most localized set, which is commonly known as the set of maximally-localized Wannier functions (MLWF) [Marzari and Vanderbilt, 1997; Marzari et al., 2012], is obtained when ${\vartheta_\alpha}$ is a $k$-independent constant, in which case ${X_0}$ is one of the Wannier centers and ${\wannierxset(X_0)}$ is the set of all of the Wannier centers (see [Ferreira and Parada, 1970] and Appendix H).
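The dependence of localization on ${\vartheta_\alpha(k)}$ can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the empty-lattice Bloch functions ${e^{ikx}/\sqrt{L}}$, the torus size, the two phase functions, and the use of the inverse participation ratio ${\int\abs{w}^4\dd{x}}$ as the measure of localization, which is not the spread functional used to define MLWFs.

```python
import cmath
import math

N_CELLS, A = 7, 1.0
L = N_CELLS * A
M = 280                                   # rectangle rule is exact for these integrands
DX = L / M
GRID = [DX * i for i in range(M)]
KS = [2 * math.pi * j / L for j in range(-(N_CELLS // 2), N_CELLS // 2 + 1)]

def wannier(X, x, theta):
    """Eq. (68) for empty-lattice Bloch functions exp(ikx)/sqrt(L)."""
    s = sum(cmath.exp(1j * (k * (x - X) + theta(k))) for k in KS)
    return s / math.sqrt(N_CELLS * L)

def inverse_participation_ratio(theta, X=0.0):
    """Integral of |w|^4 over the torus; larger values mean a more compact packet."""
    return sum(abs(wannier(X, x, theta)) ** 4 for x in GRID) * DX

ipr_constant = inverse_participation_ratio(lambda k: 0.4)             # k-independent phase
ipr_quadratic = inverse_participation_ratio(lambda k: (k * A) ** 2)   # k-dependent phase
print(ipr_constant, ipr_quadratic)
```

In this model the $k$-independent phase gives the larger value, i.e., the more compact packet, because the Fourier components of ${\abs{w}^2}$ then add fully in phase.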

14.1.1 Interpretation of Wannier functions

Wannier functions, whether maximally localized or not, are not specific to quantum mechanics and there is no obvious reason to attach any particular physical meaning to them.

If a density is ${\hilbert}$-representable, its ${\hilbert}$-representation has an infinite number of orthonormal minimal bases. Among those, there must exist a maximally localized basis and a maximally delocalized basis. This is a mathematical observation which does not imply that elements of these extreme sets have any further meanings or any physical meanings. Therefore claims that MLWFs have greater physical significance than elements of other minimal bases should be substantiated and the precise physical meanings attributed to them should be clarified.

The Wannier states of band $\alpha$ are eigenstates of any operator of the form

\begin{align*} \Dop_1\equiv \sum_{X} d_{X} \dyad{w_{\alpha X}}, \end{align*}

which means that the Wannier functions are eigenfunctions of the generally-nonlocal integral operator whose kernel is ${\Df_1(x';x)\equiv\mel{x'}{\Dop_1}{x}}$. However, because electrons want to delocalize rather than localize, Wannier functions are not, in general, either the eigenfunctions, or approximately equal to the eigenfunctions, of an operator that could reasonably be interpreted as the Hamiltonian of a real or idealized physical system. Therefore if, in a many-particle system, there existed single-particle states that could be regarded as ‘physical’, in the sense that they resembled states that individual particles would like to occupy in a system with simplified energetics (e.g., mean field interactions), they would not be localized, in general, and certainly not by their mutual repulsion. For example, changing a single-particle Hamiltonian, ${\hamsmall\equiv\hat{t}+\vextop}$, by adding a repulsive mean-field Coulomb potential from one or more localized clouds of negative charge to ${\vextop<0}$, would not make its eigenfunctions more localized, in general.

Furthermore, there is no ‘natural’ or right way to partition the density into the same number of packets as there are electrons. Therefore $\vextop$, which together with $\Nelec$ determines the character of chemical bonds, and which is usually the only localizing influence on electrons, does not localize partitions of the density individually. It localizes the density as a whole. There is nothing within rigorous physical or chemical theory to suggest that it bestows a density with a substructure of localized partitions.

I emphasize this point because it has been claimed that Wannier functions, and MLWFs in particular, can provide insight into chemical bonding by elucidating the substructure of the electron density [Marzari and Vanderbilt, 1997; Marzari et al., 2012]. However, this claim has not been justified theoretically, but by references to the chemistry literature: It was claimed in [Marzari and Vanderbilt, 1997] and [Marzari et al., 2012] that chemists use localized molecular orbitals, which are the analogs of MLWFs for molecules, for this purpose. However, the papers cited, namely [Boys, 1960], [Foster and Boys], [Foster and Boys], and [Edmiston and Ruedenberg, 1963], do not justify using localized orbitals to analyse bonds, and they did not introduce them to represent the parts of the electron density that are most important to bonding. They introduced them to deal more efficiently with those parts of the electron density that are least important to bonding, or to changes in bonding.

For example, when one is interested in a reaction that involves one reactive part of an otherwise-inert large molecule, it is unnecessary and computationally expensive to treat all parts of the molecule as reactive. One can freeze the electronic structure of the inert part and calculate its effects on the reactive part using methods that are much more computationally efficient than treating the whole molecule as reactive would be.

The same trick can be played when studying multiple large molecules, which are mostly the same, but have different functional groups in one relatively-small region. After calculating the electronic structure of one of the molecules, it should not be necessary to recalculate it from scratch for another molecule: it is more efficient to reuse parts of the density that are unaffected by the differences in functional groups. Localized orbitals were introduced to facilitate this partitioning of the electronic structure [Boys, 1960; Foster and Boys; Foster and Boys]. For example, in calculations based on density functional theory, a boundary can be chosen between the reactive part of the molecule, $\mathcal{R}$, and the unreactive part, $\mathcal{U}$, and the density can be partitioned using the centers $\bar{x}_\alpha$ of the localized functions ${w_{\alpha}}$ as

\begin{align*} n(x) = \sum_{\bar{x}_\alpha \in \mathcal{R}} \abs{w_\alpha(x)}^2 + \sum_{\bar{x}_\alpha \in \mathcal{U}} \abs{w_\alpha(x)}^2 \end{align*}

The more localized the functions $w_\alpha$ are, the more well-defined the boundary is.
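A minimal sketch of this partitioning follows. It is an illustration under loud assumptions: the localized functions are replaced by one-dimensional Gaussian packets (whose mutual orthogonality is ignored), and the centers, boundary, widths, and function names are all hypothetical. It assigns each packet to $\mathcal{R}$ or $\mathcal{U}$ according to its center, and measures how much of the density assigned to $\mathcal{R}$ lies on the $\mathcal{U}$ side of the boundary.

```python
import math

def packet_density(x, center, width):
    """|w_alpha(x)|^2 for a normalized Gaussian stand-in for a localized function."""
    return math.exp(-((x - center) / width) ** 2) / (width * math.sqrt(math.pi))

def partition(x, centers, boundary, width):
    """Return (n_R(x), n_U(x)): contributions from centers on either side of the boundary."""
    n_r = sum(packet_density(x, c, width) for c in centers if c < boundary)
    n_u = sum(packet_density(x, c, width) for c in centers if c >= boundary)
    return n_r, n_u

def leakage(centers, boundary, width, dx=0.01, extent=10.0):
    """Amount of n_R found on the U side of the boundary (midpoint rule)."""
    xs = [boundary + dx * (i + 0.5) for i in range(int(extent / dx))]
    return sum(partition(x, centers, boundary, width)[0] for x in xs) * dx

centers = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical centers
boundary = 1.5                              # hypothetical boundary between R and U

leak_narrow = leakage(centers, boundary, width=0.2)
leak_broad = leakage(centers, boundary, width=0.8)
print(leak_narrow, leak_broad)
```

The narrower packets leak far less density across the boundary, which is the sense in which the boundary is better defined.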

14.2 The natural single particle substructure of the density

The electron density ${n(\rvec)}$ does possess a ‘natural’ substructure of single-particle states (${\varphi_\alpha}$) and their ‘occupancies’ ($\occ_\alpha$) [Coleman, 1963; McWeeny, 1960; Löwdin, 1955], which satisfy

\begin{align*} n(\rvec) = \sum_{\alpha}\occ_\alpha \abs{\varphi_\alpha(\rvec)}^2, &\quad \braket{\varphi_\alpha}{\varphi_\beta}=\delta_{\alpha\beta}, \\ \sum_\alpha\occ_\alpha = \Nelec, &\quad \occ_\alpha \leq 1,\; \forall \alpha, \end{align*}

and we will assume that they are indexed in order of decreasing occupation number, such that

\begin{align*} \alpha\leq \beta \iff\occ_\alpha\geq\occ_\beta. \end{align*}

These natural orbitals are the normalized eigenstates of the 1-particle reduced density matrix. Their properties, some of which are discussed in Appendix I, suggest that they are the only single-particle states that should be regarded as characteristics, or substructural components, of a many-particle state.
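The eigenvalue property can be checked on a toy model. The sketch below is not from the original text: it assumes two spinless fermions on a four-point grid, in the real state ${\Psi=aD_{12}+bD_{34}}$, where ${D_{pq}}$ is the normalized Slater determinant of orbitals $p$ and $q$; the orbitals, the weight `a2`, and the function names are hypothetical. It builds the 1-particle reduced density matrix and confirms that each orbital is an eigenvector whose eigenvalue is its occupation number.

```python
import math

H = 0.5
ORBITALS = [  # four orthonormal orbitals on a four-point grid (hypothetical)
    [H, H, H, H], [H, -H, H, -H], [H, H, -H, -H], [H, -H, -H, H],
]
N_ELEC = 2

def determinant(p, q):
    """Normalized two-particle Slater determinant built from orbitals p and q."""
    return [[(p[i] * q[j] - q[i] * p[j]) / math.sqrt(2) for j in range(4)] for i in range(4)]

def wavefunction(a2=0.9):
    """Psi = a*D(1,2) + b*D(3,4) with a^2 + b^2 = 1: a correlated two-particle state."""
    a, b = math.sqrt(a2), math.sqrt(1 - a2)
    d12 = determinant(ORBITALS[0], ORBITALS[1])
    d34 = determinant(ORBITALS[2], ORBITALS[3])
    return [[a * d12[i][j] + b * d34[i][j] for j in range(4)] for i in range(4)]

def one_rdm(psi):
    """gamma(i, i') = N * sum_j Psi(i, j) Psi(i', j) for a real wavefunction."""
    return [[N_ELEC * sum(psi[i][j] * psi[ip][j] for j in range(4)) for ip in range(4)]
            for i in range(4)]

def occupation(gamma, phi):
    """Return n if gamma @ phi == n * phi (phi is an eigenvector), else raise."""
    g_phi = [sum(gamma[i][j] * phi[j] for j in range(4)) for i in range(4)]
    n = sum(phi[i] * g_phi[i] for i in range(4))
    residual = max(abs(g_phi[i] - n * phi[i]) for i in range(4))
    if residual > 1e-12:
        raise ValueError("phi is not an eigenvector of the 1-RDM")
    return n

gamma = one_rdm(wavefunction())
occupations = [occupation(gamma, phi) for phi in ORBITALS]
print(occupations, sum(occupations))
```

In this model the occupation numbers come in degenerate pairs, so the natural orbitals are only defined up to rotations within each pair; the orbitals used here are one valid choice.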

As an illustration of the physical meaning of natural orbitals, it is proved in Appendix I that the energy of a normalized ${\Nelec}$-particle pure state, ${\Psi}$, can be expressed exactly as

\begin{align*} E&\equiv\expvaltwo{\hat{H}}{\Psi} =\sum_\alpha \occ_\alpha \energy_{\alpha} + \sum_\alpha \sum_{\beta\geq\alpha}\sqrt{\occ_\alpha\occ_\beta} \, w_{\alpha\beta} \tag{69} \\ &=\sum_\alpha\occ_\alpha \left( \energy_{\alpha} +\vmf_\alpha\right) + \sum_\alpha \sum_{\beta>\alpha}\sqrt{\occ_\alpha\occ_\beta} \, w_{\alpha\beta} \tag{70} \\ &=\sum_\alpha\occ_\alpha \left( \energy_{\alpha} + \frac{1}{2}\sum_{\beta}\sqrt{\frac{\occ_\beta}{\occ_\alpha}}w_{\alpha\beta} \right), \tag{71} \end{align*}

where the sums over $\alpha$ and $\beta$ are over the set of all natural orbitals, which is an infinite set. Explanations of the symbols on the right hand side of Eq. (70) will now be provided, and interpretations of them will be suggested. It will be important to note that the $\Nelec$-particle wavefunction ${{\Psi}}$ can be expressed exactly as the infinite sum,

\begin{align*} \Psi(\rvecsub{1}\cdots\rvecsub{\Nelec}) =\sum_\alpha c_\alpha\varphi_\alpha(\rvecsub{1})\Theta_\alpha(\rvecsub{2}\cdots\rvecsub{\Nelec}), \tag{73} \end{align*}

where ${\displaystyle \abs{c_\alpha}^2=\lambda_\alpha=\frac{\occ_\alpha}{\Nelec}}$; ${\sum_\alpha\lambda_\alpha=1}$; and ${\Theta_\alpha}$ is an eigenfunction of the ${(\Nelec-1)}$-particle reduced density matrix, with the same eigenvalue $\lambda_\alpha$ as ${\varphi_\alpha}$. Functions $\Theta_\alpha$ and ${\varphi_\alpha}$ are duals of one another, in the sense that the contraction of ${\Psi}$ by ${\varphi_\alpha}$ is ${c_\alpha\Theta_\alpha}$ and the contraction of ${\Psi}$ by ${\Theta_\alpha}$ is ${c_\alpha\varphi_\alpha}$. That is,

\begin{align*} \varphi_\alpha\rfloor\Psi(\rvecsub{1}\cdots\rvecsub{N-1}) &\equiv \int\ddpow{3}{r}\varphi_\alpha^*(\rvec)\Psi(\rvec,\rvecsub{1}\cdots\rvecsub{N-1}) \\ &= c_\alpha\Theta_\alpha(\rvecsub{1}\cdots\rvecsub{N-1}) \end{align*}

and

\begin{align*} \Theta_\alpha\rfloor\Psi(\rvec) &\equiv \int\ddpow{3}{r_1}\cdots\int\ddpow{3}{r_{N-1}}\Theta_\alpha^*(\rvecsub{1}\cdots\rvecsub{N-1}) \\ &\times\Psi(\rvec,\rvecsub{1}\cdots\rvecsub{N-1}) = c_\alpha\varphi_\alpha(\rvec). \end{align*}

Note that the derivation of Eqs. (72) in Appendix I applies to any classical or quantum mechanical state ${\Psi}$ whose position probability density function is ${\Psi^*\Psi}$; and the property ${\occ_\alpha\leq 1}$ arises from the normalization ${\braket{\Psi}{\Psi}=1}$: The fact that ${\{\varphi_\alpha\Theta_\alpha\}}$ is an orthonormal basis implies that

\begin{align*} \frac{\occ_\alpha}{N}=\abs{c_\alpha}^2=\abs{\braket{\varphi_\alpha\Theta_\alpha}{\Psi}}^2\leq 1. \end{align*}
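The duality between ${\varphi_\alpha}$ and ${\Theta_\alpha}$, and the bound on ${\abs{c_\alpha}^2}$, can be checked by direct contraction. This sketch assumes a hypothetical two-particle state on a four-point grid, ${\Psi=aD_{12}+bD_{34}}$ with real orbitals, so that each ${\Theta_\alpha}$ is itself a one-particle function; the names and the weight `A2` are my own choices. It contracts ${\Psi}$ with each orbital to obtain ${c_\alpha\Theta_\alpha}$, then contracts ${\Psi}$ with ${\Theta_\alpha}$ and compares the result with ${c_\alpha\varphi_\alpha}$.

```python
import math

H = 0.5
PHI = [[H, H, H, H], [H, -H, H, -H], [H, H, -H, -H], [H, -H, -H, H]]
N_ELEC = 2
A2 = 0.9  # weight of the dominant determinant (hypothetical)

def psi(i, j):
    """Psi = a*D(1,2) + b*D(3,4), written out for a four-point grid."""
    a, b = math.sqrt(A2), math.sqrt(1 - A2)
    d12 = PHI[0][i] * PHI[1][j] - PHI[1][i] * PHI[0][j]
    d34 = PHI[2][i] * PHI[3][j] - PHI[3][i] * PHI[2][j]
    return (a * d12 + b * d34) / math.sqrt(2)

def contract_first(phi):
    """Contraction of Psi by a one-particle function over the first coordinate."""
    return [sum(phi[i] * psi(i, j) for i in range(4)) for j in range(4)]

def contract_second(theta):
    """Contraction of Psi by the dual function over the remaining coordinate."""
    return [sum(theta[j] * psi(i, j) for j in range(4)) for i in range(4)]

def split(vec):
    """Write vec = c * Theta with Theta normalized and c >= 0."""
    c = math.sqrt(sum(v * v for v in vec))
    return c, [v / c for v in vec]

results = [split(contract_first(phi)) for phi in PHI]
lambdas = [c * c for c, _ in results]

# Duality check: contracting Psi by Theta_alpha should return c_alpha * phi_alpha.
duality_error = max(
    abs(back - c * p)
    for (c, theta), phi in zip(results, PHI)
    for back, p in zip(contract_second(theta), phi)
)
print(lambdas, sum(lambdas), duality_error)
```

Each ${\lambda_\alpha=\abs{c_\alpha}^2}$ is the corresponding occupation number divided by the number of particles, and the ${\lambda_\alpha}$ sum to one.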

14.2.1 Independent electron energy, $\varepsilon_\alpha$

The first term on the right hand side of Eq. (69) is an occupation-weighted sum of

\begin{align*} \energy_\alpha\equiv\expvaltwo{\hamsmall}{\varphi_\alpha}= \underbrace{\expvaltwo{\kinetic}{\varphi_\alpha}}_{\displaystyle \vphantom{\vext_\alpha}t_\alpha}+\underbrace{\expvaltwo{\vextop}{\varphi_\alpha}}_{\displaystyle \vext_\alpha}, \end{align*}

where ${\hamsmall\equiv \kinetic + \vextop}$ would be the Hamiltonian of a single electron if the remaining ${\Nelec-1}$ electrons were not present: $\kinetic$ is the single electron kinetic energy operator and ${\vextop}$ is the operator for the energy of interaction between an electron and the nuclei. The nuclei are assumed to be static.

Non-interacting electrons:

If all of the orbital coupling energies, $w_{\alpha\beta}$, vanished, the occupation numbers at the energy minimum (i.e., the ground state) would be

\begin{align*} \occ_\alpha = \begin{cases} 1,\;\text{if}\;\alpha\leq \Nelec, \\ 0,\;\text{if}\;\alpha> \Nelec, \end{cases} \end{align*}

and the occupied orbitals would be those with the lowest energies. In words, the $\Nelec$ orbitals with the lowest energies would be occupied and all other orbitals would be vacant.
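This filling rule is the solution of a simple constrained minimization, which can be sketched as follows. The orbital energies, the trial occupations, and the function names are hypothetical; the code minimizes ${\sum_\alpha\occ_\alpha\energy_\alpha}$ subject to ${0\leq\occ_\alpha\leq 1}$ and ${\sum_\alpha\occ_\alpha=\Nelec}$ by filling the lowest-energy orbitals, and compares the result with one admissible set of fractional occupations.

```python
def ground_state_occupations(orbital_energies, n_elec):
    """Minimize sum(f * e) subject to 0 <= f <= 1 and sum(f) == n_elec."""
    order = sorted(range(len(orbital_energies)), key=lambda a: orbital_energies[a])
    occupations = [0.0] * len(orbital_energies)
    for a in order[:n_elec]:
        occupations[a] = 1.0   # fill the n_elec lowest-energy orbitals completely
    return occupations

def total_energy(orbital_energies, occupations):
    """Occupation-weighted sum of orbital energies (no coupling terms)."""
    return sum(f * e for f, e in zip(occupations, orbital_energies))

energies = [-0.5, 1.2, -2.0, 0.3, -1.1]   # hypothetical orbital energies
occ = ground_state_occupations(energies, 3)
e_min = total_energy(energies, occ)

# A fractional set of occupations with the same particle number, for comparison.
trial = [0.8, 0.0, 1.0, 0.2, 1.0]
e_trial = total_energy(energies, trial)
print(occ, e_min, e_trial)
```

If two orbitals at the filling threshold have equal energies, the minimizer is not unique; the sketch then simply picks one of the degenerate solutions.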

The $\Nelec$ occupied natural orbitals of the ground state are those that minimize

\begin{align*} \sum_{\alpha=1}^{\Nelec} \energy_\alpha= \sum_{\alpha=1}^{\Nelec} \expvaltwo{\hamsmall}{\varphi_\alpha}, \tag{74} \end{align*}

while preserving orthonormality.

Orthonormality of the ground state natural orbitals implies that each term in Eq. (74) is a stationary value of

\begin{align*} \frac{\expvaltwo{\hamsmall}{\varphi}}{\braket{\varphi}{\varphi}}, \end{align*}

which means that each natural orbital ${\varphi_\alpha}$ is an eigenstate of ${\hamsmall}$.

14.2.2 Interaction energy

The interaction energy, which is responsible for electrons moving between orbitals, is

\begin{align*} \sum_{\alpha, \beta\geq\alpha}\! \sqrt{\occ_\alpha\occ_\beta}w_{\alpha\beta} = \sum_\alpha \!\occ_\alpha\left(\vmf_\alpha + \sum_{\beta>\alpha}\sqrt{\frac{\occ_\beta}{\occ_\alpha}}w_{\alpha\beta}\right), \tag{75} \end{align*}

where ${w_{\alpha\beta}\equiv 2\Re\left\{\mel{\varphi_\alpha}{\what_{\alpha\beta}}{\varphi_\beta}\right\}}$,

\begin{align*} \what_{\alpha\beta}(\rvec) \equiv \int\ddpow{3}{r_1}&\cdots\int\ddpow{3}{\rsub{\Nelec-1}}\bar{\Theta}_\alpha(\rvecsub{1}\cdots \rvecsub{\Nelec-1}) \\ \times &\left(\sum_{j=1}^{\Nelec-1}\hamtwo(\rvec,\rvecsub{j})\right) \Theta_\beta(\rvecsub{1}\cdots \rvecsub{\Nelec-1}), \end{align*}

and ${\hamtwo(\rvec,\rvecsub{j})}$ denotes the energy of interaction between an electron at $\rvec$ and an electron at ${\rvecsub{j}}$.

Mean-field interaction:

The quantity ${\vmf_\alpha}$ on the right hand sides of Eq. (70) and Eq. (75) is

\begin{align*} \vmf_\alpha = \frac{1}{2}\int\ddpow{3}{r}\abs{\varphi_\alpha(\rvec)}^2\what_{\alpha\alpha}(\rvec), \tag{76} \end{align*}

which is one half of the energy of interaction of an electron in orbital ${\varphi_\alpha}$ with the mean field charge density of ${N-1}$ electrons occupying the dual state, ${\Theta_\alpha}$, of ${\varphi_\alpha}$. That charge density is

\begin{align*} \rho_\alpha^{\scriptscriptstyle (\Nelec-1)}(\rvec)\equiv -e\,\densityn_\alpha(\rvec), \end{align*}

where ${\densityn_\alpha(\rvec)}$ is the number density of ${N-1}$ electrons in state ${\Theta_\alpha}$. Just as one electron’s share of its energy of interaction with another electron is half of it, the share of the mean-field interaction energy ${2\vmf_\alpha}$ that belongs to the electron in orbital ${\varphi_\alpha}$ is ${\vmf_\alpha}$.

The expectation value of the interaction energy of an electron in orbital $\varphi_\alpha$ with the ${N-1}$ electrons in a state ${\Theta_\alpha}$ would be ${2\vmf_\alpha}$ if the probability density of the single electron did not depend on the configuration of the ${N-1}$ electrons, and the probability density of the ${N-1}$ electrons did not depend on the position of the single electron. In other words, ${2\vmf_\alpha}$ is an expectation value that is calculated without accounting for the correlation between the motion of the electron in orbital $\varphi_\alpha$ and the motions of the electrons in state ${\Theta_\alpha}$.

Correlation correction:

One way to think of ${w_{\alpha\beta}}$ when ${\alpha\neq\beta}$ is as a correction to ${\vmf_\alpha}$ that accounts for correlation: It accounts for the fact that ${\vmf_\alpha}$ has been calculated from an $N$-particle pdf of the form

\begin{align*} \pdfarg{\varphi}_\alpha(\rvecsub{1})\pdfarg{\Theta}_\alpha(\rvecsub{2}\cdots\rvecsub{\Nelec}) \end{align*}

rather than one of the form

\begin{align*} \pdfarg{\varphi|\Theta}_\alpha(\rvecsub{1}|\rvecsub{2}\cdots\rvecsub{\Nelec})&\pdfarg{\Theta}_\alpha(\rvecsub{2}\cdots\rvecsub{\Nelec}) \\ &= \pdfarg{\varphi}_\alpha(\rvecsub{1})\pdfarg{\Theta|\varphi}_\alpha(\rvecsub{2}\cdots\rvecsub{\Nelec}|\rvecsub{1}), \end{align*}

where ${\pdfarg{\varphi|\Theta}_\alpha}$ and ${\pdfarg{\Theta|\varphi}_\alpha}$ are conditional pdfs.

Another way to think of ${w_{\alpha\beta}}$ is as a coupling of orbital ${\varphi_\alpha}$ to orbital ${\varphi_\beta}$, which is mediated by their dual states ${\Theta_\alpha}$ and ${\Theta_\beta}$, respectively. If it were a bare (unmediated) interaction, it would be ${\mel{\varphi_\alpha}{\what}{\varphi_\beta}}$ rather than ${\mel{\varphi_\alpha}{\what_{\alpha\beta}}{\varphi_\beta}}$.

14.2.3 Hartree-Fock approximation

In Appendix I it is shown that, within the Hartree-Fock approximation, the electron density is

\begin{align*} n(\rvec)=\sum_{\alpha=1}^{\Nelec}\abs{\varphi_\alpha(\rvec)}^2, \end{align*}

and ${\what_{\alpha\beta}(\rvec)}$ simplifies to

\begin{align*} \what_{\alpha\beta}(\rvec) = \int\what(\rvec,\rvec')\left[n(\rvec')-n_\alpha(\rvec')-n_\beta(\rvec')\right]\ddpow{3}{r'}, \end{align*}

where ${n_\alpha(\rvec)=\abs{\varphi_\alpha(\rvec)}^2}$ and ${n_\beta(\rvec)=\abs{\varphi_\beta(\rvec)}^2}$. In other words, within the Hartree-Fock approximation, ${\what_{\alpha\beta}(\rvec)}$ is the energy of interaction of an electron at ${\rvec}$ with the mean field charge density from all of the electrons except those occupying orbitals ${\varphi_\alpha}$ and ${\varphi_\beta}$. The ‘share’ of each of the electrons in orbitals ${\varphi_\alpha}$ and ${\varphi_\beta}$ of their coupling energy is

\begin{align*} \frac{1}{2}w_{\alpha\beta}=\Re\left\{\mel{\varphi_\alpha}{\what_{\alpha\beta}}{\varphi_\beta}\right\}. \end{align*}

14.2.4 Interpreting natural orbitals and their energies

Before discussing how natural orbitals and their occupation numbers should be interpreted, I wish to reiterate that Eqs. (72) are derived in Appendix I without making any non-classical assumptions. Therefore all of the mathematics in this section is compatible with the particles being identical and fast-moving (${\implies}$ indistinguishable) billiard balls.

I also wish to point out that it may not always be useful or appropriate to interpret the mathematics in this section as meaning that the particles ‘occupy’ orbitals. Sometimes it might be more appropriate to regard the orbitals as nothing more than basis functions. Having said this, I will return to referring to the particles as electrons that occupy orbitals.

When electrons interact with one another, the natural orbitals are not eigenstates of the Hamiltonians ${\hamsmall_\alpha}$ or ${\hamsmall}$, or of any other Hamiltonian. This is to be expected of ‘physical’ single-particle states because Hamiltonian eigenstates are stationary states; and the states that individual electrons occupy cannot be stationary if they interact with other electrons while occupying them. Interactions perturb the electron occupying orbital ${\varphi_\alpha}$ and, sooner or later, displace it from that state to another one.

Each occupation number $\occ_\alpha$ can be interpreted as either the fraction of time for which the $\alpha^\text{th}$ natural orbital is occupied, or the probability that it is occupied at any given time. If it is possible for an electron occupying natural orbital $\varphi_\alpha$ to be displaced from it, the fraction of time for which ${\varphi_\alpha}$ is occupied must be less than one. This is one way to understand why the occupation numbers are less than one for interacting particles.

The forms of Eqs. (72) suggest that they can be interpreted within a quasi-independent-electron picture as follows: When orbital ${\varphi_\alpha}$ is occupied, the energy of the electron occupying it is the sum of the orbital energy, $\energy_\alpha$, and its share of its energies of interaction with electrons in other orbitals. An equivalent statement is that its energy is the sum of $\energy_\alpha$ and its share of the energy of its interaction with the set of ${\Nelec-1}$ electrons occupying state ${\Theta_\alpha}$. State ${\Theta_\alpha}$ is orthogonal to $\varphi_\alpha$, but is not orthogonal to all other natural orbitals. Eq. (69) expresses the total energy as an occupation-weighted sum of orbital energies plus a sum over orbital pairs, ${\{\varphi_\alpha,\varphi_\beta\}}$, of the energy of mediated coupling between them, weighted by the geometric mean of their occupation numbers, ${\sqrt{\occ_\alpha\occ_\beta}}$.

An important feature of Eq. (69), and the definition of $\energy_\alpha$, is that the total kinetic energy is exactly

\begin{align*} \expvaltwo{\hat{T}}{\Psi}=\sum_\alpha\occ_\alpha\expvaltwo{\,\hat{t}\,}{\varphi_\alpha}, \end{align*}

which is simply the occupation-weighted sum of the natural orbitals’ individual kinetic energies. If ${\Psi}$ were expanded in terms of any other set of orbitals, ${\{\psi_\alpha\}}$, the kinetic energy would have the more complicated form,

\begin{align*} \expvaltwo{\hat{T}}{\Psi}= \sum_{\alpha}\sum_{\beta}C^*_{\alpha}C_\beta\mel{\psi_\alpha}{\,\hat{t}\,}{\psi_\beta}. \end{align*}

The natural orbitals are the only orbitals for which the kinetic energy of the interacting many-electron system can be expressed exactly as a weighted sum of single-orbital contributions.
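The occupation-weighted form of the kinetic energy can be verified on a toy model. The sketch below rests on assumptions of my own: two particles on a four-point grid in the state ${\Psi=aD_{12}+bD_{34}}$, whose natural orbitals and occupation numbers are known by construction, and a second-difference matrix `T` standing in for the single-particle kinetic energy operator. It evaluates the expectation value of the two-particle kinetic energy directly from ${\Psi}$ and compares it with the occupation-weighted sum over natural orbitals.

```python
import math

H = 0.5
PHI = [[H, H, H, H], [H, -H, H, -H], [H, H, -H, -H], [H, -H, -H, H]]
OCC = [0.9, 0.9, 0.1, 0.1]   # occupation numbers of PHI in the state below
# Second-difference matrix on a four-point open chain (hypothetical kinetic operator).
T = [[1.0, -0.5, 0.0, 0.0], [-0.5, 1.0, -0.5, 0.0],
     [0.0, -0.5, 1.0, -0.5], [0.0, 0.0, -0.5, 1.0]]

def psi(i, j):
    """Psi = a*D(1,2) + b*D(3,4) with a^2 = OCC[0] and b^2 = OCC[2]."""
    a, b = math.sqrt(OCC[0]), math.sqrt(OCC[2])
    d12 = PHI[0][i] * PHI[1][j] - PHI[1][i] * PHI[0][j]
    d34 = PHI[2][i] * PHI[3][j] - PHI[3][i] * PHI[2][j]
    return (a * d12 + b * d34) / math.sqrt(2)

def kinetic_many_body():
    """<Psi| t(1) + t(2) |Psi>, evaluated directly from the two-particle wavefunction."""
    R = range(4)
    first = sum(psi(i, j) * T[i][ip] * psi(ip, j) for i in R for ip in R for j in R)
    second = sum(psi(i, j) * T[j][jp] * psi(i, jp) for i in R for j in R for jp in R)
    return first + second

def kinetic_orbital(phi):
    """<phi| t |phi> for a single real orbital."""
    return sum(phi[i] * T[i][j] * phi[j] for i in range(4) for j in range(4))

t_direct = kinetic_many_body()
t_natural = sum(n * kinetic_orbital(phi) for n, phi in zip(OCC, PHI))
print(t_direct, t_natural)
```

The agreement does not require the orbitals to be eigenvectors of `T`; it only requires the 1-particle reduced density matrix to be diagonal in the orbital basis.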

Furthermore, if the magnitude of the electron-electron repulsion could be reduced gradually to zero, in the non-interacting limit the energy would become

\begin{align*} E=\sum_{\alpha=1}^\infty \occ_\alpha \expvaltwo{\hamsmall}{\varphi_\alpha}. \end{align*}

As discussed in Sec. 14.2.1, the occupied eigenstates of the noninteracting Hamiltonian, $\hamsmall$, are natural orbitals of the electronic ground state, $\Psigs$.

14.2.5 How localized are natural orbitals?

As discussed in Sec. 13.2 and Sec. 14.1.1, if an $\Nelec$-electron density ${n(\rvec)}$ is $\hilbert$-representable, it can be expressed as a sum of contributions from $\Nelec$ maximally localized orbitals, or as a sum of contributions from $\Nelec$ maximally delocalized orbitals. When a crystal’s bulk is represented in ${\onetorus^3}$, the density can be expressed as a sum of contributions from ${\Nelec}$ Bloch functions, which are infinitely delocalized in the sense that each one has an equal presence in every unit cell.

Therefore, unless the counterparts of Bloch functions in the bulks of real crystals (i.e., with surfaces, and not in a torus) are qualitatively different, almost all elements $\psi_\alpha$, of the maximally delocalized basis of the $\hilbert$-representation of a crystal’s electron density, are delocalized throughout the entire crystal. There might be exceptions near surfaces or other macroscopic heterogeneities, but, in a macroscopic crystal, the overwhelming majority of them would be spread throughout the entire crystal.

Let us disregard the possible exceptions; and, on the basis of what is known theoretically about the bulks of crystals under Born-von Kármán boundary conditions, let us assume that, in any macroscopic material, there exists an $\Nelec$-fold set ${\{\psi_\alpha: 1\leq\alpha\leq\Nelec\}}$, almost all of whose elements are delocalized throughout the entire material, such that the material’s electron density is ${n=\sum_{\alpha=1}^{\Nelec}\abs{\psi_\alpha}^2}$.

If that is the case, there must exist infinite sets of equally delocalized or more delocalized orbitals ${\{\tpsi_\alpha\}}$, and their occupation numbers

\begin{align*} \left\{\occ_\alpha\in[0,1]:\sum_\alpha\occ_\alpha=\Nelec\right\}, \end{align*}

such that

\begin{align*} n(\rvec)=\sum_{\alpha=1}^{\infty}\occ_\alpha\big|\tpsi_\alpha(\rvec)\big|^2=\sum_{\alpha=1}^{\Nelec}\abs{\psi_\alpha(\rvec)}^2 \end{align*}

and

\begin{align*} \sum_{\alpha=1}^{\infty}\occ_\alpha\myexpval{\kinetic}{\tpsi_\alpha}\leq \sum_{\alpha=1}^{\Nelec}\expvaltwo{\kinetic}{\psi_\alpha}. \end{align*}

The reason to point this out is that, for noninteracting ‘electrons’ (${\hamtwo=0\implies w_{\alpha\beta}=0, \;\forall \alpha,\beta}$), the energy is simply

\begin{align*} E=\sum_\alpha\occ_\alpha\energy_\alpha = \underbrace{(n,\vext)}_{\sum_\alpha\occ_\alpha\vext_\alpha}+\sum_\alpha\occ_\alpha t_\alpha. \end{align*}

Since ${(n,\vext)}$ is independent of the density’s substructure of orbitals and occupation numbers, if the ground state density $n_0$ were known, the natural orbitals and occupation numbers of the ground state would be those that minimized ${\sum_\alpha\occ_\alpha t_\alpha}$ under the constraint ${\sum_\alpha\occ_\alpha\abs{\varphi_\alpha}^2=n_0}$.

Delocalizing an orbital lowers its kinetic energy. Therefore, if a material (composed of noninteracting electrons) has a ground state density that can be expressed as a weighted sum of contributions from delocalized orbitals, its natural orbitals are no less delocalized than those orbitals.

Since the ground state natural orbitals of a material’s noninteracting electrons would be delocalized throughout the entirety of the material, a real material’s ground state natural orbitals would only be localized if their energy of mutual repulsion could be lowered by them localizing. I have not been able to rule this possibility out, but it seems unlikely to be the norm: Localization of orbitals would have to reduce the electrons’ energy of mutual repulsion by more than it increased their kinetic energies. This seems quite hard to achieve given the possibly-naïve expectation that, if all of the orbitals were spread across ${\Navogadro\approx 6.02\times 10^{23}}$ unit cells of a crystal, their kinetic energy would be raised by a factor of ${\sim 10^{8}\approx \Navogadro^{\frac{1}{3}}}$ by reducing their average width to the length of a primitive lattice vector.
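As a check on the arithmetic in this estimate, suppose (purely for illustration) that the crystal is a cube of ${\Navogadro}$ unit cells. The ratio of the crystal's linear extent to that of a single cell is then

\begin{align*} \Navogadro^{\frac{1}{3}}\approx\left(602\times 10^{21}\right)^{\frac{1}{3}}\approx 8.4\times 10^{7}\sim 10^{8}. \end{align*}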

The literature on many body localization and related phenomena might shed some light on this issue [Abanin et al., 2019; Nandkishore and Huse, 2015; Oganesyan and Huse, 2007; Pal and Huse, 2010; Huse et al., 2014; D'Alessio et al., 2016]. Conversely, Eq. (69) or one of its variants might prove useful in studies of those phenomena.

14.2.6 How large are the largest occupation numbers, and how rapidly do they decay?

The questions of how close the largest occupation numbers ($\occ_1$, $\occ_2$, $\occ_3$, etc.) are to unity, and of how slowly ${\occ_\alpha}$ decays with increasing $\alpha$, appear to be of enormous fundamental importance for a variety of reasons [Giesbertz and van Leeuwen, 2013; Cioslowski and Strasburger; Cioslowski and Strasburger; Cioslowski and Prątnicki, 2019; Tognetti and Loos, 2016; Helbig et al., 2010; Goedecker and Umrigar, 1998]. They might even be related to the question of which number densities are $\hilbert$-representable. However, although some calculations of natural orbitals and their occupation numbers have been performed for simple molecules and small contrived systems [Tognetti and Loos, 2016; Giesbertz and van Leeuwen, 2013; Cioslowski and Strasburger; Cioslowski and Strasburger; Cioslowski and Prątnicki, 2019], little seems to be known about the occupation numbers in a macroscopic material.

One way to formulate the set of questions is to define the function,

\begin{align*} F_\occ(s)\equiv \frac{1}{\Nelec}\sum_{\alpha< s\Nelec}\occ_\alpha; \end{align*}

and to ask what a plot of ${F_\occ}$ versus $s$ would look like; how it would depend on the value of $\Nelec$; and how it would depend on other characteristics of the physical system.

It is known that, when interactions between particles are turned off, or when they are replaced with mean-field interactions, as in the drastic Hartree-Fock approximation discussed in Appendix I.4.6 and at the end of Sec. 14.2.2, the ground state occupation numbers become equal to one for $\Nelec$ orbitals and zero for all others. Therefore, in those cases, ${F_\occ}$ would increase linearly with $s$ (${F_\occ(s)\approx s}$, to within ${1/\Nelec}$) for ${s\leq 1}$ and would equal one for ${s>1}$.
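The definition of ${F_\occ}$ can be evaluated directly for any trial set of occupation numbers. The following is a minimal sketch, not part of the original analysis: the function name `cumulative_occupation` and both lists of occupation numbers are hypothetical and chosen only for illustration, with orbitals indexed from $\alpha=1$ in order of decreasing occupation.

```python
def cumulative_occupation(occupations, n_elec, s):
    """Evaluate F(s) = (1/N) * sum of occ_alpha over all alpha < s*N.

    `occupations` lists occ_1, occ_2, ... (alpha starts at 1) and
    `n_elec` is the particle number N to which they sum.
    """
    total = 0.0
    for alpha, occ in enumerate(occupations, start=1):
        if alpha < s * n_elec:
            total += occ
    return total / n_elec


# Hypothetical occupation numbers for N = 4 particles in 8 orbitals.
integer_occ = [1, 1, 1, 1, 0, 0, 0, 0]          # all occupations 0 or 1
fractional_occ = [0.9, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]

for s in (0.5, 1.0, 1.5, 2.5):
    print(s,
          cumulative_occupation(integer_occ, 4, s),
          cumulative_occupation(fractional_occ, 4, s))
```

Because the sum is cumulative, ${F_\occ}$ is non-decreasing in $s$ and saturates at one once every orbital with nonzero occupation has been included.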

However, as discussed in Sec. 14.2.2, it is interactions that allow particles to pass between orbitals. Therefore, the fact that all occupation numbers are either $1$ or $0$ in a system without interactions should not be extrapolated to the conclusion that they are close to either $1$ or $0$ in a system of interacting particles. There does not appear to be any theoretical justification for such an extrapolation. One reason for emphasizing this point is that I use the assumption that I object to, or that I am wary of, as the foundation on which the derivation in the next subsection is built.

14.2.7 Classical derivation of the Fermi-Dirac distribution

As mentioned above, the derivation of Eq. (72) in Appendix I did not require either the $1$-particle Hamiltonian ${\hamsmallx=\expvaltwo{\hamsmall}{x}}$, or the interaction ${\what(x,x')=\mel{x}{\what}{x'}}$ to have any particular forms. They were only required to be functions of one and two positions, respectively; or to be operators that act on one and two positions, respectively. Therefore they are general expressions, which are perfectly consistent with classical physics.

In this subsection it will again be assumed that interactions are so weak that they are negligible. This assumption is not justified in the present context, but some of its consequences can provide insight. Under this assumption the expectation value of the energy is simply

\begin{align*} E&=\sum_\alpha\occ_\alpha\energy_\alpha. \end{align*}

Then, since the energy is determined by the $1$-particle density matrix, ${\Df_1(\rvec;\rvec')\equiv \mel{\rvec}{\Dop_1}{\rvec'}}$, which can be expressed in terms of its eigenfunctions and eigenvalues as

\begin{align*} \Df_1(\rvec;\rvec') &=\sum_\alpha\lambda_\alpha \varphi^*_\alpha(\rvec)\varphi_\alpha(\rvec') \\ &= \frac{1}{N}\sum_\alpha\occ_\alpha \varphi^*_\alpha(\rvec)\varphi_\alpha(\rvec'), \end{align*}

both $\Df_1$ and ${E}$ are determined by the sets ${\{\varphi_\alpha\}}$ and ${\{\occ_\alpha\}}$.

Now let us assume that the statistical state reflects a state of knowledge/ignorance; and that the only thing known about the physical system is that the value of $E$, the expectation value of its energy, is ${\tilde{E}}$. Then the natural orbitals and their occupation numbers are those that maximize uncertainty subject to two constraints: ${E=\tilde{E}}$ and the number of particles is $\Nelec$. The logical basis for this approach, which is due to Jaynes and Shannon [Jaynes; Jaynes; Shannon, 1948], is discussed in some detail in [Tangney, 2024].

The uncertainty associated with the occupancy of orbital $\varphi_\alpha$ (i.e., the uncertainty about whether it is occupied or empty) can be quantified by the Shannon entropy, which is the expectation value of the Shannon information [Shannon, 1948; Gu et al., 2021], i.e.,

\begin{align*} \entropy_\alpha = \occ_\alpha(-\log\occ_\alpha) + (1-\occ_\alpha)(-\log(1-\occ_\alpha)). \end{align*}

The expression on the right hand side of this equation is the probability that the state is occupied, multiplied by the quantity of information that would be revealed by the discovery that it is occupied (${-\log\occ_\alpha}$), plus the probability that it is unoccupied times the quantity of information that would be revealed by the discovery that it is unoccupied.

The expectation value of the quantity of information that would be revealed by discovering which $\Nelec$ of the orbitals are occupied is

\begin{align*} \entropy\left[\{\occ_\alpha\}\right]=-\sum_\alpha\left[\occ_\alpha\log\occ_\alpha + (1-\occ_\alpha)\log(1-\occ_\alpha)\right]. \tag{77} \end{align*}
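Eq. (77) is simple to evaluate numerically. The sketch below assumes natural logarithms (so entropies are in nats) and the usual convention ${0\log 0=0}$; the function names are hypothetical and not taken from the text.

```python
import math


def orbital_entropy(occ):
    """Shannon entropy (in nats) of one orbital's occupied/empty uncertainty."""
    entropy = 0.0
    for p in (occ, 1.0 - occ):
        if p > 0.0:  # convention: 0 * log(0) = 0
            entropy -= p * math.log(p)
    return entropy


def total_entropy(occupations):
    """Sum of the single-orbital entropies, as in Eq. (77)."""
    return sum(orbital_entropy(occ) for occ in occupations)


# Integer occupations carry no uncertainty; half-filled orbitals carry the most.
print(total_entropy([1, 1, 0, 0]))
print(total_entropy([0.5, 0.5, 0.5, 0.5]), 4 * math.log(2))
```

The single-orbital entropy vanishes when the occupation number is $0$ or $1$ and reaches its maximum, ${\log 2}$, when the occupation number is ${1/2}$.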

The set ${\{\varphi_\alpha\}}$ is also unknown. Therefore, once the occupation numbers were known, a quantity of uncertainty that will be denoted by ${\tentropy[\{\varphi_\alpha\}]}$ would remain. This quantity will not play much of a role in what follows.

To find ${\{\occ_\alpha\}}$ and ${\{\varphi_\alpha\}}$, we must maximize uncertainty subject to the constraints that what is known about the state is true [Tangney, 2024; Shannon, 1948; Jaynes; Jaynes]. The logic underpinning this approach to deriving empirically unfalsifiable statistical models is discussed in detail in [Tangney, 2024]. In the present context the derivation entails finding a set ${\{\occ_\alpha\}}$ and a set ${\{\varphi_\alpha\}}$ for which

\begin{align*} \delta\bigg\{&\entropy\left[\{\occ_\alpha\}\right] + \tentropy\left[\{\varphi_\alpha\}\right] -\beta\left(\sum_\alpha \occ_\alpha\energy_\alpha[\varphi_\alpha]-\tilde{E}\right) \\ +&\beta\upmu\left(\sum_\alpha\occ_\alpha - N\right) - \sum_{\alpha\beta}\upeta_{\alpha\beta}\left(\braket{\varphi_\alpha}{\varphi_\beta}-\delta_{\alpha\beta}\right)\bigg\}=0, \end{align*}

where ${\beta}$, ${\beta\upmu}$, and ${\{\upeta_{\alpha\beta}\}}$ are Lagrange multipliers to enforce, respectively, the information constraints ${E=\tilde{E}}$ and ${\sum_\alpha\occ_\alpha=N}$, and the set of orthonormality constraints, ${\left\{\braket{\varphi_\alpha}{\varphi_\beta}=\delta_{\alpha\beta}, \forall\;\alpha,\beta\right\}}$.

The partial derivatives with respect to ${\occ_\alpha}$ of ${\energy_\alpha}$, ${\tentropy}$, and ${\braket{\varphi_\alpha}{\varphi_\beta}}$ all vanish. Therefore, at stationarity, the occupation numbers satisfy

\begin{align*} \pdvone{\occ_\alpha}\left[\entropy\left[\{\occ_\alpha\}\right] - \beta \sum_\alpha \occ_\alpha\energy_\alpha+\beta\upmu\sum_\alpha\occ_\alpha\right]=0. \end{align*}

It is straightforward to show from this expression and Eq. (77) that the occupation numbers of this possibly-classical system are

\begin{align*} \occ_\alpha = \frac{1}{1+e^{\beta(\energy_\alpha-\upmu)}}, \end{align*}

which means that the distribution of occupation numbers among the orbitals’ energies is a Fermi-Dirac distribution.
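For completeness, the intermediate step can be written out, assuming that $\log$ denotes the natural logarithm. Differentiating Eq. (77) with respect to a single occupation number gives

\begin{align*} \pdvone{\occ_\alpha}\entropy\left[\{\occ_\alpha\}\right] = -\log\occ_\alpha - 1 + \log(1-\occ_\alpha) + 1 = \log\left(\frac{1-\occ_\alpha}{\occ_\alpha}\right), \end{align*}

so the stationarity condition becomes

\begin{align*} \log\left(\frac{1-\occ_\alpha}{\occ_\alpha}\right) = \beta\left(\energy_\alpha-\upmu\right) \implies \frac{1-\occ_\alpha}{\occ_\alpha} = e^{\beta(\energy_\alpha-\upmu)} \implies \occ_\alpha = \frac{1}{1+e^{\beta(\energy_\alpha-\upmu)}}. \end{align*}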

This derivation is only valid in the non-interacting limit, which is the limit that must be assumed to derive the Fermi-Dirac distribution for quantum mechanical particles. A classical derivation of the Bose-Einstein distribution is presented in [Tangney, 2024].
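The role of the two Lagrange multipliers can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the procedure used in the text: the orbital energies and the value of ${\beta}$ are hypothetical, ${\upmu}$ is found by bisection so that the occupation numbers sum to the particle number, and the stationarity condition ${\log[(1-\occ_\alpha)/\occ_\alpha]=\beta(\energy_\alpha-\upmu)}$ is then checked for each orbital.

```python
import math


def fermi_dirac(energy, beta, mu):
    """Occupation number 1 / (1 + exp(beta * (energy - mu)))."""
    x = beta * (energy - mu)
    if x > 0:  # rewrite to avoid overflow in exp for large positive x
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def solve_mu(energies, beta, n_particles, iterations=200):
    """Bisect for the mu at which the occupation numbers sum to n_particles.

    Assumes 0 < n_particles < len(energies); the total occupation
    increases monotonically with mu.
    """
    lo = min(energies) - 50.0 / beta
    hi = max(energies) + 50.0 / beta
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        total = sum(fermi_dirac(e, beta, mid) for e in energies)
        if total < n_particles:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical orbital energies (arbitrary units), beta, and particle number.
energies = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
beta, n_particles = 2.0, 3
mu = solve_mu(energies, beta, n_particles)
occupations = [fermi_dirac(e, beta, mu) for e in energies]

print("mu =", mu, " sum of occupations =", sum(occupations))
for e, occ in zip(energies, occupations):
    # Both columns should agree if the occupations are stationary.
    print(math.log((1.0 - occ) / occ), beta * (e - mu))
```

Bisection is used here only because the total occupation is a monotonic function of ${\upmu}$; any one-dimensional root finder would serve equally well.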

14.3 Localized orbital based models of bonding

Since the early days of quantum theory it has been common to approximate the expectation value of the energy of a set of electrons as ${E\approx \Esingleapprox+\Emf+\Erestapprox}$, where ${\Esingleapprox}$ approximates the expectation value $\Esingle$ of the sum of the electrons’ $1$-particle energies; $\Emf$ is a $1$-particle mean field approximation to the expectation value of the sum of $2$-particle interaction energies; and

\begin{align*} \Erestapprox\approx E-\Esingleapprox-\Emf\approx \Erest\equiv E-\Esingle-\Emf. \end{align*}

The true $1$-particle contribution, $\Esingle$, is the expectation value of the sum of the electrons’ kinetic energies and their energies of interaction with any external fields, such as the electric field from nearby nuclei.

The main reason to express $E$ as a sum of $1$-electron contributions plus a correction, and to make the correction as small as possible by augmenting the $1$-electron contributions with a mean field contribution, $\Emf$, is that $1$-electron energies and orbitals are relatively easy to calculate, understand, and relate to one another.

Their ease of use and relative conceptual simplicity are partly responsible for orbitals playing a pervasive role in the teaching of chemistry, and in the interpretation and communication of chemical research. Another reason for their widespread use is that orbital-based reasoning can undoubtedly be predictive; and orbital-based calculations of bond lengths, excitation energies, and other measurables, can be very accurate.

The prominence of orbitals within chemistry’s lexicon might also be, to some small degree, an artefact of the history of quantum theory, including some very early and wildly inaccurate speculations about the natures of chemical bonds [Lewis, 1916; Gillespie and Robinson, 2007]. Notable contributors to the proposal and subsequent refinement, or abandonment, of early models of electronic structure include Gilbert Lewis [Lewis, 1916], Irving Langmuir [Langmuir, 1919], Hund [Hund, 1926], Mulliken [Mulliken, 1928], Lennard-Jones [Lennard-Jones, 1929], and Linus Pauling [Pauling, 1926; Pauling, 1928; Pauling; Pauling; Pauling; Pauling; Pauling; Pauling, 1960]. Dramatic progress in our understanding of chemical bonding was made in the ${\sim 15}$ years following the publication of [Lewis, 1916], and many of the discoveries made remain useful and justified by theory. However, some artefacts of early models of bonding, which have never been justified experimentally or by rigorous theory, are still being presented in introductory chemistry textbooks as established features of reality.

An example is the idea that the instantaneous electronic structure of an atom can approximately, but realistically, be specified as a set of highly-localized nucleus-centered orbitals and their occupation states. In the simplest versions, only four occupation states are possible: If an orbital is not empty ($-$), it can be occupied by one ‘spin up’ electron ($\spinup$), one ‘spin down’ electron ($\spindown$), or both (${\spinupdown}$).

The premise that an atom’s electron cloud has this substructure of integer-occupied localized orbitals is then used as a basis for simplistic descriptions of chemical bonds. For example, an ionic bond is often explained as a Coulomb attraction between two ions, which are formed from two atoms when one or more of one atom’s electrons occupy one or more of another atom’s orbitals. However, the charges of ions in a real ionic bond are not integer multiples of an electron’s charge, because the bonded ions are not formed by atoms transferring electrons between them. They are formed by the atoms transferring probability, or number density, between them.

Therefore, although the electron cloud of an isolated charge-neutral atom has a number density ${n}$ whose integral is an integer (i.e., the atom’s atomic number), the integral of ${n}$ over the electron cloud of a bonded ion is not, in general, an integer, which implies that the charges of bonded ions are not integer multiples of $e$. For example, even in the canonically-ionic compound NaCl, the magnitudes of the ions’ charges are ${\sim 0.8\, e}$ [Jennison and Kunz, 1976; Bao et al., 2018; Kvashnin et al., 2019; Li et al., 2007].

The traditional concept of a covalent bond is an example of a failing of the integer-occupied-orbital model of bonding that cannot be fixed by allowing the orbitals to have fractional occupancies. Covalency is often regarded as a qualitatively-distinct type of bond, or mechanism of bonding; and a covalent bond is often described as the attraction of a pair of atoms to electrons that they share between them, and which occupy orbitals that are hybrids of orbitals from both atoms. However, if a pair of atoms shared electrons between them, the electron density would have a local maximum between them; but it never does: Maxima of the electron density coincide with maxima of the electric potential [Ayers and Parr, 2003], which coincide with the positions of the nuclei.

Both of these misconceptions have been perpetuated by the widespread use of approximations that are based on simplifying the mathematical form of the many-electron wavefunction (${\Psi}$) to make calculations tractable. For example, the Hartree-Fock approximation, which is probably the simplest useful mean-field approximation, is based on restricting ${\Psi}$ to be a Slater determinant. When $\Psi$ has this form, integer-occupied single particle states appear to have clear physical meanings. When the single particle states in the determinant are linear combinations of atom-centered basis functions, such as the orbitals in the Hartree-Fock wavefunctions of isolated atoms, each basis function ‘belongs’ to one of the atoms. Therefore, there appears to be a clear and meaningful qualitative distinction between a covalent bond and an ionic bond. However, this is an artefact of $\Psi$’s simplified form.

The primary reason for approximating $\Psi$ as a determinant, or as a sum of a few determinants, is not to simplify bonding conceptually, but to make calculations tractable. Therefore it seems valid to question whether localized atomic or molecular orbitals help to simplify bonding conceptually, or whether they complexify and obfuscate it. The artificial qualitative distinction between covalent bonding and ionic bonding illustrates that, at least when building the most basic understanding of bonding from the most computationally tractable form of wavefunction, they can be misleading.

Although most research scientists understand this, and also understand that the terms ionic, covalent, and metallic refer to varying degrees of localization of the electron density around nuclei, students are still being taught more traditional and misleading ideas [Bacskay et al., 1997; Zürcher, 2018; Grundmann, 2016; McQuarrie et al., 2011; Sutton, 2024; Owen, 2014]. Therefore the purpose of this subsection is to emphasize that, in some ways and for some purposes, the essence of bonding is simpler than it appears from descriptions of it in terms of localized orbitals.

Within rigorous theory, there appear to be only two ‘types’ of bonding, which are really two opposing idealized limits: One is the limit in which all of the electron density is strongly localized around nuclei by the electrons’ Coulomb attraction to them, and the other is the limit in which some of the electron density is localized around nuclei, and the rest of it is uniformly distributed. The former is the ionic limit, the latter is the metallic limit, and there is no theoretical or empirical reason to believe that covalent bonding is anything other than bonding that does not conform closely to either limit.

14.3.1 Summary of chemical bonding in terms of electron density

The electron density ($n$) is high where the microscopic electric potential from the nuclei ($\phinuc$) is high, and it only has maxima at the positions of the nuclei. If two nuclei shared electrons between them, there would be a local maximum of the density between them, but there never is.

The shape of $n$ is determined by the shape of $\phinuc$; and $\phinuc$ is determined by the charges and positions of the nuclei. Localizing density where $\phinuc$ is high makes the potential energy of attraction between electrons and nuclei more negative, but delocalizing density makes both the electrons’ kinetic energy and their mutual repulsion less positive. The ground state density is the lowest-energy compromise between these localizing and delocalizing influences.
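The simplest instance of this compromise is a single electron bound to a single proton, for which there is no mutual repulsion and only the kinetic and attraction terms compete. The sketch below is a textbook variational illustration, not taken from the text: it assumes a trial orbital proportional to ${e^{-r/a}}$ and Hartree atomic units, in which the kinetic energy is ${1/(2a^2)}$ and the energy of attraction to the nucleus is ${-1/a}$. The names `trial_energy` and `widths` are hypothetical.

```python
def trial_energy(width):
    """Energy (hartree) of a trial orbital exp(-r/width) in a hydrogen atom."""
    kinetic = 1.0 / (2.0 * width**2)   # grows as the orbital is localized
    attraction = -1.0 / width          # more negative as the orbital is localized
    return kinetic + attraction


# Scan widths from 0.2 to 5.0 bohr and pick the lowest-energy compromise.
widths = [0.01 * i for i in range(20, 501)]
best_width = min(widths, key=trial_energy)

print("optimal width (bohr):", best_width)
print("energy (hartree):", trial_energy(best_width))
```

Shrinking the orbital below the optimal width costs more kinetic energy than it gains in attraction, and expanding it does the reverse, so the minimum lies at a finite width (one bohr radius for this trial function).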

Most of the density is localized around nuclei, and most of it is at points where its gradient ${\grad n}$ is directed towards the nearest nucleus.

Ionic bonding:

The bonding for which a superposition of spherically symmetric electron densities most closely approximates the true electron density is referred to as ionic bonding; and the ionic limit of bonding is the limit in which an approximation of this form becomes exact.

If the time average of the net charge of each nucleus and its almost-spherical electron cloud were zero, the attraction between atoms would be very weak and arise from electron correlation, rather than electrostatics (see Appendix I.5). Atoms bind chemically by becoming ions, thereby lowering the potential energy via their mutual attraction. However, they do not become ions by donating or accepting electrons, but by donating or accepting electron density. When atoms are close enough to one another to bond chemically, and on the shortest time scales relevant to atomic motion ($\sim 10^{-15}$ seconds), there is no theoretical reason why ions’ average charges are likely to be integers or close to integers.

Covalent and metallic bonding:

If more of the electron density is in regions where ${\grad n}$ is not directed towards the nearest nucleus, we describe the bonding as either covalent or metallic. Bonding is metallic if the density in these regions is so delocalized that a significant fraction of the electrons are mobile. Otherwise we refer to it as covalent, for historical reasons. There is not a clear boundary between ionic and covalent bonding.

Large atomic numbers, delocalized electrons, and high coordination numbers go hand in hand because, when an atom’s radius is large, the interactions of its nucleus with electrons on the outskirts of its electron cloud are weak, and comparable in magnitude to its interactions with electrons on the outskirts of neighbouring atoms. Therefore energy is lowered by atoms arranging such that each one has many neighbours, whose electrons its nucleus interacts with. The energy is lowered further by the electron density on the outskirts of atoms delocalizing, so that more electrons have interactions of comparable strengths with multiple nuclei. This delocalization worsens the approximation of the density as a superposition of spherical densities.

The metallic limit is the limit in which the electron density becomes a superposition of spherically-symmetric densities and a uniform density.