16. A potential paradox
https://doi.org/10.48550/arXiv.2403.13981
In this section I highlight some subtleties in the meaning and definition of the microscopic potential $\phi$ and its relationships with its macroscopic counterpart ${\bphi}$ and the microscopic charge density $\rho$. As mentioned at the beginning of Sec. 15, the value of the MIP is believed to be positive [Sanchez and Ochando, 1985; Wilson et al., 1987; Wilson et al., 1988; Wilson et al., 1989; Pratt, 1992; Sokhan and Tildesley; Leung, 2010; Kathmann et al., 2011; Cendagorta and Ichiye, 2015; Blumenthal et al., 2017; Hörmann et al., 2019; Yesibolati et al., 2020; Madsen et al., 2021; Kathmann, 2021]. This contradicts my finding that it is zero. Therefore, to illustrate the subtleties, I examine, from several different perspectives, Hans Bethe’s 1928 derivation of an approximate expression for the MIP, which is sometimes known as the Bethe potential (${\bphibethe}$). I begin, in Sec. 16.1, by outlining his derivation and line of reasoning.
My focus is on the ‘paradox’ referred to in the section title and I do not address the question(s) of most relevance to those using the Bethe potential, or one of its descendants, as a parameter in the analysis and/or interpretation of their calculations (e.g., theoretical electrochemistry) or experiments (e.g., electron holography). In most of these applications $\bphibethe$ is used as an approximation to the average potential experienced by an electron as it passes through the material. This quantity is likely to depend heavily on the electron’s energy as it enters the material and the time that it spends inside the material. Furthermore, one should not calculate it from the probability density ${n(\rvec)}$ of an electron being at ${\rvec}$, but from the conditional probability density ${n_c(\rvec_1|\rvec_2)}$ of there being an electron at ${\rvec_1}$ given that the probe electron is at ${\rvec_2}$.
As an illustration of the importance of basing calculations on ${n_c}$ rather than $n$, consider the example of a neutral atom meeting a stray electron in a vacuum. One might deduce from the atom’s electron density ${n(\rvec)}$ that they would not be attracted to one another; but by considering how the distribution of the atom’s electrons is changed by their interaction with the stray electron, one can quickly deduce that they do attract one another and that all singly-charged anions are stable in vacuum when the electrons’ temperature is sufficiently low.
16.1 Mean inner potential (Bethe's fallacy)
Bethe assumed that the charge densities of materials are not too dissimilar from a superposition of atomic charge densities. For a crystal with one spherically-symmetric atom in its primitive unit cell $\unitcell$, the expression he derived is
Bethe reasoned that the average potential in the crystal is the potential emanating from one atom, integrated over all points in space, and divided by the volume per atom, i.e., the ${R\to \infty}$ limit of
Bethe expressed each atom’s charge density (charge per unit volume at a distance $r$ from its center) as ${\rho(r)=\rhop(r)+\rhom(r)}$, where ${\rhop}$ is the density of nuclear charge and ${\rhom(r) = -e \, n(r)}$ is the density of electron charge. He expressed ${\rhop}$ as a delta distribution, i.e.,
If, instead, we assume that ${\rhop}$ has a finite width and denote the total nuclear charge by ${Q_+}$, we can use the atom’s overall charge neutrality, i.e.,
I will now show, by illustration, why this obviously-right result must be wrong. Then I will explain the flaw in Bethe’s reasoning and show that a more careful treatment of the problem leads to the conclusion that ${\bphi}$ is zero. I illustrate the flaw from several different perspectives to highlight some of the many pitfalls that exist when working with the electric potential. Readers who have already spotted the flaw, or who don’t like whodunnits, might want to skip the illustrations and proceed directly to Sec. 16.2.
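Before turning to the illustrations, the kind of result that Bethe's line of reasoning produces can be checked numerically. The sketch below is a minimal example under stated assumptions: a hydrogen-like atom with a point nucleus and a 1s electron density, atomic units in which ${\kappa = e = a_0 = 1}$, and the standard textbook form of the Bethe-type estimate, ${(2\pi\kappa e/3)\int r^2\, n(r)\,\dd^3 r}$ per atom, which is not copied from the equations of this section. The function names and the radial cutoff are illustrative choices.

```python
import math

def atom_potential(r):
    """Potential of a point proton plus a 1s electron cloud (atomic units)."""
    return (1.0 / r + 1.0) * math.exp(-2.0 * r)

def integrate(f, a, b, n=200_000):
    """Composite midpoint rule; never evaluates f at the endpoints."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

R_CUT = 40.0  # assumed radial cutoff, large compared with the atom's size

# Potential emanating from one atom, integrated over all points in space.
total_potential = integrate(
    lambda r: 4.0 * math.pi * r * r * atom_potential(r), 0.0, R_CUT
)

# Second moment of the 1s electron density n(r) = exp(-2 r) / pi.
mean_r2 = integrate(
    lambda r: 4.0 * math.pi * r**4 * math.exp(-2.0 * r) / math.pi, 0.0, R_CUT
)

# Bethe-type estimate per atom, before division by the volume per atom.
bethe_form = (2.0 * math.pi / 3.0) * mean_r2

print(total_potential, bethe_form)
```

Both numbers are positive and agree with one another; dividing either by the volume per atom would give the positive value of ${\bphibethe}$ that the rest of this section calls into question.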
16.1.1 Existence of a flaw - Illustration 1
Bethe chose to build his material from a spherically-symmetric charge density, with a localized distribution of positive charge at its center and a relatively delocalized distribution of negative charge surrounding it. However, just as there is no ‘right’ way to partition the charge density of a material for the purpose of defining its average dipole moment density (see Sec. 5.0.1), there is no right way to partition it for the purpose of calculating its average potential. It is no less justified to build a crystal from a superposition of charge densities of the form
An example of a building block of this form is shown schematically for a 2-d crystal in Fig. 16. It has negative charge at its center, positive charge further away, and it is charge neutral overall, meaning that the net flux of the electric field through any surface that encloses it is zero, as it is for an atom. The flux from an atom is zero at all points on a surface enclosing it, whereas the flux from the charge density of Eq. (94) is finite almost everywhere on a surface enclosing it, but with regions of the surface where it is positive and regions where it is negative. Nevertheless, Gauss’s law implies that the net potential outside the surface from charge within it is zero, as it is for an atom.
Using the same physical reasoning with which Bethe deduced that ${\bphibethe>0}$ for a material built from atoms, it can be shown that ${\bphibethe<0}$ in a material built from this charge distribution, because ${s_+=A > s_-}$. Furthermore, the magnitude of ${\bphibethe}$ depends on the value of $A$. For example, consider a simple cubic crystal with lattice spacing ${a}$ and let us build it from Eq. (94) with the choice ${A=a}$ (${\implies N_A=6}$). I will refer to the crystal built in this way as Crystal 2, I will refer to the crystal built by Bethe from atoms as Crystal 1, and I will denote their MIPs, as derived using Bethe’s approach, by ${\bphibethe_2}$ and ${\bphibethe_1}$, respectively. Then if, following Bethe, we assume the nuclei to be localized at a point, we find that
Crystal 1 and Crystal 2 are identical in the bulk; they differ only near surfaces. However, because all surfaces of each crystal are charge-neutral (${\bsigma=0}$), and because ${\Rho=0}$ in the bulk in each case, the macroscale theory would not be internally consistent if ${\bphi>0}$ in one case and ${\bphi<0}$ in the other. If the value of ${\bphi}$ is defined it must be the same in each case because the macrostructures of the two crystals are identical: electrostatically, each one is indistinguishable from empty space.
Crystal 1 and Crystal 2 could be made identical by adding a pair of charges of opposite signs, of magnitudes ${Q_+/6}$, and separated by a distance $a$, to each surface unit cell of one of the crystals. For example, to make Crystal 2 into Crystal 1 we would have to add a charge of ${Q_+/6}$ to the center of each of the electron clouds closest to its surface and a charge of ${-Q_+/6}$ at a displacement ${a\,\hat{n}}$ from the first charge, where ${\hat{n}}$ is the surface’s outward unit normal. Let us temporarily assume that, from the perspective of a point whose depth below the surface is much greater than $a$, this is equivalent to adding to the surface an approximately-uniform areal density of dipole moments,
Unfortunately, although we have corrected the difference between the two values of ${\bphibethe}$, this does not solve our problem. We still do not have any reason to prefer one building block over another; therefore we do not have any reason to prefer one of the resulting crystal surface structures over the other. We have two derivations, which appear equally valid, and from which we deduce two different values of the MIP. This appears to imply that there is a flaw in the construction that Bethe used for his derivation.
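The bookkeeping behind the comparison of Crystal 1 and Crystal 2 can be sketched numerically. The example below rests on several assumptions that are not taken verbatim from the text: the building block of Eq. (94) is modelled as an electron cloud with rms radius ${s_-}$ surrounded by six point charges ${Q_+/6}$ at distance ${A=a}$; the Bethe-type average is taken to be ${-(2\pi\kappa/3)}$ times the second moment of the building block's charge density divided by the cell volume; and the step in potential across a dipole layer of areal density $p$ is taken to be ${4\pi\kappa\, p}$. All variable names and the value of `s_minus` are illustrative.

```python
import math

KAPPA = 1.0         # Coulomb constant (assumed units)
Q_PLUS = 1.0        # total nuclear charge per building block
a = 1.0             # simple-cubic lattice spacing
s_minus = 0.3 * a   # assumed rms radius of the electron cloud

def bethe_average(second_moment, cell_volume):
    """Bethe-type average: -(2 pi kappa / 3) * (integral of r^2 rho) / volume."""
    return -(2.0 * math.pi * KAPPA / 3.0) * second_moment / cell_volume

cell_volume = a**3
electron_moment = -Q_PLUS * s_minus**2   # second moment of the electron cloud

# Crystal 1: point nucleus at the centre, so its second moment vanishes.
phi_1 = bethe_average(electron_moment, cell_volume)

# Crystal 2: six point charges Q_PLUS/6 at distance A = a from the centre.
A = a
phi_2 = bethe_average(electron_moment + 6 * (Q_PLUS / 6.0) * A**2, cell_volume)

# Dipole layer: one dipole of moment (Q_PLUS/6) * a per surface cell of area a^2.
dipole_density = (Q_PLUS / 6.0) * a / a**2
dipole_step = 4.0 * math.pi * KAPPA * dipole_density

print(phi_1, phi_2, phi_1 - phi_2, dipole_step)
```

Under these assumptions the two Bethe-type values have opposite signs, and their difference equals the step in potential across the dipole layer, independently of the value chosen for `s_minus`.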
Bethe did not involve surfaces in his derivation because, when ${\bsigma=0}$, he regarded the MIP as a property of the bulk. However, the fact that ${\bphibethe_1}$ and ${\bphibethe_2}$ differ suggests that the MIP is a surface property, which can be changed by adding or removing equal amounts of positive and negative charge at each surface. This is problematic if we wish to identify ${\bphibethe}$ as the macroscopic potential ${\bphi}$ because, as mentioned above, the addition of a layer of microscopic dipoles to a surface should not change ${\bphi}$ in an internally-consistent linear macroscale theory. This is because, at the macroscale, a microscopic distance is equivalent to ($\Lequiv$) a distance of zero (see Sec. 8 and Sec. 9), and because a layer of microscopic dipoles, ${qa\hat{n}}$, is equivalent to two layers with equal and opposite charges per unit area that are separated by a distance $a$. Since ${a\Lequiv 0}$ these two layers are equivalent, at the macroscale, to a single charge-neutral layer, which would not change ${\bphi}$ inside the crystal. Therefore if Bethe’s derivation is right, and if ${\Rho}$ is a linear spatial average of $\rho$, the MIP cannot be identified as the macroscopic potential because that would be tantamount to saying that the same macroscale distribution of charge can give rise to different values of ${\bphi}$. This would imply that, even when $\Rho$ and the macroscale boundary conditions are known, the value of $\bphi$ cannot be calculated; its value depends, in some way, on certain microscopic details of $\rho$ that are lost by the ${\rho\mapsto\Rho}$ homogenization transformation.
If we can assume that ${\phi}$ is a linear functional ${\phi[\rho]}$ of $\rho$, and that $\Rho$ is a linear spatial average $\expval{\rho}$ of $\rho$, then the linearity of both operations implies that ${\phi[\Rho]= \phi[\expval{\rho}]=\expval{\phi[\rho]}}$. Therefore, if there exist two microscopic charge densities, $\rho$ and ${\rho+\Delta\rho}$ , with the same macroscopic charge density ${\Rho}$ (${\implies\expval{\Delta\rho}=0}$) and different macroscopic potentials, $\bphi$ and ${\bphi+\Dbphi}$, then ${\bphi}$ does not equal ${\phi[\Rho]}$ and is a nonlinear functional ${\bphi[\rho]}$ of $\rho$. Linearity would imply that
Returning to the example of Crystal 1 and Crystal 2: if we rigidly shift the MIP of Crystal 2 by ${\Delta \phi>0}$ by coating its surfaces with a layer of dipoles, the same layer would shift the average potential in the vacuum just outside the crystal by ${-\Delta \phi}$. Outside Crystal 1, $\phi$ appears to be zero because the field from each atom is zero. Outside Crystal 2, the average of $\phi$ appears to be zero because the average electric field emanating from each building block is zero. Adding a dipole layer to Crystal 2 to turn it into Crystal 1 appears to shift the mean vacuum potential (MVP) in a layer surrounding the crystal up, while shifting $\bphibethe$ down by the same amount. In the limit of large distance from the crystal the potential vanishes, because the crystal is charge neutral overall, but it does not begin to decrease in magnitude significantly until the distance to the closest surface is comparable to one or more of the surface’s linear dimensions; therefore the MVP is shifted by ${-\Delta \phi}$ in a macroscopic layer of vacuum surrounding the crystal. So if Bethe’s derivation was correct, and if a macroscopic layer of microscopic dipoles could shift the average potential in a macroscopic region, the MVP would be zero in a macroscopic layer of vacuum surrounding the crystal both before and after it had been shifted by a finite amount ${-\Delta \phi}$! Clearly, this is absurd.
Now consider Fig. 17, which uses the concept of a dipole layer to illustrate one argument for why the MIP is positive. The crystal in question is identical to Crystal 1, so this construction appears to validate Bethe’s result. However, there is no justification for dividing the material’s microstructure into the blue and pink layers. If, for example, we combined each adjacent pair of pink and blue layers into a single charge-neutral layer, we would find that the MIP vanishes. Therefore, as with the construction Bethe used in his derivation, two equally-justified ways to partition and spatially-average the microstructure lead to two different values of the MIP.
The superposition principle, on which Bethe’s derivation and much of electromagnetic theory are based, allows us to do the following: Let us partition the space $\materialregion$ occupied by an electron cloud into $M$ partitions of volume ${\materialregionvolume/M}$ and let us divide the nucleus’s charge into $M$ ‘pieces’, such that for each partition there is a piece of nucleus with the same magnitude of charge. Now, after taking the large-$M$ limit, let us displace the pieces to the partitions so that each one becomes charge neutral. The atom’s initial spherical symmetry ensures that, after displacing all of the pieces, it is again spherically symmetric. After this redistribution of charge, the MIP must be zero because the nucleus and every partition have become charge neutral. If we view the displacement of nuclear charge as the superposition of the negative of an atom’s charge density on each atom, this makes sense. We have simply superimposed a crystal’s charge density and its negative, so of course the MIP of the superposition vanishes. However, we could also view the displacement of each piece of nucleus as the placement of a negative charge at the nucleus and a positive charge in the partition. Placing a dipole at a point in space changes the potential everywhere but, by symmetry, it cannot change the spatial average of the potential. Therefore placing all of these dipoles inside the crystal should not change the MIP. It appears that the superposition principle does not apply.
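The symmetry argument for a single dipole can be illustrated with a small numerical sketch. It assumes a finite dipole made of charges ${\pm q}$ on the $z$-axis at ${z=\pm d/2}$, units in which ${\kappa=1}$, and averages taken over spheres centred on the dipole's midpoint; the function name, the quadrature, and the parameter values are illustrative choices rather than anything taken from the text.

```python
import math

def shell_average(charge, z0, r, n=2000):
    """Average, over a sphere of radius r centred on the origin, of the
    potential of a point charge located on the z-axis at z = z0 (kappa = 1)."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        c = -1.0 + (i + 0.5) * h                      # cos(theta), midpoint rule
        dist = math.sqrt(r * r + z0 * z0 - 2.0 * r * z0 * c)
        total += charge / dist * h
    return 0.5 * total                                # (1/2) * integral over cos(theta)

q, d = 1.0, 0.5
radii = [0.1, 1.0, 5.0]

# Shell averages of the dipole's potential at several radii.
dipole_averages = [
    shell_average(+q, +d / 2, r) + shell_average(-q, -d / 2, r) for r in radii
]
# For comparison: the shell average of the positive charge alone.
single_charge_average = shell_average(+q, +d / 2, 1.0)

print(dipole_averages, single_charge_average)
```

Each charge alone has a nonzero shell average, but the two contributions cancel on every shell centred on the midpoint, so the average over any such sphere vanishes. The cancellation relies on the averaging region sharing the dipole's symmetry, which is the assumption built into this sketch.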
16.1.2 Existence of a flaw - Illustration 2
Another way to see that there must be a problem with Bethe’s result is to treat electrons as point particles instead of expressing ${\rhom}$ as a smooth and delocalized density. Using a line of reasoning similar to Bethe’s we could say that the total potential in each unit cell from all electrons and nuclei outside the cell is approximately equal to the sum of the total potentials emanating from the particles inside the cell. Then we could calculate the total potential from each point particle at all points within a distance $R$ of it, add together the total potentials from all particles within each unit cell of the crystal, and take the limit of large $R$ to get the total potential emanating from each unit cell. This total would vanish because the potential emanating from a point charge ${Ze}$ is the negative of ${Z}$ times the total potential emanating from a point charge ${-e}$. Cancellation is obvious when ${Z=1}$ (hydrogen), but Bethe’s construction does not exclude this case, either explicitly or implicitly. Therefore his derivation leads to the conclusion that the magnitude of the spatial average of the potential from a proton is greater than the magnitude of the spatial average of the potential from an electron.
This suggests that the problem in Bethe’s derivation might be related to his use of a continuous charge density for electrons and a (discrete) delta distribution of charge for nuclei. Usually this form of $\rho$ is regarded as a time average of the true charge distribution: electrons are whizzing around the more massive nuclei so fast that their charge, when observed on a timescale of about ${10^{-16}-10^{-15}}$ seconds, appears to be smeared into a continuous charge density. This timescale is too short for nuclei to move significantly, but long enough for each electron to trace out a very long trajectory. Nevertheless, because the integral of
16.2 Flaw in Bethe's reasoning
The flaw in Bethe’s derivation is that he calculated the electrons’ contribution to the potential from a volumetric density of negative charge, ${\rhom(\rvec) = -e\,n(\rvec)}$, which is defined at all points in space. Then, because the electron density is spread over a greater volume than the nuclear density on femtosecond time scales, the spatial average of the potential does not vanish.
To understand why it does not vanish, let us again consider the potential ${\phi_r(r;\eta)}$ at a distance $r$ from the center of a spherically-symmetric nonpositive or nonnegative charge density ${\rho(u;\eta)}$, where the width of ${\rho}$ is proportional to the value of $\eta$, which is a parameter. Let us denote the integral of $\rho$ within a sphere of radius $r$ by ${Q(r;\eta)}$ and its integral over all space by
Now let us consider the average of $\phi_r$ over all points within a fixed distance $R$ of the center of $\rho$ as the value of $\eta$ changes. When ${\eta/R}$ is very small, the magnitude of $\phi_r$ at almost all points is approximately ${\kappa \abs{Q_\infty}/r}$ and it is only significantly smaller than that value in a volume fraction ${\sim \left(\eta/R\right)^3}$ of the sphere. In the limit ${\eta/R\to 0}$, the average potential is the same as it would be if $\rho$ was the delta distribution of a point charge. However, as $\eta$ increases, the magnitude of the average potential in the sphere of radius $R$ decreases because the fraction of the volume occupied by points at which $\abs{\phi_r}$ is significantly less than ${\kappa \abs{Q_\infty}/r}$ increases. Therefore, the average potential in the sphere reduces in magnitude as $\eta$ increases.
This is why the potential from the electrons does not cancel the potential from the nuclei in Bethe’s derivation: the value of ${\eta}$ is finite for the electrons, but vanishingly small for nuclei, which makes the magnitude of the average potential from the nuclei greater than that from the electron cloud.
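The dependence of the average potential on the width $\eta$ can be illustrated with a short numerical sketch. It assumes, purely for illustration, that $\rho$ is a Gaussian of standard deviation $\eta$ and total charge ${Q_\infty}$, for which the potential is ${\kappa\, Q_\infty \operatorname{erf}\!\big(r/(\sqrt{2}\eta)\big)/r}$; the Gaussian form, the units (${\kappa = Q_\infty = R = 1}$), and the function name are assumptions and are not taken from the text.

```python
import math

def sphere_average_potential(eta, R=1.0, Q=1.0, kappa=1.0, n=20_000):
    """Average, over a sphere of radius R, of the potential of a Gaussian
    charge density of standard deviation eta centred in the sphere."""
    h = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h                                   # midpoint rule in r
        phi = kappa * Q * math.erf(r / (math.sqrt(2.0) * eta)) / r
        total += r * r * phi * h
    return 3.0 * total / R**3

widths = [0.01, 0.1, 0.3, 1.0]
averages = [sphere_average_potential(eta) for eta in widths]
point_charge_limit = 1.5    # 3 kappa Q / (2 R) for a point charge

for eta, avg in zip(widths, averages):
    print(f"eta/R = {eta:4.2f}   average potential = {avg:.4f}")
```

The average approaches the point-charge value when ${\eta/R}$ is small and decreases monotonically as $\eta$ grows, which is the behaviour described above.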
I will now explain, from three different perspectives, what is wrong with Bethe’s derivation and with his use of the charge density in Eq. (91).
16.2.1 Perspective 1
Bethe’s use of a continuous electron charge density ${\rhom(\rvec)=-e\,n(\rvec)}$ suggests that he interpreted it as the time average of the electrons’ instantaneous delta distribution of charge. However, ${n(\rvec)}$ is not a time average of the electrons’ positions; it is a probability density that an electron (any one of them) is at position $\rvec$ at any precisely-specified time. The average, over a time interval ${\interval(t,\Delta t)}$, of the delta distribution of a set of moving particles is not a volumetric charge density, but a set of linear charge densities defined only along the segments of the trajectories followed during ${\interval(t,\Delta t)}$. Therefore the time average of the electrons’ delta distribution is a set of linear charge densities defined on a set of curves and there is no charge at points that do not lie along these curves.
This means that the set of points at which the time-average of the electrons’ charge distribution is nonzero is a set whose measure in $\realthree$ is zero, regardless of the magnitude of ${\Delta t}$; therefore electrons are not more delocalized than nuclei because both occupy zero volume. Using the fact that ${Ze/r}$ is cancelled by ${Z\times -e/r}$, it is easy to show that the potential from the true time average of the electrons’ charge distribution exactly cancels the potential from the nuclei.
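The statement that the time average is a set of linear charge densities can be written out as a short worked equation. This is a sketch under assumptions: ${\rvec_i(t')}$ denotes the trajectory of electron $i$, ${\bar{\rho}_-}$ the time-averaged distribution, and ${\lambda_i}$ the charge per unit length along trajectory $i$; these symbols are introduced here for illustration, the interval ${\interval(t,\Delta t)}$ is assumed to have duration ${\Delta t}$, and each point of a trajectory is assumed to be visited only once during the interval.

```latex
\begin{align}
  \bar{\rho}_-(\rvec)
    &= \frac{1}{\Delta t}\int_{\interval(t,\Delta t)}
       \Big[-e\sum_i \delta\big(\rvec-\rvec_i(t')\big)\Big]\dd{t'} , \\
  \lambda_i
    &= -\frac{e}{\Delta t\,\abs{\dot{\rvec}_i}} ,
  \qquad
  \int_{\text{trajectory } i} \lambda_i \dd{s}
     = -\frac{e}{\Delta t}\int_{\interval(t,\Delta t)} \dd{t'} = -e .
\end{align}
```

The second line follows from ${\dd{s}=\abs{\dot{\rvec}_i}\dd{t'}}$: each electron deposits a fraction ${\dd{t'}/\Delta t}$ of its charge on the element of arc that it traverses during ${\dd{t'}}$, so the support of ${\bar{\rho}_-}$ is the set of trajectory segments and nothing else.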
16.2.2 Perspective 2
Although Eq. (95) appears to be the spatial average of the potential from a set of point nuclei and a continuous density of negative charge ${\rho_-(\rvec)=-e\,n(\rvec)}$, it is not. The quantity
However this reasoning does not apply to a continuous charge density because the magnitude of the charge enclosed by the surface scales like ${\left(\Delta r\right)^3}$ in the small ${\Delta r}$ limit. A correct application of Gauss’s law for a spherically-symmetric charge density leads to Eq. (89), which does not give the same result as Eq. (96). Eq. (96) is the correct expression for the expectation value of the potential at ${\uvec}$ from the electrons, but it is not the correct expression for the potential from charge density ${\rhom}$; when there is spherical symmetry, the latter is Eq. (89).
Eq. (95) can be expressed in the slightly more general form
16.2.3 Perspective 3
The relations ${\me=-\grad \phi}$ and ${\rho/\epsilon_0 = -\laplacian\phi}$ are preserved by the homogenization transformation because $\grad$, $\laplacian$, and the spatial average are all linear operations, which commute when they are applied in a mutually-consistent manner. For example, if $\expval{\;}_x$ and $\expval{\;}_y$ denote averages along the $x$-axis and $y$-axis, respectively, then
This is why it is not physically reasonable to calculate the MIP from the average of the green curve in Fig. 17 and it is why the value of the MIP deduced by averaging the charge distribution in layers parallel to the surface depends on the choice of the layers’ positions and thicknesses. For example, if, instead of the division into the pink (P) and blue (B) layers depicted in Fig. 17, each layer was chosen to be a layer of atoms, the green curve would be flat because each layer would be charge neutral. A layer of atoms comprises two pink layers and two blue layers in the order BPPB, so I will call it a BPPB-layer. One could choose the first layer at the surface to be a negatively charged B-layer and all others to be charge-neutral PPBB-layers. In that case the green curve would be a straight line with a negative slope; therefore, not only would the MIP appear to be negative, there would be a macroscopic electric field in the material emanating from the plane of negative charge. If the first layer is a BPP-layer and subsequent layers alternate between B- and BPP-layers, the MIP appears to be negative, with the potential as a function of depth resembling a skewed version of the negative of the green curve in Fig. 17.
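The dependence on the choice of layers can be illustrated with a toy one-dimensional model. The sketch below assumes that each blue layer is a sheet of charge ${-1}$ and each pink layer a sheet of charge ${+1}$ (in units of a surface charge density $\sigma$), that the field vanishes in the vacuum on the near side of the crystal, and that the field in units of ${\sigma/\epsilon_0}$ below each coarse-grained layer is the running sum of the layer charges. The function names and the number of atomic layers are illustrative, and the sign of the resulting slope depends on the orientation and boundary condition assumed here.

```python
from itertools import accumulate

N_ATOMIC_LAYERS = 5
B, P = -1, +1                          # blue sheets negative, pink sheets positive
sheets = [B, P, P, B] * N_ATOMIC_LAYERS

def coarse_grain(sheets, first_size, size=4):
    """Net charge of each coarse layer: one layer of first_size sheets,
    followed by layers of `size` sheets (the last one may be shorter)."""
    groups = [sum(sheets[:first_size])]
    for start in range(first_size, len(sheets), size):
        groups.append(sum(sheets[start:start + size]))
    return groups

def field_below_each_layer(groups):
    """Field, in units of sigma/epsilon_0, just below each coarse layer,
    assuming that the field is zero in the vacuum above the first layer."""
    return list(accumulate(groups))

bppb_layers = coarse_grain(sheets, first_size=4)    # BPPB, BPPB, ...
b_ppbb_layers = coarse_grain(sheets, first_size=1)  # B, PPBB, PPBB, ..., PPB

print(bppb_layers, field_below_each_layer(bppb_layers))
print(b_ppbb_layers, field_below_each_layer(b_ppbb_layers))
```

With BPPB-layers every coarse layer is neutral and the coarse-grained field vanishes, so the potential is flat. With a B-layer followed by PPBB-layers the same sheets give a uniform nonzero field throughout the interior, so the coarse-grained potential is linear in depth.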
The electron charge density used by Bethe in his derivation can be regarded as the result of performing the following sequence of temporal and spatial averages: first, the time average of the electron charge distribution is calculated to give a set of curves carrying linear charge densities; next, this set of linear charge densities is turned into a volumetric charge density by averaging in the radial direction over a small width ${\dd{r}}$; finally, the resulting distribution is given spherical symmetry by setting the charge density at distance $r$ equal to its spatial average on the spherical surface of radius $r$. The resulting charge density is then used to calculate the potential as a function of position along the radial direction, i.e., along an axis that, at its point of intersection with the surface on which the final spatial average is taken, is perpendicular to this surface. There is no reason to expect this procedure to produce results that are any more meaningful than those derived by partitioning the surface in Fig. 17 into artificial layers using an arbitrary, unjustified, and mutually-inconsistent sequence of partial spatiotemporal averages.
I avoided the problems with Bethe’s derivation in Sec. 15 by not making any assumptions about the microscopic charge density, except that it is mathematically smooth; and by only averaging along the $x$-axis; and by using the same mesoscopic interval width for all spatial averages. Because I used a general form of $\rho$, the derivations of Sec. 15 apply to charge distributions that are arbitrarily close to delta distributions; and Coulomb’s law can be applied to a smooth charge distribution in this limit.