Symmetry via Ergodic Theory

One of the attempts to quantize space without losing too much symmetry is ergodic theory. Much of my thesis belongs to this program. It is a flavor of quantum calculus, as “no limits” are involved.

The story is closely related to Jacob Feldman, one of the heroes of my graduate and postdoc time. I write this blog entry after having learned that this ergodic theorist died earlier this year. I personally met Jacob only once, at a dynamical systems conference when I was a Taussky-Todd instructor at Caltech during 1994/1995. By the way, I was also fortunate to meet Olga Taussky-Todd, the “torchbearer for matrix theory”, during that time. She visited once together with her husband John Todd, just to meet the Taussky-Todd instructors.

In ergodic theory one can look at space as an orbit of a discrete group on a Lebesgue probability space $(X,A,m)$. A convenient special case is when the group is Abelian, free and finitely generated, that is, when it is the group Z^d. This is the situation in which one has d commuting automorphisms T_1, \dots, T_d on X. A simple example is to let $(X,A,m)$ be the circle with normalized Lebesgue measure m as probability measure and to take T_k(x) = x+\alpha_k \; mod 1, where the \alpha_k are rationally independent real numbers. It is convenient to assume that the transformations T_k are all ergodic and that the action is free, meaning that the probability of the event A_n=\{ x  |  T^n(x)=x \} is zero for every n=(n_1, \dots ,n_d) different from (0,0, \dots ,0). Here T^n = T_1^{n_1} \cdots T_d^{n_d} is multi-index notation, which allows the action of the group Z^d on X to be written as (x,n) \to T^nx.

In probability theory, one also sees a Z^d action as a “multi-dimensional stochastic process”, where one can look at limit theorems or growth rates of sums of random variables. In ergodic theory, one is also interested in mixing properties and in spectral properties of the Koopman operators associated to the transformations. Geometry enters if one sees the maps d_kf =f(T_k)-f as partial derivatives. One can then go on and define the gradient, curl and divergence as in the case of differential forms on a manifold. There is then also a Dirac operator D = \sum_k d_k + d_k^*, which according to the celebrated Connes formula leads to a natural distance and so to a metric geometry on the \sigma-algebra: given two events A and B, define d(A,B) as the supremum of |f(A)-f(B)|_2 over all f in the Hilbert space which satisfy |Df|_2 \leq 1.

Each “space” given as an orbit of a dynamical system is now labeled by an “experiment” x in a probability space X. But it is in the very nature of probability theory, and especially of ergodic theory, that one is not interested in individual experiments but in “events”, which are measurable sets A in X, that is, sets in the \sigma-algebra of the probability space. In this way, space becomes probabilistic. This is useful, as it now allows one to make statements which are true “almost certainly”, that is, with probability 1. It is a mathematical setup which has proven to be very successful and which is at the heart of probability theory. In spectral theory for example, one can often make statements which are true almost everywhere while statements about individual elements are inaccessible. We are not interested in individual experiments; we are interested in sets of experiments which we can quantify with a numerical quantity, the probability. Probability theory smooths things out. It is important to note however that ergodic theory is much more general than what one usually looks at in probability theory. In the latter one usually has stochastic processes which show strong decorrelation or even Bernoulli properties. In ergodic theory, one can also look at processes which are very tame, like irrational rotations on a circle.

One of the most fascinating things is cohomology in the ergodic setup. It is even simpler to define than in geometry, as no “limits” are involved. Let us look at the one-dimensional case first, where we deal with one transformation T only. Take for example the rotation T: x \to x+\sqrt{2} \; mod 1 on the circle R/Z. Let us call a function f in the Hilbert space L^2(X,m) a “cocycle”. Call f a “coboundary” if f = dg= g(T)-g for some other g in the Hilbert space. The zero’th cohomology group is the quotient cocycles/coboundaries. If the coefficient group of the cohomology is not R but Z_2, we deal with functions taking the values 0 and 1. In other words, the set of cocycles is then the \sigma-algebra itself. The coboundaries are the events which can be written as A(T) + A, where + is the symmetric difference operation in the Boolean \sigma-algebra. As in topology, the zero’th cohomology measures some kind of connectivity of the underlying space. What happens (as we see below) is that even though “space is discrete”, the cohomology is uncountable. There are many individual connectivity components.
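For the rotation example, the coboundary condition can be written out in Fourier coefficients. This is the standard computation, added here as an illustration; the symbols \hat f_n and \hat g_n for the Fourier coefficients of f and g, and the general angle \alpha (with \alpha=\sqrt{2} in the example), are notation introduced only for this sketch.

```latex
% T(x) = x + \alpha \bmod 1, \qquad f, g \in L^2(\mathbb{R}/\mathbb{Z}),
% g(x) = \sum_n \hat g_n e^{2\pi i n x}
f = g(T) - g
\quad\Longleftrightarrow\quad
\hat f_n = \left( e^{2\pi i n \alpha} - 1 \right) \hat g_n
\qquad (n \in \mathbb{Z}).

% Necessary condition: \hat f_0 = 0. For n \neq 0 the formal solution is
\hat g_n = \frac{\hat f_n}{e^{2\pi i n \alpha} - 1},

% and f is a coboundary in L^2 exactly when \hat f_0 = 0 and
\sum_{n \neq 0} \frac{|\hat f_n|^2}{\left| e^{2\pi i n \alpha} - 1 \right|^2} < \infty .
```

For irrational \alpha the denominators never vanish for n \neq 0, but they come arbitrarily close to zero. Whether f is a coboundary therefore depends on how fast \hat f_n decays compared to these small divisors.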

Cohomology is relevant when studying Lyapunov exponents. This is how I slithered into the area while writing my diploma thesis under the guidance of Juergen Moser. The “senior thesis” was on the topic of the Stoermer problem. I ambitiously wanted to prove then that there is positive metric entropy, that is, positive Lyapunov exponents on a set of positive measure. In order to do that, one has to look at matrix-valued random variables and multiply them along an orbit of a dynamical system (in that case the motion of an electron in the magnetic dipole field of the earth). Exponential growth of that matrix product means “chaos”. How does one establish that? How robust is this notion? It turns out to be closely related to the cohomology question and quite subtle. Unlike in a commutative setting, where the random variables commute, in this non-commutative setting the limit is not just an expectation any more. Yes, one has a theorem, Oseledec’s theorem, but it only tells that the limiting growth rates exist; it does not tell how to compute them. The limit depends in general in a discontinuous way on the random variable. One knows that positive Lyapunov exponents imply the existence of a stable and an unstable “bundle” in the phase space. In the non-uniform situation, these bundles intersect. One can therefore exchange them using a small perturbation. See this paper, in which discontinuity of the Lyapunov exponent was demonstrated for the first time (under a fixed dynamical system; in probability theory, Kiefer got discontinuities earlier). The first open question at the end asks whether the Lyapunov exponent is continuous when one replaces the metric dynamical system (automorphism of a probability space) with a topological dynamical system (automorphism of a compact metric space). This is related to the “last theorem of Mañé”.
It seems now that part of that question is answered: Viana and Yang have recently shown that under some expansiveness condition, there are open sets of linear cocycles which are not uniformly hyperbolic and have positive Lyapunov exponents. I proved in 1991 that in a (much easier) measurable setting, one can perturb a cocycle arbitrarily little in the uniform topology to get zero Lyapunov exponents. The basic strategy is simple: just exchange the stable and unstable directions. This is however quite subtle and I don’t think that this difficulty had been seen before 1991: when you take a hyperbolic situation with a stable and an unstable direction, you have two copies of the probability space and the map is ergodic on both the stable and the unstable Ledrappier components. One would then like to destroy the hyperbolicity by just exchanging the branches when hitting a small set. This in general does not kill the Lyapunov exponent however: what is needed when doing this exchange is that the set on which it is done is not a coboundary! Fortunately there were enough results in ergodic theory available which allow one to prove that coboundaries are dense and that discontinuity happens. By the way, this is also true in a finite setting. If T is a cyclic permutation of a finite set, then a set is a coboundary if and only if it has an even number of elements. The cohomology group is in this finite case Z_2. One would expect that because of this, the cohomology group is finite or countable in the general aperiodic case but (definitely a bit surprisingly) the cohomology group is uncountable in the one-dimensional case (we will come to this remark of Greg Hjorth).
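The finite statement can be checked by brute force. The following is a minimal sketch, assuming the cyclic permutation is the shift T(x) = x+1 mod n on {0, ..., n-1} and reading A(T) as the preimage of A under T; the function names and the choice n = 6 are made up for this illustration.

```python
from itertools import combinations


def coboundary(A, n):
    """Return A(T) + A for the cyclic shift T(x) = x + 1 mod n.

    A(T) is the event {x : T(x) in A}, and + is the symmetric difference.
    """
    preimage = frozenset((x - 1) % n for x in A)
    return preimage ^ frozenset(A)


def all_subsets(n):
    """Enumerate all 2^n events of the finite space {0, ..., n-1}."""
    for k in range(n + 1):
        for c in combinations(range(n), k):
            yield frozenset(c)


def coboundaries(n):
    """The set of all events of the form A(T) + A."""
    return {coboundary(A, n) for A in all_subsets(n)}


def even_subsets(n):
    """All events with an even number of elements."""
    return {A for A in all_subsets(n) if len(A) % 2 == 0}


n = 6
cob = coboundaries(n)
# A set is a coboundary if and only if it is even.
is_even_criterion = (cob == even_subsets(n))
# Number of cohomology classes = number of events / number of coboundaries.
num_classes = 2**n // len(cob)
print(is_even_criterion, num_classes)
```

The quotient has two classes here, which is the finite cohomology group Z_2 mentioned above. The enumeration is exponential in n, so this only works for small n.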

In the two-dimensional case, one can look at “vector fields” F=(P,Q), where P,Q are elements in the Hilbert space. Given a function f, the gradient is (d_1 f, d_2 f), which we can just write as (f_x,f_y). Given a vector field F=(P,Q), one can look at curl(F) = Q_x - P_y. As in calculus, the curl of a gradient is zero. What is the first cohomology group? Which vector fields are irrotational but not gradients? It turns out that if the coefficient group is a compact Abelian group, then this first cohomology, for example H^1(X,T,S,Z_2), is trivial. From a cohomological point of view, in dimensions 2 or higher the space is contractible. This result follows from a theorem of Feldman and Moore from 1977, but using this result requires showing the equivalence of the “de Rham cohomology” and the “simplicial cohomology”. This equivalence was first established in the 2-dimensional case by the probabilist Jerome de Pauw, who proved it in his thesis. I generalized it to arbitrary dimension. So here is a theorem in the simplest case, where the functions are the elements in the \sigma-algebra of the probability space. (These events can be seen as Z_2-valued cocycles. As before, A+B is the symmetric difference of two events, the addition of Z_2-valued cocycles.)

Corollary of Feldman Moore: Consider a free ergodic Z^2 action generated by two commuting ergodic measure preserving invertible transformations (T,S) on a standard probability space X. If two events P,Q define a vector field F=(P,Q) which is irrotational, meaning curl(F) = Q(T)+Q + P(S)+P = 0, then F is a gradient field: there is an event f such that grad(f) = F, meaning (P,Q)= \nabla f = (f(T)+f,f(S)+f), where + is the symmetric difference in the \sigma-algebra.
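The easy converse direction, that the curl of a gradient is empty, can be checked on a toy model. The sketch below assumes a finite stand-in X = Z_N with two commuting shifts T and S; it is not a free ergodic action, so it illustrates only the identity curl(grad f) = 0 in symmetric-difference form and not the corollary itself. The values of N and the two step sizes are arbitrary choices for this illustration.

```python
import random

N = 101                    # hypothetical finite model: X = Z_N
T_STEP, S_STEP = 7, 30     # T(x) = x + 7, S(x) = x + 30 (mod N); they commute


def pull(E, step):
    """Event E(T) = {x : T(x) in E} for the shift T(x) = x + step mod N."""
    return frozenset((x - step) % N for x in E)


def d(E, step):
    """Z_2 derivative E(T) + E, where + is the symmetric difference."""
    return pull(E, step) ^ E


def grad(f):
    """Gradient (f(T) + f, f(S) + f) of an event f."""
    return d(f, T_STEP), d(f, S_STEP)


def curl(P, Q):
    """curl(F) = Q_x - P_y; over Z_2 this reads Q(T) + Q + P(S) + P."""
    return d(Q, T_STEP) ^ d(P, S_STEP)


rng = random.Random(0)
f = frozenset(x for x in range(N) if rng.random() < 0.5)
P, Q = grad(f)
curl_of_gradient = curl(P, Q)          # empty, because T and S commute

# For comparison: a field with non-empty curl, which cannot be a gradient.
curl_of_other = curl(frozenset({0}), frozenset())
print(len(curl_of_gradient), sorted(curl_of_other))
```

The cancellation uses only that TS = ST and that A + A is empty, which is the same computation as in the measure-theoretic setting.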

It is useful to compare the situation to differential topology. The equivalence of de Rham cohomology (which uses Euclidean product structures) and simplicial cohomology (which does not and relies only on a triangulation) is in topology a result of de Rham. Like Hopf, de Rham is a favorite of mine, as he produced useful but still simple and intelligible theorems. [De Rham is also “close” as he used to hike and climb in the same part of Switzerland I like. Here is a reportage following the footsteps of Jean Piaget (another hero of mine) and de Rham. By the way, there are some nice mountaineering stories of John Milnor, where he also describes a climb of the Baltschieder Stockhorn with de Rham. (It is quite a serious climb. I did it once, guided by Egon Feller, who is a professional climber.)]

One can see this as a result telling that space is simply connected. There are no closed non-contractible loops in this ergodic universe. The result is however non-constructive. Even in very simple examples, like when X is the circle, T(x)=mod(x+a,1) and S(x) = mod(x+b,1), and P = [u,v], Q=[p,q] are two intervals, the event f is a measurable subset of the circle but we have no idea what it looks like. It is most likely not a countable union of intervals. The result is surprising since the zero’th cohomology group H^0(X,T,Z_2) is uncountable. Such questions were first studied in the context of representation theory, like the representation of the Heisenberg group, the group of upper triangular 3×3 matrices with 1 on the diagonal. In simple cases of irrational rotations it leads to tough questions in Fourier theory. In general, even the question of cardinality is difficult and needs quite serious descriptive set theory. The following result was shown to me by Greg Hjorth at Caltech. Greg passed away much too early in 2011. He was a student of Hugh Woodin. See the obituary published at UCLA.

Special case of an observation of Greg Hjorth which answers the question about the cardinality of the zero’th cohomology group:

If X is a Lebesgue space equipped with an ergodic aperiodic map T, call two events A,B equivalent, if A=B+f(T)+f, with a third event f.

The set of equivalence classes of events is uncountable.

The proof of Greg relies heavily on descriptive set theory, a topic for which the “bible” has been written by Alex Kechris, who was at that time the mentor of Greg.

For more details, see this paper, which was finished in 1999 and last updated in 2000. I had submitted that paper to the journal “ETDS” and got in 2000 the response “The reviewer was not able to understand the paper”. Yes, the paper was definitely not polished, but the referee seemed to have been pressed for time or was simply not interested. I still think it is an important result, as it is essentially the de Rham theorem in an ergodic setup. [That time was rather turbulent for me. I had finished up my postdoc grant at the University of Texas at Austin and believed in the spring of 1999 (under time pressure) to have a nice result on the entropy of the Standard map, which then did not work out. I was then in 1999 on the job market (also outside academia), where it was difficult both for visa reasons and of course due to my lack of experience outside academia. As I had a fellowship from Switzerland (the Swiss National Science Foundation had sponsored my research in Austin), I only had a J1 visa, which needed to be changed to an H1 visa. The burst of the dot-com bubble was then already feared or predicted by folks who had insight. The bubble was still hot in 1999 but tech companies already began to be cautious about hiring; a few months later, the bubble burst. I remember standing in long lines for jobs. At one point I got hired by one branch of an HR department but was told by another that I could not work for visa reasons. It was still difficult also to get a visa for teaching at Harvard. The entire family had to drive to Ciudad Juarez to apply at a US consulate for an H1 visa. This visa was then, after some trouble (the 9/11 events made the INS grind to a halt for some time), turned into a green card and then, in 2010, into citizenship.]

So, why is it nice to have a “probability space of spaces” rather than one space? The reason is symmetry. We can implement much more symmetry on a probability space. Let us assume for example that our probability space X is X=SU(2) equipped with the normalized Haar measure m. This is a nice Lebesgue space which carries a lot of measure preserving Z^d actions. [Lebesgue spaces are as a category rather boring: there is only one species. They are all equivalent. But this boredom is also a strength. It is a uniqueness result which makes the object natural. On an ergodic level, we don’t have to worry about any continuity or smoothness constraints.]
We could also take SU(2) x SU(2) and get naturally to the Lorentz group. While the lattice Z^d does not have rotational symmetry, inside the probability space of all these lattices we have an obvious SU(2) symmetry. That SU(2) x SU(2) is related to the Lorentz group is explained well in this CERN summer course of James Wells [MP4]. By the way, these lectures of Wells are well done. They stress how elementary particles can be understood as representations of the Lorentz group.

While the lattice itself does not even have rotational symmetry, the probability space features these symmetries. In other words, we have both discreteness and a continuum symmetry. Since the Lorentz symmetry group is implemented, we have a discrete setting and still Lorentz invariance. Together with Bert Hof, and on another occasion with Evan Reed, also at Caltech, I once explored the idea of studying cellular automata and lattice gas particle systems as well as Vlasov type particle systems in an ergodic setup. What is nice about ergodic theory is that one can study infinite, aperiodic situations with finite resources. Especially nice is the almost periodic case, as it is an “integrable situation” where what holds for “almost all” cases often (but not always) holds for “all” cases. The theory of almost periodic Schrödinger operators is a good example where one can see how the ergodic setup allows one to “smooth out” difficulties which appear in deterministic settings.

Here is how to get infinite configurations: take for example two irrational, rationally independent numbers \alpha,\beta, the corresponding commuting rotations T_1(x) = x+\alpha \; mod 1 and T_2(x) = x+\beta \; mod 1, and an interval I=[a,b] on the circle. Now we get for every element \theta on the circle an initial condition (n,m) \to 1_{I}(T_1^n T_2^m \theta). The point is that one can now study the game of life (or any other automaton) in an almost periodic setting. One just has to keep track of the intervals on the circle which describe the situation. Almost all configurations have a well defined density and other thermodynamic quantities without taking limits! The Birkhoff ergodic theorem assures that the spatial limit is the same as, say, just the total length of the intervals. With Evan Reed, we looked at Vlasov systems and Riemannian geometry, where infinite particle configurations or infinite Riemannian manifolds have well defined macroscopic quantities, as the total integral is a Bohr limit. It is beautiful, as one does not have to assume some kind of unnatural asymptotic flatness for example.
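The density statement can be tried out numerically. The following is a minimal sketch, assuming \alpha = \sqrt{2} and \beta = \sqrt{3} (mod 1), the interval I = [0.2, 0.5] and a starting point \theta = 0.1234; all these values and the function names are arbitrary choices for this illustration. It compares the fraction of occupied cells in a finite box of the lattice with the length b-a of the interval.

```python
import math

ALPHA = math.sqrt(2) % 1.0   # assumed rotation angle in the first direction
BETA = math.sqrt(3) % 1.0    # assumed rotation angle in the second direction


def cell(theta, n, m, a, b):
    """State of lattice site (n, m): 1 if theta + n*ALPHA + m*BETA mod 1 is in [a, b]."""
    x = (theta + n * ALPHA + m * BETA) % 1.0
    return 1 if a <= x <= b else 0


def box_density(theta, a, b, size):
    """Fraction of occupied cells in the box 0 <= n, m < size."""
    total = sum(cell(theta, n, m, a, b)
                for n in range(size) for m in range(size))
    return total / (size * size)


a, b = 0.2, 0.5
theta = 0.1234
density = box_density(theta, a, b, 200)
error = abs(density - (b - a))
print(density, error)
```

The box average is only a finite approximation of the spatial limit; the point of the ergodic setup is that the limiting value b-a is known in advance from the interval, without simulating the infinite configuration.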

Assumptions like asymptotic flatness in a Riemannian setting are often taken for technical reasons, as one wants to quantify certain things like total curvature or mass etc. But asymptotic flatness is fundamentally as flawed as the picture of our planet being a disk. Asymptotic flatness is an “observer centric” point of view. If you have an asymptotically flat space, then with sufficiently fast decay of the energy momentum density, one can assign a center of mass. Much more natural are assumptions of having a compact space or space-time with no boundary, but these are speculations appearing in popular books like “A Brief History of Time”, where Hawking speculates about the universe being a 4-dimensional sphere. But there is little evidence, as we don’t know the fate of the universe. And as long as basic problems like the nature of “dark matter” are not resolved, we could be as clueless and naive as the painters of the mappa mundi in the Middle Ages:

Ebstorfer Stich, example of a Mappa Mundi, a medieval European map of the world

Looking at “ergodic boundary conditions” can be useful beyond just having the ability to compute expectations of observables in non-compact worlds. It can give an explanation of transport even in a discrete world. How can one hop from one point to another? As I had demonstrated in my PhD thesis, this can be described using Bäcklund transformations or isospectral flows: the step from x to T(x) can be interpolated by an isospectral deformation. The symmetry of a system can be implemented as isospectral deformations, which are deformations of the algebra. One can go from one node in the discrete space to another by passing through the probability space of isospectral operators. This is a quantum mechanical picture. The discreteness of space is only an illusion. We can travel as waves from one point to another. Ergodic theory allows for tunneling within a space of Hamiltonians which are isospectral. In quantum mechanics, the tunneling of waves is of course much easier: it is just linear algebra to find for an initial wave a velocity so that the wave is at a neighboring point after some time. Note that this does not allow one to deform the operator from one node to the other since, in quantum mechanics, the unitary evolution of the wave commutes with the operator itself. One can do this differently using an isospectral Lax deformation of the operator, in which case the evolution does still commute with the Hamiltonian but not with the Dirac operator. The Dirac setup is intriguing as it merges wave mechanics for real waves with Schrödinger dynamics for complex waves: real waves have position and velocity, while in quantum mechanics waves are complex valued.

Another place where a Z^d action naturally appears is in the limit of Barycentric refinement. There is no assumption on the underlying system. An almost periodic system emerges. In the one-dimensional case, it is the dyadic group of integers, a compact topological space. So, even if one is interested in finite discrete spaces (as I am now), one has to consider the Barycentric limit, which is an ergodic theoretical framework.