Polygonal billiards

Polygonal billiards are easily-described dynamical systems, which are defined by the trajectories of a single particle in a polygonal region P of the plane by requiring the particle move in straight lines at constant velocity in the interior of P, and reflect off the boundary according to the familiar (“optical”) laws of reflection: the angle of incidence should equal to the angle of reflection.

(What happens if the particle hits a corner is undefined, but the set of trajectories for which this happens is vanishingly small, i.e. has zero measure.)

We can ask the usual questions that dynamicists do when they probe the behavior of their systems: what do the orbits of this system look like? Are they periodic / closed, or dense? How does that depend on the initial data (i.e. position and direction)? What does the generic orbit look like? How many closed orbits are there?  Is the system ergodic? Mixing? If so, at what rate? How do the answers to all of these questions depend on P?

The great realm of irrational ignorance

Answering these questions can get fiendishly tricky surprisingly fast. For instance, very little is known in the case where the polygon P has angles which are not rational multiples of \pi.

Even more narrowly, and somewhat astonishingly, it is still an open question to determine if billiards on a general triangular P has a closed orbit. When P is acute, the triangular path formed by the line segments between the feet of the altitudes can be shown to be a billiards path (as demonstrated by Fagnano more than 240 years ago); when P is a right-angled there is a similarly explicit construction. What happens when P is obtuse is, still, anybody’s guess—although Rich Schwartz has at least partial, computed-assisted results.

Rational enlightenment via translation surfaces

By contrast, when the angles of P are all rational multiples of \pi, there is great deal that can be said, using tools from such varied fields as Riemann surfaces, Teichmüller theory, hyperbolic geometry, and even algebraic geometry.

What allows us to start applying all of these diverse tools is a relatively simple device, the translation surface associated to the billiard system, which might be seen as akin to development maps for (G,X)-structures—the rough idea in both cases being to unroll transitions until we see everything at once on a single object, or, in slightly more technical language, to globalize the coordinate charts.

We can define translation surfaces more generally, as maximal atlases of coordinate charts into the Euclidean plane \mathbb{R}^2, with finitely many cone singularities and where the transition maps between coordinate charts are translations in \mathbb{R}^2. Note that we can obtain the genus from information about the cone singularities via (a discrete version of) Gauss-Bonnet.

We may equivalently define translation surfaces—although to show the equivalence takes a little work—as surfaces built from some finite collection of polygons embedded in the Euclidean plane with a distinguished direction (and so inheriting an Euclidean metric and a distinguished direction—by gluing maps between their sides which are translations.

For a translation surface coming from a billiard table, we may take of the polygons to be congruent. In this sense these translation surfaces have additional symmetries that we should not expect a generic translation surface to have, and will likely be rather special in this sense (although there might conceivably be some sort of rigidity result lurking somewhere.)

Closed orbits: Masur’s theorem

The more general point of view of translation surfaces allows us more freedom of argument in proving things about our original, more restricted context.

For instance: a theorem of Masur states that periodic orbits exist on every translation surface; in fact, there are infinitely many. In fact, he proves even more: the number of periodic orbits of length N grows quadratically in N. If our translation surface did in fact come from a polygonal billiard system, these periodic orbits project down to closed trajectories.

The proof uses the idea of “changing the translation structure while preserving the affine structure”, as J. Smillie’s survey describes it:

Closed orbits can be detected geometrically via the presence of cylinders—subsets of our translation surface isometric to S^1 \times I. If we fix the genus, area, and singularity data for our surface, the translation surface can only have large diameter if it contains a (long) cylinder. In particular, if our translation surface had area greater than some universal constant D, we have our cylinder, and hence a closed orbit.

Otherwise, we observe that we have some freedom to change the translation structure, i.e. the coordinate charts and translations involved in the transitions—without changing the resulting affine geometry on the translation surface. Such a change takes cylinders to cylinders, but changes the metric geometry, and in particular the diameter. We can argue more carefully to find that there is always some such change of translation structure which takes the diameter to or beyond our universal constant D; and so the general case in fact reduces to our earlier, easier one.

(With a lot more care, we may also obtain the quadratic asymptotics stated above.)

In effect, the translation surface point of view tells us that polygonal billiards on families of polygons which produce the same translation surface have similar dynamical / geometric behavior—something which was not at all obvious just looking at the polygons themselves.

Ergodicity of the billiard

A dynamical system (X,T,\mu) is said to be ergodic (w.r.t. the given measure \mu) if any T-invariant subset of X has either zero or full measure w.r.t. \mu. Ergodic systems are, in some precise sense via the ergodic decomposition, the building blocks of all dynamical systems.

An example of an ergodic system is given by polygonal billiards on a rectangle, which is in fact equivalent to the geodesic flow on a flat torus. In fact, the flow / billiard trajectories in almost every direction are ergodic, and moreover equidistribute (this can then be shown to imply unique ergodicity—see below); the non-ergodic directions are precisely those with rational slopes, and are all periodic.

This especially nice state of affairs is a prototypical example of Veech dichotomy, which states that every direction for the constant-slope flow is either uniquely ergodic, or periodic. The former often form a small subset, but what “often” and “small” here entail precisely is still not entirely pinned down.

It follows from classical results that integrable polygons (whose corresponding translation surfaces are tori) satisfy Veech dichotomy. Gutkin proved that “almost-integrable” polygons also satisfy the dichotomy. Veech, in 1989, that all lattice surfaces (now also called Veech surfaces) satisfy the dicohotomy. Here, a lattice surface is one whose group of orientation-preserving affine automorphisms, or rather the image thereof under taking the derivative, forms a lattice in \mathrm{SL}(2,\mathbb{R}). All the previous examples are Veech surfaces, as well as those corresponding to regular n-gons, as well as certain triangles.

This is still not the most general class of translation surface satisfying Veech dichotomy, however: Smillie and Weiss proved in 2008 that there are non-lattice surfaces satisfying Veech dichotomy.

This result of Veech, plus the difficulty of the problem in its full generality, seems to have led to a focus on studying the ergodicity of the billiard flow in a fixed direction.

Unique ergodicity and minimality

(X,T,\mu) is uniquely ergodic if \mu is the only invariant probability measure—in this case, since any invariant measure is a convex linear combination of ergodic measures, \mu is necessarily ergodic.

The Bunimovich stadium is an example of a dynamical system—a planar billiard, in fact, although not polygonal—which is known to be ergodic but not uniquely ergodic.

Since the systems we are working with are naturally equipped with both a measure and a topology, we can also ask about minimality (a topological analogue of ergodicity—a minimal system has no proper closed T-invariant subsets), and how it relates to ergodicity.

It can be proven that a flow direction is minimal if there are no saddle connections (i.e. geodesic segments starting and ending at vertices, but not passing through any vertices in their interior) in that direction. The proof starts with the observation that it is useful to consider the first return map to a transverse interval in the surface. This turns out to be an interval exchange transformation (IET.) IETs are a well-studied class of dynamical systems, and criteria for the minimality of IETs lead to criteria for the minimality of directional flows, including this particular one.

From the combined / somewhat muddled viewpoint of dynamics both measurable and topological, minimal ergodic directions are especially nice, and non-minimal directions we might attempt to deal with using some sort of induction / topological reduction; so we might hope that there aren’t any minimal non-ergodic directions to deal with. It is a straightforward corollary of Veech dichotomy that Veech surfaces (or more generally surfaces which satisfy the dichotomy) do not have such directions.

However, Masur has produced examples of minimal non-ergodic directions for the geodesic flow on a translation surface of genus 2, by considering a translation structure given by a rectangular billiards table with two slits in the interior; these can be generalized to give (uncountably many) examples of minimal non-ergodic directions on any translation surface of genus 2 or greater.

Enter the Teichmüller geodesic

However, it can be proven that set of such directions has Lebesgue measure zero; in fact it can be proved that rational billiards are uniquely ergodic in almost every direction .

The key step is a result of Masur that if (X, \omega) is a translation surface for which flow in the vertical direction is not uniquely ergodic, then the Teichmüller geodesic associated to (X, \omega) is divergent, i.e. it eventually leaves every compact set in the moduli space.

Here the Teichmüller geodesic associated to (X, \omega) is the flowline of (X,\omega) under the diagonal subgroup \left\{ \left( \begin{array}{cc} e^t \\ & e^{-t} \end{array} \right) : t \in \mathbb{R} \right\} of \mathrm{SL}(2,\mathbb{R}) (the Teichmüller geodesic flow, for it really does correspond to flowing along geodesics in Teichmüller space), projected to moduli space.

There has been further work to figure out just “how big” the set of non-ergodic directions is, e.g. in terms of Hausdorff dimension.

Aside: smooth non-polygonal billiards

From a slightly different perspective, billiard trajectories are akin to geodesic flow trajectories in the plane. With this in mind, perhaps the following result is not too surprising:

Any smooth surface in 3-space may be flattened to obtain something close to a smooth (not necessarily polygonal) billiard table in 2-space. Kourganoff showed that, under some mild hypotheses, the geodesic flow of this surface converges locally uniformly to the billiard flow. Moreover, if the billiard is dispersive (i.e. any two distinct points have neighborhoods whose orbits are eventually separated) and has finite horizon (i.e. time between collisions remains bounded), then the geodesic flow of the corresponding surface is Anosov. This result can be applied to the theory of mechanical linkages and their dynamics, to provide e.g. novel examples of simple linkages whose physical behavior is Anosov.


Groups and graphs (and computation)

Graphs in group theory

Given a group G, or more specifically a finite set S of generators for G, one can form the Cayley graph \mathrm{Cay}(G,S) by taking the vertex set to be the set of group elements, and adding an edge vw iff v = ws for some s \in S.

One can then attempt to study algebraic properties of the group by looking at properties of (one of) its Cayley graph(s). In the case of an infinite group G, one can show that any two Cayley graphs, i.e. the Cayley graphs associated to any two sets of generators, have the same coarse geometry (in the sense of quasi-isometry—see the post on geometric group theory), so that the coarse geometric properties of any Cayley graph can be considered to be intrinsic geometric properties of the group, rather than properties of any particular presentation thereof. Indeed, this is the key insight that drives geometric group theory.

One particularly striking example where such geometric methods seem to yield great insight with much less pain than directly algebraic ones is in the case of free groups and their automorphism groups, which may be thought of geometrically as fundamental groups of bouquets of circles. This is equivalent to the Cayley graph viewpoint: the Cayley graph is the universal cover of the bouquet of circles. Automorphisms of the free group then correspond to homotopy equivalences of the bouquet of circles, and arguments about the automorphism group can be conducted in terms of these homotopy equivalences, leading to such results as classifications and characterizations of these automorphisms analogous to the classification of mapping classes of surfaces.

The same arguments apply to finite groups, although in this case the graphs are finite, and so quasi-equivalent to trivial graphs (i.e. the graph with one vertex and no edges), and geometric group theory has nothing more to say. From this point of view, finite groups are, in a very precise sense, virtually trivial. I do wonder though: can the theory of finite graphs, itself a rather rich area of combinatorics, be fruitfully used to say / argue interesting things about finite groups?

Graphs also appear in group theory in the notion of graphs of groups which is fundamental to the Bass-Serre theory of groups acting on simplicial trees. Here the graph acts essentially as a book-keeping mechanism to record the structure of the group as built up from vertex and edge stabilisers; these graphs have also been described as “one-dimensional analogues of orbifolds”. Group actions on trees appear in a variety of settings, for fundamental reasons I do not yet fully understand, and so this notion takes on some importance.

Groups in graph theory

Conversely, group theory can play a role in elucidating the structure theory of graphs. For instance, group theory played a significant role in Laci Babai’s recent breakthrough on graph isomorphism. Very roughly speaking, Babai’s algorithm relies on the heuristic that any isomorphism between graphs has to preserve degree. Thus, given a highly irregular graph, a brute-force search could actually be (sufficiently) efficient, since the number of possible isomorphisms is small relative to the number of vertices.

The problem, then, is if we have highly regular graphs. One way around this is to consider an equitable partition; but one may not always have one of these either. However, Babai proved that the only obstacle to the existence of such a partition is the existence of a large Johnson graph, and then used subtle structural results, of an essentially algebraic nature, on Johnson groups and how they sit inside larger (finite) permutation groups to obtain, constructively, either an isomorphism, or a certificate of non-isomorphism (a provably empty automorphism group.)

In the language of Babai’s preprint: “We shall certify both the presence and the absence of local symmetry. … If we don’t find local symmetry, we infer global irregularity; if we do find local symmetry, we either build it up to global symmetry (find automorphisms) or use any obstacles to this attempt to infer local irregularity which we then synthesize to global irregularity. Building up symmetry means approximating the automorphism group from below; when we discover irregularity,we turn that into an effective upper bound on the automorphism group . When the lower and the upper bounds on the automorphism group meet, we have found the automorphism group and solved the isomorphism problem.”

Here the group theory plays a key role in introducing more rigidity / structure to an otherwise fairly flexible and fluid structure. The structural theory of finite groups is used in a fairly strong way: indeed, the proof relies on the classification of finite simple groups, although the strict logical dependence is only on one of its corollaries (Schreier’s statement that the outer automorphism group of a finite simple group is solvable.)

A concrete application to coding theory

Group theory and graph theory both come into play in e.g. the construction and study of codes from Cayley graphs and Schreier graphs: the algebraic structure introduced by the group-theoretic construction of these graphs is very helpful in proving key desired properties of these codes.


Projective representations of the Thurston geometries

homogeneous geometry is a pair (X, G) where X is a simply-connected Riemannian manifold and G is a maximal group of isometries of X with compact point stabilizers.

There are eight 3-dimensional homogeneous geometries, often called the Thurston geometries:

  • Euclidean 3-space \mathbb{E}^3, hyperbolic 3-space \mathbb{H}^3, and the 3-sphere S^3;
  • the product geometries S^2 \times \mathbb{R} and \mathbb{H}^2 \times \mathbb{R}; and
  • the more exotic fibered geometries \widetilde{\mathrm{SL}_2\mathbb{R}}Nil, and Sol.

The goal of this post is to show how all of these geometries can be modeled as projective geometries: more precisely, for any of these eight geometries (X, G), we can find some open domain \Omega \subset \mathbb{RP}^3 such that \Omega \cong X (possibly up to index 2), and a subgroup \Gamma \leq \mathrm{PSL}(4,\mathbb{R})= \mathrm{Aut}(\mathbb{RP}^3) such that \Gamma \cong G (again, possibly up to index 2), and \Gamma is the maximal subgroup of \mathrm{PSL}(4,\mathbb{R}) which preserves \Omega.

In other words, up to finite index and coverings, each of these geometries embeds (or, if we eliminate the preceding qualification, locally embeds) into 3-dimensional projective geometry.

I will follow the approach of Emil Molnár’s 1997 paper, which, for some reason, I seem to have great difficulty reading. Hopefully the process of writing this post / reading the article to reprocess the details for this purpose will help.

To describe the goal of this post somewhat more precisely: I wish to exhibit examples of such projective representations, or rather sketch semi-impressionistic outlines for how to obtain them. Readers in need of more (concrete, computational) details should consult Molnár paper; this post will hopefully provide a useful broad guide to the paper, but is not meant to be a substitute.

The general setup: projective space and symmetric bilinear forms

We recall the definition of \mathbb{RP}^3 as the projectivization of a 4-dimensional vector space V_4, i.e. \mathbb{RP}^3 := V_4 / \sim where we identify vectors in V_4 which are related by (real) scalar multiplication.

We also recall that \mathrm{PSL}(4,\mathbb{R}) = \mathrm{PGL}(4,\mathbb{R}) is in fact the full group of collineations in this case, since the Galois group \mathrm{Gal}(\mathbb{R} / \mathbb{Q}) is trivial.

The general game now is to consider symmetric bilinear forms \langle \cdot , \cdot \rangle on V_4, of varying signatures, and possibly preserving certain fiberings in the case of the product / fibered geometries; then we take \Omega to be some sort of invariant set w.r.t. \langle \cdot , \cdot \rangle, and \Gamma to be the orthogonal (sub)group w.r.t. \langle \cdot , \cdot \rangle (in \mathrm{PGL}(4,\mathbb{R}).)

We note that Molnár’s paper sets up a bunch of additional notation, as well as the notion of a polarity (which is equivalent to the bilinear form, via the dual vector space.) It appears that this is used to perform more explicit computations and more generally characterize projective representations of the various Thurston geometries, but are not strictly necessary for our purposes.

The non-fibered geometries

Spherical geometry can be represented by taking a (positive- or negative-) definite bilinear form. Then the isotropy group \Gamma to be [a projectivization of] the orthogonal group \mathrm{O}(4), and we may take \Omega to be all of \mathbb{RP}^3, or even better all of its better cover S^3.

Hyperbolic geometry can be represented by considering the hyperboloid model: take a Lorentz form, i.e. a form \langle \cdot, \cdot \rangle with signature (-+++), let \Omega be the projectivization of \{p \in V_4 : \langle p, p \rangle < 0\}; then the isotropy group is isomorphic to (the appropriate finite-index subgroup of) \mathrm{SO}(1,3).

(3-dimensional) Euclidean geometry may be represented by considering any affine chart, and the corresponding affine group sitting inside \mathrm{O}(4); the bilinear form in this case is degenerate, with signature (0+++).

The need for additional constraints

For \widetilde{\mathrm{SL}_2\mathbb{R}}, we start with a bilinear form of signature (–++), but observe (after some work) that the point stabilizers under the orthogonal group are not compact.

We would now like to impose additional structure on our bilinear form, coming from the structure of our homogeneous geometry \widetilde{\mathrm{SL}_2\mathbb{R}}, which will whittle down the orthogonal group to a more manageable size.

This additional structure comes in the form of a non-trivial line bundle structure on the unit sphere \Omega for our bilinear form, with base space a hyperboloid \mathcal{H} and fibres given the skew lines given by the S-orbits of \Omega, where S = \left\{ \left( \begin{array}{cc} R_\theta \\ & R_{-\theta} \end{array} \right) : \theta \in \mathbb{R} \right\} \cong \mathrm{SO}(2) (R_\theta being the rotation matrix \left( \begin{array}{cc} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{array} \right) \in \mathrm{SO}(2)) is what Molnár calls the screw collineation group.

We observe that \tilde{S} \cong \mathbb{R} acts freely on the base space \mathcal{H}, and that we may describe the unit sphere \Omega as the universal cover of the unit tangent bundle T^1\mathcal{H}.

The collineation group \Gamma we seek is then given by elements of our orthogonal group which also preserve our line bundle; Molnár has a very concrete description of this group, in suitable (standard) bases, as

\left\{ \left( \begin{array}{cc} \cosh r R_\psi & \sinh r R_\alpha \\ \sinh r R^*_{\alpha-\omega} & \cosh r R^*_{\psi+\omega} \end{array} \right), \left( \begin{array}{cc} \cosh r R^*_\psi & \sinh r R^*_\alpha \\ \sinh r R_{\alpha-\omega} & \cosh r R_{\psi+\omega} \end{array} \right) : \psi, \omega, \alpha \in \mathbb{R},  r \geq 0 \right\}

where R_\theta is the rotation matrix \left( \begin{array}{cc} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{array} \right) and R^*_\theta is the twisted rotation matrix \left( \begin{array}{cc} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{array} \right)

The product geometries

We adapt this approach, in an intuitively rather more straightforward manner, for the product geometries S^2 \times \mathbb{R} and \mathbb{H}^2 \times \mathbb{R}.

For S^2 \times \mathbb{R} we again consider a bilinear form with signature (0+++), but now take as \Omega no longer an entire affine chart on which the form is non-degenerate, but rather a punctured affine chart \mathbb{A}^3 \setminus \{O\}, where O is a fixed origin, and also equip the resulting punctured affine space with a line bundle—or really, a product structure—, with base space homeomorphic to the 2-sphere, and fibres given by open half-rays pointing outwards from O.

This gives us a distinguished family of 2-spheres (those with centre O); the group \Gamma is then, as before, the subgroup of the orthogonal group preserving this line bundle structure. It is generated by the \mathrm{SO}(2)-subgroup of isometries of the unit 2-sphere, dilatations in the \mathrm{R} direction, and inversions in our distinguished family of 2-spheres.

Analogously, for \mathbb{H}^2 \times \mathbb{R}, we consider a bilinear form with signature (0-++), and let \Omega be the subspace (in the topological, not vector space sense) of a maximal affine chart on which the form is non-degenerate given by a punctured hyperboloid cone \{p \in \mathbb{A}^3 : \langle p, p \rangle < 0\} \setminus \{O\} where again O is a fixed origin in the affine chart. Analogous to the previous case, equip \Omega with a line bundle (product structure), with base space homeomorphic to the hyperboloid, and fibres given by open half-rays in pointing outwards from O. 

We may then obtain \Gamma as the subgroup of the orthogonal group preserving the product structure; it is generated, analogously to the previous case, by the \mathrm{SO}(1,2)-subgroup of isometries of the hyperbolic 2-space, dilatations in the \mathrm{R} direction, and inversions in the distinguished family of hyperboloids coming from the line bundle structure.

Sol and Nil

For Sol, we again start with a bilinear form with signature (0-++). Now (and for Nil) we do not have any (Cartesian) product structure; we take as \Omega the entirety of an affine chart, corresponding to a maximal subspace on which our form is non-degenerate, and attempt to impose additional structure on this \Omega.

For Sol this additional structure comes in the form of a (trivial?) parallel plane fibering; the collineation group is then generated by a plane reflection, and a half-turn, and the elements \left\{ \left( \begin{array}{cccc} 1 & a & b & c \\ 0 & e^c & 0 & 0 \\ 0 & 0 & e^{-c} & 0 \\ 0 & 0 & 0 & 1 \end{array} \right): a, b, c \in \mathbb{R} \right\}.

For Nil we start with a bilinear form of signature (000+), and take as \Omega an affine chart corresponding to a maximal affine chart on which our form is degenerate. There is an additional structure on this affine 3-space which takes the form of a line bundle. To describe this line bundle, we recall the description of Nil as a Heisenberg group; the base spaces of this fibration are then the horizontal levels \left\{ \left( \begin{array}{ccc} 1 & a & c \\ & 1 & b \\ & & 1 \end{array} \right) : a, b \in \mathbb{R} \right\} of constant height c, and the fibres the “vertical” lines \left\{ \left( \begin{array}{ccc} 1 & a & c \\ & 1 & b \\ & & 1 \end{array} \right) : c \in \mathbb{R} \right\}.

\Gamma is then given by (a suitable representation of) the Heisenberg group extended by pointwise stabilizers (isomorphic to a finite extension of \mathrm{SO}(2)); Molnár describes it explicitly as

\left\{ \left( \begin{array}{cccc} 1 & x & y & z \\ & \cos\omega & \sin\omega & x \sin\omega \\ & \mp\sin\omega & \pm\cos\omega & \pm x\cos\omega \\ & & & \pm 1 \end{array}\right) : (x,y,z) \in \mathbb{R}^3 \setminus \{(0,0,0)\}, \omega \in \mathbb{R} \right\}.

(in a suitably-chosen basis.)