The Eskin-Mirzakhani-Mohammadi Magic Wand

Structure of orbits: a geometric Ratner’s theorem?

The ergodicity of the \mathrm{SL}(2,\mathbb{R}) action on the moduli space \mathcal{H} of translation surfaces (= moduli space of Abelian differentials, under the identification we made earlier) allows us to understand generic orbits, but not arbitrary orbits. For example, the family of flat surfaces corresponding to a fixed rational polygonal billiard forms a positive-codimension subspace, about which ergodicity allows us to say nothing.

There are, however, results in dynamics / ergodic theory which classify not just almost all, but all orbits, the prototypical example being Ratner’s Theorem(s) on unipotent flows:

Theorem/s (Ratner) Let G be a connected Lie group and U be a connected subgroup generated by unipotents. Then

  • for any lattice \Gamma \subset G and any x \in G / \Gamma, the closure of the orbit Ux \subset G / \Gamma is an orbit of a closed algebraic subgroup of G;
  • every ergodic invariant probability measure is homogeneous;
  • every unipotent orbit is equidistributed in its closure.

A basic example is given by a horocycle flow on a hyperbolic manifold. These are ergodic, and so we know that almost every orbit is dense; but Ratner’s theorem tells us that in fact we have a strict dichotomy: every orbit is either closed or dense.

The hope here is for a similar result. One precise formulation is the following:

Conjecture (“Magic Wand”) The closure of a \mathrm{SL}(2,\mathbb{R})-orbit of any flat surface is a complex-algebraic suborbifold. (By a theorem of Kontsevich, any \mathrm{SL}(2,\mathbb{R})-invariant complex suborbifold is represented by an affine subspace in cohomological period coordinates.)

Aspects of Teichmüller theory

Recall that we have identified \mathcal{H} as a space of pairs (complex structure, holomorphic 1-form). Recalling some of the plumbings of Teichmüller theory, we consider also the space \mathcal{Q} of pairs (complex structure, holomorphic quadratic differential), and identify it with the cotangent bundle to the moduli space \mathcal{M} of complex structures. \mathcal{H} can be identified with a subspace of \mathcal{Q} consisting of those quadratic differentials which can be represented as global squares of holomorphic 1-forms.

This subspace may be considered as a “unit cotangent bundle”, being invariant under the Teichmüller geodesic flow (i.e. the diagonal subgroup action induced by the \mathrm{SL}(2,\mathbb{R}) action on \mathcal{H}.)

We may check that \mathrm{SL}(2,\mathbb{R}) orbits in (the image of) \mathcal{H} in \mathcal{Q} descend to isometric maps of \mathrm{SL}(2,\mathbb{R}) / \mathrm{SO}(2) \cong \mathbb{H}^2 to \mathcal{M}; i.e. the projections of these orbits are Teichmüller discs, also known as complex geodesics.

Complex geodesics may be described more directly in terms of the language of flat surfaces as follows: recall \mathrm{SL}(2,\mathbb{R}) orbits in \mathcal{H} correspond to translation surfaces with a distinguished direction, encoded by the holomorphic 1-form; the \mathrm{SL}(2,\mathbb{R}) action changes the translation structure, i.e. the fundamental polygon, but not the affine structure, i.e. the resulting translation surface. Then we obtain a complex geodesic by forgetting the 1-form, i.e. forgetting the distinguished direction.

The classification of orbits is then closely related to the classification of these complex geodesics, which allows us to potentially use (even more) language and tools from Teichmüller theory.

“Revolution in genus 2”

Kontsevich-Zorich classified the connected components of the strata of the moduli space using spin structures and hyperellipticity. In genus 2, the two strata \mathcal{H}(2) and \mathcal{H}(1,1) are each connected and consist entirely of hyperelliptic surfaces.

Smillie proved that closed \mathrm{SL}(2,\mathbb{R})-orbits (orbits of flat surfaces which are in some sense exceptionally symmetric) correspond exactly to orbits of Veech surfaces (see first post on polygonal billiards for a description of Veech surfaces.) The identification of closed orbits thus reduces to (or, at any rate, is equivalent to) the classification of Veech surfaces, about which some things, but not very many, are known.

McMullen proved that there is (up to ramified coverings) only one Veech surface in the stratum \mathcal{H}(1,1), given by the regular decagon with identified opposite sides.

Calta and McMullen, using different methods, described all Veech surfaces in \mathcal{H}(2)—there is a countable family even up to ramified coverings—and gave efficient algorithms to recognize and classify these.

They also describe invariant submanifolds of intermediate dimension: smaller than the full stratum but larger than single closed orbits.

Finally, McMullen shows, using all of this, plus more subtle tools, that our “magic wand” conjecture is true in genus 2; the classification is in fact rather more precise, and he also obtains results about invariant measures in the spirit of Ratner’s theorems.

Eskin-Mirzakhani-Mohammadi’s magic wand

Mirzakhani and collaborators (Eskin and Mohammadi), together with Filip, in spectacular (relatively) recent work, proved the “magic wand” conjecture, plus measure rigidity results, for all genera.

The measure rigidity result of Eskin-Mirzakhani states that any ergodic P-invariant measure (where P is a maximal parabolic subgroup, e.g. the Borel subgroup) is in fact a Lebesgue class measure on a manifold cut out by linear equations, and must be \mathrm{SL}(2,\mathbb{R})-invariant. This uses considerable machinery from ergodic theory: “almost 100 pages of delicate” entropy arguments, plus ideas of Benoist-Quint.

The theorem of Eskin-Mirzakhani-Mohammadi then builds on this to state that the \mathrm{SL}(2,\mathbb{R})-orbit closure of a translation surface is always a manifold. Moreover, the manifolds that occur are locally defined by linear equations in period coordinates, with real coefficients and zero constant term.

The proof proceeds, given the measure rigidity result, by constructing a P-invariant measure on every P-orbit closure. Here the use of P, as opposed to \mathrm{SL}(2,\mathbb{R}), is crucial—the former is amenable whereas the latter is not, and this allows us to use averaging methods in our construction.

(Filip’s result is needed to go from analyticity, which Eskin-Mirzakhani-Mohammadi actually gives us, to algebraicity.)

Where can the magic wand take us?

These results allow us to say things about specific families of translation surfaces (e.g. a rational billiard table, whose orbit under the \mathrm{SL}(2,\mathbb{R})-action forms a high-codimension family in \mathcal{H}) rather than just “almost all” translation surfaces.

Thus, for instance, we can prove quadratic asymptotics (exact, not just lower and upper bounds as was previously the case) for the number of generalized diagonals, etc. in polygonal billiards.

There are many other instances where some problem may be naturally (re)formulated in terms of translation surfaces coming from polygonal billiards; then the magic wand implies additional structure on a relevant family of translation surfaces, which yields insight into the original problem. Below we outline two concrete examples of this:

The illumination problem

Given a room, how many light-bulbs are required to light it? Or, to abstract the problem a little: given a polygonal domain P (or really any planar domain, but let’s stick to polygons for now) and a point x \in P, which points in P can (or cannot) be reached by billiard trajectories through x? A point y which can be reached from x is said to be illuminated from x.

Billiard trajectories very much resemble light-ray trajectories (at least locally); indeed the word “optical” appeared in our description of billiard systems, and so it should be no surprise that the study of billiard systems and hence of translation surfaces yields insight into this and related problems. Indeed, as this wonderfully-named paper notes, the illumination problem concerns “elementary properties which can be fruitfully studied using the dynamical behavior of the \mathrm{SL}(2,\mathbb{R})-action on the moduli space of translation surfaces.”

Using the magic wand theorem, and the fact that the geometric properties considered in the illumination problem produce closed sets of the moduli space \mathcal{H}, Lelièvre-Monteil-Weiss have proved that, for any rational polygon P and any x \in P, there are finitely many y \in P which are not illuminated from x.

The wind-tree model

The wind-tree model was originally formulated by statistical physicists Paul and Tatiana Ehrenfest as a model for a Lorentz gas: in this model, particles (the “wind”) travel in straight-line trajectories in the plane \mathbb{R}^2, reflecting in a billiards-like fashion off rectangular obstacles (“trees”) placed along a \mathbb{Z}^2 lattice. One can also describe it, precisely, as billiards in the plane with these rectangles removed.

One can form a translation surface by restricting to some suitable subset of the plane and obstacles, and gluing the sides together: (figure taken from the also wonderfully-named “Cries and whispers in wind-tree forests”)


The result is a genus-5 flat surface in the stratum \mathcal{H}(2^4).

One can then describe the behavior of the trajectories in terms of properties of the translation surface, e.g. Delecroix-Hubert-Lelièvre have computed the diffusion rate of divergent trajectories, for rectangular obstacles of any size, in terms of the Lyapunov exponents of a natural dynamical system (the Kontsevich-Zorich cocycle) on a certain stratum of genus-5 translation surfaces (not the one specified above, but a quotient thereof).


Alex Wright’s article describes the Eskin-Mirzakhani-Mohammadi result, the context for it, as well as applications and connections to nearby areas.


Translation surfaces and friends

Recall that we earlier defined translation surfaces as maximal atlases of coordinate charts into the Euclidean plane \mathbb{R}^2, with finitely many cone singularities and where the transition maps between coordinate charts are translations in \mathbb{R}^2.

Recall that we were led to this notion starting from polygonal billiards, but not every translation surface, generally defined, comes from a rational polygonal billiard.

Enter Riemann surfaces and moduli spaces

A translation surface may also be thought of as a Riemann surface, together with an associated holomorphic 1-form—a continuous choice of specified direction (“up”) at every point; such a pair of structures canonically defines a (singular) flat structure on the surface, with a distinguished vertical direction.

To wit: we take the holomorphic 1-form \omega to be (locally) our dz. Since \omega is holomorphic, its zeroes are isolated: a zero of degree d corresponds exactly to a conical singularity with cone angle 2\pi(d+1). Where \omega is not zero, it is represented (locally) by dz, whence we obtain a complex coordinate z, which in turn specifies (locally) a Euclidean structure. Since dz is globally well-defined on our translation surface, the resulting flat structure is also well-defined, away from the conical singularities.

Conversely, given a flat structure with conical singularities P_1, \dots, P_n, and a specified direction, consider a fundamental polygon P_1P_2 \cdots P_n embedded (anywhere, but with orientation dictated by the specified direction) in the complex plane. The fundamental polygon inherits a natural complex coordinate z. This does not descend to the translation surface, but since the identification maps are all of the form z \mapsto z + \zeta where \zeta \in \mathbb{C} is a constant (for each identification map), the holomorphic 1-form dz does, and we obtain a holomorphic 1-form \omega on our translation surface. \omega is zero exactly at the conical singularities, with cone angle being related to degree as above.

The Riemann surface perspective emphasizes more clearly the presence of a moduli space of translation surfaces (for a fixed genus g), which we presently define as a space of pairs (X, \omega), where X denotes a Riemann surface structure, and \omega a choice of holomorphic 1-form, modulo some natural equivalence relation. The 1-form specifies a distinguished direction, so there is a well-defined notion of “north” on the surface. Two such pairs (X_1, \omega_1) and (X_2, \omega_2) are considered equivalent if there is a conformal map from X_1 to X_2 which takes (specified) zeroes of \omega_1 to (specified) zeroes of \omega_2.

The moduli space is divided into distinct strata \mathcal{H}(d_1, \dots, d_m), consisting of forms with zeroes of degree d_1, \dots, d_m with d_1 + \dots + d_m = 2g-2. This last identity follows from the formula for the sum of degrees of zeroes of a holomorphic 1-form on a Riemann surface of genus g, and can be interpreted as a Gauss-Bonnet formula for the singular flat metric.
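To make the bookkeeping concrete, here is a minimal Python sketch that lists the zero patterns (d_1, \dots, d_m) allowed in a given genus, and recovers the genus and cone angles from such a pattern. The function names are ours, chosen for illustration. Note that it enumerates strata only, and says nothing about their connected components.

```python
from math import pi

def partitions(total, largest=None):
    """Yield the partitions of `total` into positive parts, largest part first."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    for part in range(min(total, largest), 0, -1):
        for rest in partitions(total - part, part):
            yield (part,) + rest

def strata(genus):
    """Zero patterns (d_1, ..., d_m) with d_1 + ... + d_m = 2g - 2."""
    return list(partitions(2 * genus - 2))

def genus_of(degrees):
    """Recover g from the identity d_1 + ... + d_m = 2g - 2."""
    total = sum(degrees)
    if total % 2:
        raise ValueError("the degrees of the zeroes must sum to an even number")
    return total // 2 + 1

def cone_angles(degrees):
    """A zero of degree d corresponds to a cone point of angle 2*pi*(d + 1)."""
    return [2 * pi * (d + 1) for d in degrees]

print(strata(2))               # [(2,), (1, 1)]: the two strata in genus 2
print(genus_of((2, 2, 2, 2)))  # 5
```

For instance, a pattern of four double zeroes forces genus 5, and each of those zeroes is a cone point of angle 6\pi.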

There is a \mathrm{SL}_2\mathbb{R} action on this moduli space (or, really, on each stratum) which is most easily described back in the framework of flat geometry: given any pair (X,\omega), build the corresponding flat polygon (with distinguished vertical direction.) Elements of \mathrm{SL}_2\mathbb{R} act (linearly) on these flat polygons, thought of as being embedded in the Euclidean plane, with (say) some vertex at the origin. This action preserves parallelisms between edges, and hence induces an action on the corresponding flat surfaces.


(figure from the Zorich survey: the \mathrm{SL}(2,\mathbb{R}) action, depicted on the left, changes the translation structure but not the affine structure; the “cut-and-paste” on the right changes only the polygon used to represent the surface, and neither structure.)
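The linear action on polygons can be tried out by hand. The following Python sketch applies a determinant-one matrix to the edge vectors of a polygon and checks that the polygon still closes up and that all cross products between edges are unchanged, so that parallel edges stay parallel. The L-shaped polygon and the shear are illustrative choices of ours, not taken from the survey.

```python
from itertools import combinations

def act(matrix, edges):
    """Apply a 2x2 matrix ((a, b), (c, d)) to each edge vector (x, y)."""
    (a, b), (c, d) = matrix
    return [(a * x + b * y, c * x + d * y) for (x, y) in edges]

def cross(u, v):
    """Signed area of the parallelogram spanned by u and v (zero iff parallel)."""
    return u[0] * v[1] - u[1] * v[0]

# Edge vectors of an L-shaped polygon, listed counter-clockwise (illustrative).
edges = [(2, 0), (0, 1), (-1, 0), (0, 1), (-1, 0), (0, -2)]
shear = ((1, 1), (0, 1))  # an element of SL(2, R): determinant 1
sheared = act(shear, edges)

# The image edges still sum to zero, so they still bound a polygon.
closes_up = (sum(x for x, _ in sheared), sum(y for _, y in sheared)) == (0, 0)

# Determinant 1 means every cross product is preserved; in particular
# parallel edges (cross product 0) remain parallel.
crosses_preserved = all(
    cross(sheared[i], sheared[j]) == cross(edges[i], edges[j])
    for i, j in combinations(range(len(edges)), 2)
)
print(sheared, closes_up, crosses_preserved)
```

Since equal edge vectors are sent to equal edge vectors, any gluing of sides by translations carries over to the image polygon.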

We can study the dynamics of this action, and this turns out to be surprisingly (or perhaps not surprisingly—or maybe that is only with the benefit of giant-assisted hindsight) rich …

(There were already hints of this in the last post, when we talked about “changing the translation structure while preserving the affine structure” and how Masur used this idea to prove his theorem on counting closed geodesics.)

Counting closed geodesics and saddle connections

As previously noted in the slightly more restricted context of rational polygonal billiards, directional flow on a flat surface is uniquely ergodic in almost every direction.

This is most transparently seen in the case of a torus: geodesics with rational slopes are closed, while those with irrational slopes are equidistributed. Geodesics on flat surfaces of higher genera exhibit certain similarities: the closed geodesics also appear in parallel families, although in higher genera these do not fill the whole surface, but only flat cylinders with conical singularities on the boundary.

Related to closed geodesics are saddle connections, which are geodesic segments joining two conical singularities (which may coincide), with no conical points in their interior. On a flat torus there are no conical singularities, and so any closed loop can be tightened to a closed geodesic; on a more general flat surface, this tightening process can produce either a closed geodesic, or (if at some point in the process we hit one of the singularities on the surface, which, as one might imagine, is the more generic case) a union of saddle connections.

Masur and Eskin have found quadratic asymptotics, as a function of length, for the number of [cylindrical families of] closed geodesics—this was discussed more in the previous post—and the number of saddle connections. The constants which appear in these asymptotics are called the Siegel-Veech constants. There are also (somewhat surprising) quadratic asymptotics for multiple cylindrical families in the same parallel direction.

These results are interesting in their own right—on a [rational] billiard table, for instance, generalized diagonals (trajectories joining two of the corners, possibly after reflections) unfold to saddle connections and periodic trajectories unfold to closed regular geodesics—but are also useful for at least two other reasons:

One, degeneration of “configurations” of parallel saddle connections or closed geodesics leads us into cusped regions in the boundary of (strata in the) moduli space, and so a description of such configurations gives us a description of the cusps of our strata. Local considerations involving short / degenerating saddle connections also lead us to relations between and structural results about the strata, which can be described more carefully / analytically in the language of Abelian differentials.

Two, configurations of saddle connections and closed geodesics are also useful as invariants of \mathrm{SL}(2,\mathbb{R}) orbits—something that we will refer back to below.
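In the simplest case, the square flat torus of area 1, the quadratic growth in these counts can be checked by brute force: closed geodesics correspond to primitive integer vectors (p, q), with length \sqrt{p^2 + q^2}. The Python sketch below counts such vectors up to a given length; the function name is ours, and the comparison constant 6/\pi is the classical density of primitive lattice points (\pi / \zeta(2)), not a Siegel-Veech constant quoted from the text.

```python
from math import gcd, isqrt, pi

def count_primitive_vectors(max_length):
    """Count integer vectors (p, q) with gcd(p, q) = 1 and p^2 + q^2 <= max_length^2.

    `max_length` is assumed to be a non-negative integer.
    """
    bound = max_length * max_length
    count = 0
    for p in range(-max_length, max_length + 1):
        q_max = isqrt(bound - p * p)
        for q in range(-q_max, q_max + 1):
            if gcd(p, q) == 1:  # gcd(0, 0) == 0, so the zero vector is skipped
                count += 1
    return count

L = 100
ratio = count_primitive_vectors(L) / L**2
print(ratio, 6 / pi)  # the ratio should be close to 6/pi
```

Each unoriented family of parallel closed geodesics corresponds to a pair \pm(p, q), so the number of families of length at most L grows like half of this count.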

Volume of moduli space

We remark that this last counting problem can be related to computations of volumes of the moduli space, via the observation that the Siegel-Veech constants can be obtained as a limit of the form \lim_{\epsilon \to 0} \frac{1}{\pi \epsilon^2} \frac{\mathrm{Vol}(\epsilon\mbox{-neighborhood of cusp }\mathcal{C})}{\mathrm{Vol} \mathcal{H}_1^\circ(d_1, \dots, d_n)} where \mathcal{C} is a specified configuration of saddle connections or closed geodesics.

Athreya-Eskin-Zorich used this idea to obtain explicit formulas (conjectured by Kontsevich based on experimental evidence) for the volumes of strata in genus 0, by counting generalized diagonals and periodic trajectories on right-angled billiards. In general, of course, the relation between the two problems can be exploited in both directions: results on volumes of strata can also be used to obtain results on the counts of saddle connections / closed geodesics.

There are a number of alternative strategies for finding these volumes; the following is so far the most general:

The general idea is to simply use asymptotics for counts of integer lattice points, where the lattice is defined in terms of cohomological period coordinates. We can count the number of such points in a sphere or hyperboloid (which is the unit sphere for an indefinite quadratic form of suitable signature) to estimate its volume, and take a derivative to estimate the volume of the boundary hypersurface.
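As a toy version of this idea, one can estimate the volume of a Euclidean unit ball by counting integer points in a large ball and rescaling. The Python sketch below does only this positive-definite case; it is meant to illustrate the principle, and does not touch period coordinates or actual strata. The function names are ours.

```python
from itertools import product
from math import pi

def lattice_points_in_ball(radius, dim):
    """Count points of Z^dim at Euclidean distance <= radius from the origin."""
    r2 = radius * radius
    return sum(
        1
        for point in product(range(-radius, radius + 1), repeat=dim)
        if sum(c * c for c in point) <= r2
    )

def estimated_unit_ball_volume(radius, dim):
    """Lattice-point count in the ball of the given radius, rescaled by radius^dim."""
    return lattice_points_in_ball(radius, dim) / radius**dim

estimate = estimated_unit_ball_volume(20, 2)
print(estimate, pi)  # the estimate should be close to the area of the unit disc
```

The same rescaled count, taken over a cone or a hyperboloid in place of a ball, is what the volume computations for strata are built on.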

Integer lattice points may be thought of, geometrically, as flat surfaces tiled by square flat tori, and combinatorial geometric methods may be used to count these: in the simplest cases we can count directly; slightly more generally we can consider the graph with conical points as vertices and horizontal saddle connections as edges, leading to the notions of ribbon graphs and separatrix diagrams.

In the most general case we turn, following an idea of Eskin-Okounkov, to representation theory: suppose there are N squares; label them, and consider the permutation \pi on [N] which sends j to the square \pi(j) which we get to by starting at j and moving left, up, right, and down in turn. For the generic square j, \pi fixes j, but near the conical points it acts non-trivially, and indeed it is a product of m cycles of lengths (d_1+1), \dots, (d_m+1).

It then suffices to count the number of permutations of N with such a property … except there is a nontrivial correction needed to pick out only those permutations which correspond to connected square-tiled surfaces. Eskin-Okounkov-Pandharipande pushed through this strategy to obtain explicit quantitative results, with a strong arithmetic flavor. These results may be made explicit, although there is considerable computational work involved; by comparison, other approaches such as the work of Athreya-Eskin-Zorich referred to above can produce simpler formulas in special cases.
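The permutation \pi described above is easy to compute for small examples. In the Python sketch below, a square-tiled surface is encoded by two permutations of the labels 0, \dots, N-1: `right[j]` is the square to the right of square j, and `up[j]` is the square above it. This encoding and the function names are assumptions of the sketch. The example is an L-shaped surface made of three squares.

```python
def invert(perm):
    """Inverse of a permutation given as a list."""
    inverse = [0] * len(perm)
    for j, image in enumerate(perm):
        inverse[image] = j
    return inverse

def commutator_cycle_lengths(right, up):
    """Lengths of the nontrivial cycles of pi: move left, up, right, then down."""
    left, down = invert(right), invert(up)
    pi_perm = [down[right[up[left[j]]]] for j in range(len(right))]
    seen = [False] * len(pi_perm)
    lengths = []
    for start in range(len(pi_perm)):
        length, k = 0, start
        while not seen[k]:
            seen[k] = True
            k = pi_perm[k]
            length += 1
        if length > 1:  # fixed points correspond to regular points
            lengths.append(length)
    return sorted(lengths)

# L-shaped surface: squares 0 and 1 in a row, square 2 stacked on square 0.
right = [1, 0, 2]
up = [2, 1, 0]
lengths = commutator_cycle_lengths(right, up)
degrees = [k - 1 for k in lengths]
print(lengths, degrees)  # [3] [2]: a single zero of degree 2
```

Here \pi is a single 3-cycle, so there is one zero of degree 2 and the surface lies in \mathcal{H}(2), which is consistent with 2 = 2g - 2 for g = 2.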


We have described above and previously how translation surfaces are related to (rational) polygonal billiards, and how the former provide a powerful framework for the study of the latter. Here (and in a subsequent post) we present a number of other applications:

Electron transport

S. P. Novikov suggested the following as a mathematical formulation of electron transport in metals: consider a periodic surface \widetilde{M^2} \subset \mathbb{R}^3; an affine plane in \mathbb{R}^3 intersects this in some union of closed and unbounded intervals. Question: how does an unbounded component propagate in \mathbb{R}^3 (as we move the affine plane in some continuous fashion?)

After we quotient by the period lattice (taken to be \mathbb{Z}^3), we are looking at plane sections of the quotient surface M^2 \subset \mathbb{T}^3. Our original intersection can be viewed as level curves of a linear function f(x,y,z) = ax + by + cz restricted to \widetilde{M^2}, but this does not push down to the quotient; instead, we consider the codimension-one foliation of M^2 defined by the closed 1-form df = a\,dx + b\,dy + c\,dz.

Our question can then be reformulated as follows: what do lifts of leaves of this foliation on M^2 \subset \mathbb{T}^3 look like upstairs, in \widetilde{M^2} \subset \mathbb{R}^3?

Closed 1-forms on surfaces can be straightened to geodesic foliations in appropriate flat metrics iff any cycle formed from a union of closed paths following a sequence of saddle connections is homologically non-trivial. The surfaces and 1-forms obtained from Novikov’s problem can be modified (decomposed and surgered) to satisfy this criterion, and after these reductions we are left exactly in the world of flat structures with closed 1-forms on them, i.e. translation surfaces.


In a subsequent post we describe the illumination problem; a related problem concerns invisibility: whether a body with mirror surfaces can be “invisible” from some direction/s, because light rays travelling in these direction/s are reflected in such a way that they continue along trajectories in the same direction/s; or whether a body can be similarly made invisible in certain directions through the strategic placement of mirrors around it. It is a conjecture of Plakhov that the set of directions that are invisible for any fixed body has measure zero. This conjecture is closely connected with the (similar) Ivrii conjecture on the measure of the set of periodic billiard trajectories in a bounded domain: if Ivrii’s conjecture is true then, most probably, so is the conjecture on invisible light rays.

Flat (but not very), and real affine …

There are at least two ways in which the notion of translation surfaces can be generalized: one is to consider flat structures with non-trivial linear holonomy, which “forces a generic geodesic to come back and to intersect itself again and again in different directions.” The other is to consider real affine structures, which are maximal collections of charts on a closed surface where all of the transition maps are of the form f(z) = az+b where a > 0 (and is in particular real) and b \in \mathbb{C}.

These remain rather less well-studied and more mysterious, or, in other words / from another perspective, present potentially rich sources of interesting open problems …


Zorich’s survey covers a broad range of ideas, and contains many further references.

Hubert-Masur-Schmidt-Zorich have a (slightly outdated) list of open problems on translation surfaces, from a conference at Luminy.

Literature Review

Relatively hyperbolic groups

Hyperbolic groups have all sorts of nice properties—they have linear isoperimetric inequalities, solvable word problem and conjugacy problem, are biautomatic, and so on. Two prime motivating examples of hyperbolic groups are fundamental groups of closed hyperbolic manifolds on the one hand, and free groups on the other.

Relaxing this definition a little, we remark that fundamental groups of cusped hyperbolic manifolds should still have many of the properties of hyperbolic groups, at least away from the cusps, where the characteristic properties of negative curvature break down a little.

This motivates (one) definition of a relatively hyperbolic group: a group G is hyperbolic relative to a collection of subgroups \{H_1, \dots, H_k\} if G acts (geometrically) on some hyperbolic metric space X s.t. the quotient X / G is quasi-isometric to the union of k copies of [0,\infty) joined at 0.

Intuitively: each of the peripheral subgroups H_i corresponds to a cusp, or some region where hyperbolicity breaks down; under a quasi-isometry which sends the compact core of the manifold to a point, each cusp (each of these “bad regions”) should be quasi-isometric to a ray going out to infinity.

Various definitions

This definition was formulated by Gromov in his seminal 1987 monograph on hyperbolic groups. There are many other definitions, coming from different motivations, many of which are equivalent. We describe them briefly here, dropping many technical adjectives (for full, careful statements, see e.g. section 3 of Hruska’s paper linked just above.)

Dynamical reformulations

First there is a slight reformulation of this by Bowditch—G is hyperbolic relative to its maximal parabolic subgroups* \mathbb{P} if it acts on a hyperbolic metric space X, and the action is co-compact away from some equivariant collection of horoballs (in X) centered at the parabolic points of G.

*(Technically there can be infinitely many of these, but for the purposes of the definition, and for arguments, it suffices to take a set of representatives of conjugacy classes of maximal parabolics, which is [more often] finite.)

“Parabolic” here is defined in dynamical terms. We start with a dynamical axiomatization of Kleinian group actions on their limit sets: a (nonelementary) convergence group action is an action of a group G on a compact metric space M with at least three points* s.t. the induced action of G on the space of distinct triples is properly discontinuous.

*(there are also elementary convergence group actions, which are the analogous objects when |M| \leq 2, but we omit them here in the interest of brevity; see e.g. Hruska.)

Given a convergence group action, a loxodromic element is one which has infinite order and exactly two fixed points in M, and a subgroup P \leq G is parabolic if it is infinite and contains no loxodromic element. A parabolic subgroup P has a unique fixed point p \in M, which we call a parabolic point; stabilizers of parabolic points are maximal parabolic subgroups. A parabolic point p is bounded if its stabilizer acts cocompactly on M - \{p\}.

Finally, a point \xi \in M is a conical limit point if it exhibits a sort of generalized north-south dynamics (again, for the exact formulation, see e.g. Hruska), and a convergence group action is geometrically finite if every point of M is either a conical limit point or a bounded parabolic point.

If that was too many definitions in a row: think of the case of Kleinian groups acting on their limit sets. To see that this axiomatization is a useful generalisation, we may point to e.g. a result of Tukia that shows every properly discontinuous action of a group G on a proper hyperbolic metric space induces a convergence group action on the boundary at infinity.

The theory of geometrically finite convergence group actions allows us to make another definition of relatively hyperbolic groups, proposed by Bowditch and worked out by Yaman: G is hyperbolic relative to a collection of subgroups \mathbb{P} if it admits a geometrically finite convergence group action on a compact metric space M, with \mathbb{P} as [a set of representatives of conjugacy classes of] maximal parabolics.

In fact, we can take M to be (i.e. M is G-equivariantly homeomorphic to) \partial_\infty X for some hyperbolic metric space X on which G acts properly; even more specifically, we may take that X to be the Cayley graph of G with combinatorial horoballs attached over the peripheral cosets (= cosets of the peripheral subgroups.)

These combinatorial horoballs are graphs (or in some cases 2-dimensional simplicial complexes) whose natural simplicial metrics combinatorially mimic the geometry of actual horoballs in negatively-curved spaces.

In one construction, formulated by Bowditch, a combinatorial horoball over a graph \Gamma (here, the Cayley graph of a peripheral coset P) is the graph with vertex set P \times \mathbb{Z}_{\geq 0} and the “obvious” horizontal and vertical edges. The vertical edges are all assigned length 1, whereas the horizontal edges at level P \times \{n\} are assigned length 2^{-n}. This has the effect of making the most efficient path between two points distance n apart in the same peripheral coset a horizontal path at level \sim \log n, bookended by vertical ascent to / descent from that level.

A slightly different construction is given by Groves and Manning: their combinatorial horoballs have the same vertex set and vertical edges, but the horizontal edges at level k are different: such an edge exists between any (v,k) and (w,k) whenever 0 < d_\Gamma(v,w) \leq 2^k, and all of these edges have length 1. There are also (horizontal and vertical) 2-cells attached, although these are ignored when regarding the combinatorial horoball as a metric space.
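To see the logarithmic behaviour in both constructions, here is a small Python sketch over the simplest base graph, a line (so two base points at distance n are n edges apart). It computes the length of the “climb, cross, descend” path at each level and picks the best level. The function names are ours, and the sketch only compares paths of this particular shape, so it gives an upper bound on the true distance rather than a proof of it.

```python
from math import ceil

def bowditch_style_cost(n, level):
    """Climb `level` unit edges, cross n edges of length 2**-level, descend."""
    return 2 * level + n / 2**level

def groves_manning_style_cost(n, level):
    """Climb, cross with unit edges that each cover up to 2**level, descend."""
    return 2 * level + ceil(n / 2**level)

def best_level(cost, n, max_level=64):
    """Level minimising the up-across-down path length (first one, if tied)."""
    return min(range(max_level + 1), key=lambda level: cost(n, level))

n = 1024
for cost in (bowditch_style_cost, groves_manning_style_cost):
    level = best_level(cost, n)
    print(cost.__name__, level, cost(n, level))
```

For n = 1024 both variants prefer a level close to \log_2 n and give a path of length 20, compared with length 1024 for the path that stays at level 0.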

The definition via geometrically finite convergence group actions might be thought of as a dynamical reformulation / abstraction / deconstruction of Gromov’s original definition.

Electrifying and fine alternatives

A different definition was proposed by Farb for finitely-generated groups, and generalised to non-f.g. groups by Bowditch and Osin: a group G is hyperbolic relative to a collection of subgroups \mathbb{P} = \{P_1, \dots, P_n\} iff the electrified Cayley graph, formed by adding to the Cayley graph a vertex (“cone point”) for each left coset gP_i and edges of length 1/2 from this new vertex to each element of gP_i, is Gromov-hyperbolic, and exhibits Bounded Coset Penetration (BCP).

This definition is motivated more directly by the structure of a free product acting on its Bass-Serre tree, where geodesics pass through vertices in very controlled ways.

The BCP condition aims to mimic this, in a quasi sense: in short (the full statement is rather technical), it gives control, up to bounded error, over quasi-geodesics which penetrate (pass through) cosets (cusp neighborhoods.) Given such a quasi-geodesic, call the vertex immediately preceding the cone point the entering vertex, and the one immediately following the cone point the exiting vertex. BCP then stipulates that for any two quasigeodesics \gamma, \gamma' which start and end at (essentially) the same point,

  1. if \gamma and \gamma' penetrate a coset gP, then the entering vertices of \gamma and \gamma' are bounded distance apart in the [unelectrified] Cayley graph (with bound depending only on the quasi-geodesic constants), as are the exiting vertices;
  2. if \gamma penetrates gP but \gamma' does not, then the entering and exiting vertices of \gamma are bounded distance apart.

(In the language of cusps: if the quasi-geodesics both go through a cusp neighborhood, then they stay close near where they enter and exit; if one goes through a cusp, but not the other, then the former cannot stay in the cusp for very long.)
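The coning-off construction in Farb's definition can be made concrete on a finite window of a Cayley graph. The Python sketch below takes the standard Cayley graph of \mathbb{Z}^2, cones off each horizontal line (a coset of \mathbb{Z} \times \{0\}) with edges of length 1/2, and computes distances with Dijkstra's algorithm. The choice of group is ours and is made only because it is easy to picture; it is not offered as an example of a relatively hyperbolic pair, and restricting to a finite window is a further simplification.

```python
import heapq

def electrified_distance(start, goal, width, height):
    """Shortest-path distance in the coned-off grid [0, width] x [0, height].

    Ordinary vertices are pairs (x, y); the cone point of the row y is
    encoded as (-1, y) so that all nodes are comparable tuples of integers.
    """
    def neighbours(node):
        x, y = node
        if x == -1:  # cone point: half-edges to every vertex in its row
            for col in range(width + 1):
                yield (col, y), 0.5
            return
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if 0 <= x + dx <= width and 0 <= y + dy <= height:
                yield (x + dx, y + dy), 1.0
        yield (-1, y), 0.5  # half-edge up to the cone point of this row

    dist = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            return d
        if d > dist[node]:
            continue  # stale queue entry
        for nxt, weight in neighbours(node):
            candidate = d + weight
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                heapq.heappush(queue, (candidate, nxt))
    return float("inf")

print(electrified_distance((0, 0), (5, 0), 5, 0))  # 1.0: through one cone point
print(electrified_distance((0, 0), (5, 3), 5, 3))  # 4.0: three steps up, one cone
```

Any two points in the same coset end up at distance at most 1, which is the sense in which the peripheral cosets are “electrified”.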

An abstraction of Farb’s approach was proposed and explored by Bowditch: a graph is called fine if each of its edges is contained in finitely many cycles of length n for each n. Fine graphs capture, in graph-theoretic terms, the BCP, although their equivalence can be, not to put too fine a point on it, subtle.

Fine graphs give us the following (fifth) definition of relative hyperbolicity: G is hyperbolic relative to a collection of subgroups \mathbb{P} if it acts (properly discontinuously and co-compactly) on a fine Gromov-hyperbolic graph, with \mathbb{P} a set of representatives of the conjugacy classes of infinite vertex stabilizers.

Bowditch gives an explicit construction of this graph: start with a hyperbolic space on which G acts (e.g. the Cayley graph augmented with combinatorial horoballs, as described above), and form a graph K with a vertex for each horoball (of fixed level t), and an edge between two vertices if the corresponding horoballs are \leq 2t apart.

Via relative Dehn functions

Osin gives a different (sixth) definition in terms of relative Dehn (isoperimetric) functions: G is hyperbolic relative to \mathbb{P} = \{P_1, \dots, P_n\} if it has a finite relative presentation, and the relative Dehn function of G is well-defined and linear for some (and hence every) finite relative presentation.

Here a relative presentation consists of a set S which, together with the peripheral subgroups, generates G, and a set of “relators” whose normal closure K is the kernel of the natural map F(S) * (*_i P_i) \to G. A relative Dehn function is a Dehn function for the relative presentation, with conjugating elements for the relators taken from the free product F(S) * (*_i P_i). (For a less terse / cryptic definition, again see e.g. Hruska, or Osin.)
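
To unpack this slightly, here is a sketch of the relative Dehn function written out, following the usual shape of Osin's definition. The symbols F, \mathcal{R}, \mathrm{Area}^{rel} and \delta^{rel} are names chosen here for illustration and do not appear above; |W| denotes the length of W as a word in S together with the non-trivial elements of the P_i.

```latex
% F is the free product from the relative presentation, \mathcal{R} the set of relators.
F = F(S) * (*_i P_i), \qquad K = \langle\langle \mathcal{R} \rangle\rangle = \ker(F \to G)

% Any word W representing the identity of G is a product of conjugates of relators:
W =_F \prod_{i=1}^{k} f_i^{-1} R_i^{\pm 1} f_i, \qquad R_i \in \mathcal{R}, \quad f_i \in F

% The relative area of W is the least such k, and the relative Dehn function is
\delta^{rel}(n) = \max \{ \mathrm{Area}^{rel}(W) : W =_G 1, \ |W| \leq n \}
```

"Well-defined" then means that this maximum is finite for every n, which is not automatic since there may be infinitely many words of length at most n; "linear" means \delta^{rel}(n) \leq Cn for some constant C.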

The motivating model geometry (according to Hruska—I’m not sure I see it at the moment) is apparently still essentially that of a free product acting on its Bass-Serre tree.

Basically a tree grading

Considering what geodesics in a relatively hyperbolic group can look like—they essentially run from cusp (peripheral coset) to cusp (peripheral coset) along more-or-less hyperbolic geodesics (see below)—motivates (or, actually, yields, after some proof) a different (seventh!) definition, given by Druțu-Sapir: a group G is hyperbolic relative to \mathbb{P} = \{P_1, \dots, P_n\} if all of its asymptotic cones are tree-graded, with the pieces being left cosets of the \mathbb{P}.

Or, less precisely but without using the words “asymptotic cone”: relatively hyperbolic groups look coarsely like tree-graded spaces, with the peripheral cosets being the pieces.

One may note the similarities between this picture and the geometry of a CAT(0) space with isolated flats, and indeed these were an important motivation for Druțu-Sapir.

Equivalence of notions

All of the above definitions are equivalent for finitely-generated groups, and almost all of them—except the tree-graded one, whose definition requires finite generation—are equivalent for countable groups.

Nevertheless, as noted above, they have disparate motivating origins, which hints at the possible richness of the theory and of techniques which can be applied to study relatively hyperbolic groups.


As pointed out above, fundamental groups of punctured hyperbolic surfaces are prime motivating examples of relatively hyperbolic groups. These are hyperbolic relative to their cusp groups (which are all infinite cyclic groups).

Free products, relative to their free factors, are another prime motivating example. Indeed Gromov originally described his formulation of relative hyperbolicity, in his landmark paper on hyperbolic groups, as “a hyperbolic version of small cancellation theory over free products by adopting geometric language of manifolds with cusps”.

CAT(0) groups acting on spaces with isolated flats, which are hyperbolic relative to the stabilizers of maximal flats, are a third important class of examples, as noted above in the description of the Druțu-Sapir definition based on tree-graded spaces.

Behrstock-Hagen, building on work of Hruska-Kleiner, have a criterion, in terms of the simplicial boundary, for when cubulated groups are hyperbolic relative to specified families of subgroups. In particular, not all CAT(0) groups, or even all cubulated groups, can be relatively hyperbolic; e.g. right-angled Artin groups (RAAGs) with connected defining graph (on at least two vertices) are not.

The non-examples are just as important as the examples, in terms of pointing out what the theory is good for and what its limitations are. For instance, higher-rank free abelian groups (e.g. \mathbb{Z}^2) are not hyperbolic relative to any finite collection of their subgroups (e.g. \{\mathbb{Z}\})—the electrified Cayley graph here is hyperbolic, but not fine, i.e. the BCP is not satisfied. Indeed, in some sense, there are “too many bad regions” which are “not sufficiently separated”, and so the theory of relative hyperbolicity does not help here.
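
The failure of fineness for \mathbb{Z}^2 can be seen concretely. In the Cayley graph coned off along the cosets of \mathbb{Z} = \langle (1,0) \rangle, the vertical edge from (0,0) to (0,1) lies on the 6-cycle (0,0), cone_0, (i,0), (i,1), cone_1, (0,1) for every i \neq 0, so it lies on infinitely many cycles of length 6. The following is a minimal sketch, in Python, which counts such cycles in finite truncations of the graph; the function names, the truncation to a strip of two rows, and the use of unit-length cone edges are all simplifying assumptions made for illustration.

```python
from collections import defaultdict


def strip_graph(width, height=2, coned=True):
    """Finite piece of the Cayley graph of Z^2 (generators (1,0), (0,1)).

    Vertices are (i, j) with |i| <= width and 0 <= j < height. If `coned`,
    add one cone vertex ("cone", j) per horizontal coset, joined to every
    point of that coset (unit-length edges, a simplification).
    """
    adj = defaultdict(set)

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for j in range(height):
        for i in range(-width, width + 1):
            if i < width:
                add_edge((i, j), (i + 1, j))   # generator (1, 0)
            if j + 1 < height:
                add_edge((i, j), (i, j + 1))   # generator (0, 1)
            if coned:
                add_edge((i, j), ("cone", j))  # electrify the coset
    return adj


def cycles_through_edge(adj, u, v, length):
    """Count simple cycles with `length` >= 3 edges containing the edge {u, v}.

    Such a cycle is the edge u-v plus a simple path from v back to u
    with length - 1 edges, so we count those paths by depth-first search.
    """
    def extend(node, steps_left, visited):
        if steps_left == 0:
            return 1 if node == u else 0
        if node == u:  # reached u too early
            return 0
        return sum(
            extend(nxt, steps_left - 1, visited | {nxt})
            for nxt in adj[node]
            if nxt not in visited
        )

    return extend(v, length - 1, {v})


if __name__ == "__main__":
    u, v = (0, 0), (0, 1)
    for width in (2, 3, 4):
        plain = cycles_through_edge(strip_graph(width, coned=False), u, v, 6)
        coned = cycles_through_edge(strip_graph(width, coned=True), u, v, 6)
        print(width, plain, coned)
```

In the uncarved strip the count of 6-cycles through the edge stays the same as the strip widens, while in the coned-off strip it keeps growing with the width, which is the finite shadow of the edge lying on infinitely many 6-cycles in the full coned-off graph.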

Mapping class groups are not hyperbolic relative to any collection of subgroups, by a result of Behrstock–Druțu–Mosher, except in sporadic cases where they are virtually free; the same authors used similar arguments to show that many other classes of groups, including outer automorphism groups of free groups, lattices in higher-rank Lie groups, and fundamental groups of graph manifolds, are not hyperbolic relative to any collection of their subgroups. The key notion in their arguments is that of thickness, which appears to capture, intuitively, the notion of flats or cusps clustering or interacting in a way which conspires against the slight weakening of negative curvature implied by relative hyperbolicity.

A different notion of weakened hyperbolicity, hierarchical hyperbolicity, inspired by the structure of the mapping class group as described by Masur-Minsky, does apply to many (though not all) of these examples: mapping class groups and RAAGs (indeed, a large class of cubulated groups) are hierarchically hyperbolic, for instance.


Quite a few of these were formulated as “equivalent definitions” above, in particular,

Which makes one wonder a little—where is the line, if there is any, between “definition” and “property”?

Relative hyperbolicity and hyperbolicity

A group which is hyperbolic relative to a collection of peripheral subgroups, each of which is (word-)hyperbolic, is itself hyperbolic.

Conversely, hyperbolic groups are hyperbolic relative to collections of quasiconvex malnormal subgroups with bounded coset intersection—e.g. the fundamental group of a once-punctured torus (which is \cong F_2 = \langle a, b \rangle) is hyperbolic, but also hyperbolic relative to the cusp group \langle [a,b] \rangle.

Broadly speaking, it seems possible to push through arguments that lead to analogues of properties of hyperbolic groups in many cases, but we don’t seem to have as many nice general results as we do in the hyperbolic case …

Quasiconvex subgroups and distortion

As an illustration of this broad principle: Hruska, in his paper linked to above, defined a notion of relatively quasiconvex subgroups of relatively hyperbolic groups, showed that these are also relatively hyperbolic, that the intersection of relatively quasiconvex subgroups is relatively quasiconvex, and that every undistorted subgroup of a finitely-generated relatively hyperbolic group is relatively quasiconvex.


What do geodesics here look like? The Druțu-Sapir definition provides some answers to this question: relatively hyperbolic groups are coarsely tree-graded, and so we can expect properties of geodesics in tree-graded spaces to hold coarsely for (Cayley graphs of) relatively hyperbolic groups, as described by Alex Sisto on his blog.

For instance: geodesics in tree-graded spaces are essentially unique, modulo what they do within the pieces; in a relatively hyperbolic group, this remains true (roughly speaking), up to some bounded error in where the geodesic enters and exits the pieces.

A more precise way of formulating this is in terms of projections \pi_P to each of the pieces P: given any point x in our space, \pi_P(x) is the unique point in P which every geodesic from x to P must go through. With these projections defined, we can say even more. In a tree-graded space, if \pi_P(x) \neq \pi_P(y), then any geodesic between x and y must go through P. The appropriately coarsified version of this is true for relatively hyperbolic groups:

If the images of two points x and y under projection to a peripheral coset P are at least C apart (where C is some constant that depends only on the group and choice of peripheral subgroups), then any geodesic between x and y goes from x to near \pi_P(x), tracks P to near \pi_P(y), and then goes to y from there. The geodesic may track several peripheral subsets, in turn, this way, going between them in an essentially unique (hyperbolic) way; additional structural results, again analogous to results for tree-graded spaces, tell us more about the order in which these peripheral subsets appear, and so on.
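
The exact (uncoarsened) version of this statement can be checked by hand on a toy example. The sketch below, in Python, uses a small graph invented for illustration: a single piece P (a 4-cycle p0, p1, p2, p3) with tree branches attached at p0 and p2. The vertex names and helper functions are assumptions of this sketch, and the projection is computed simply as the set of closest points of P, which in a tree-graded setting is a single point.

```python
from collections import deque

EDGES = [
    ("p0", "p1"), ("p1", "p2"), ("p2", "p3"), ("p3", "p0"),  # the piece P
    ("x", "a"), ("z", "a"), ("a", "p0"),                     # branch at p0
    ("y", "p2"),                                             # branch at p2
]
PIECE = {"p0", "p1", "p2", "p3"}


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def bfs_dist(adj, source):
    """Graph distances from `source` to every vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def project(adj, piece, x):
    """Closest point(s) of `piece` to x (a single point in this example)."""
    dist = bfs_dist(adj, x)
    best = min(dist[p] for p in piece)
    return {p for p in piece if dist[p] == best}


def all_geodesics(adj, x, y):
    """Yield every shortest path from x to y as a list of vertices."""
    dist_to_y = bfs_dist(adj, y)

    def extend(path):
        node = path[-1]
        if node == y:
            yield path
            return
        for nxt in adj[node]:
            if dist_to_y[nxt] == dist_to_y[node] - 1:  # step towards y
                yield from extend(path + [nxt])

    yield from extend([x])


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    print(project(adj, PIECE, "x"), project(adj, PIECE, "y"))
    for path in all_geodesics(adj, "x", "y"):
        print(path)
    print(list(all_geodesics(adj, "x", "z")))
```

Here x and y have different projections to P, and both geodesics between them enter P at the projection of x and leave at the projection of y, differing only in what they do inside the piece. By contrast x and z have the same projection, and the geodesic between them avoids P altogether.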

Geodesic flows?

Gromov defined—and Champetier clarified his definition of—a geodesic flow on any word-hyperbolic group. Mineyev gave a more general construction, using a homological bicombing—roughly speaking, a homology analogue of a map which associates to each pair of group elements g, h a geodesic (segment) between them, extended also in a sensible way to the Gromov boundary. Mineyev’s construction, in fact, more generally produces a flow on any hyperbolic simplicial complex, so it may be possible to apply it more or less directly to obtain some sort of geodesic flow on a relatively hyperbolic group, or at any rate on its cusped space.

Groves-Manning, using their combinatorial horoballs (they call the result of attaching these to the Cayley graph “the cusped space”), construct an analogue of Mineyev’s homological bicombing for relatively hyperbolic groups. It may be possible to use this to obtain a geodesic flow on a relatively hyperbolic group, analogous to Mineyev’s original construction.

Both of these may in fact be possible, and one of them may have better / more natural properties—likely the latter (?): the former seems more naturally a flow on the cusped space, and may or may not descend in a reasonable way to the group / original Cayley graph.

(The existing literature does not seem to have anything explicitly / directly addressing either of these possibilities.)