Literature Review

Relatively hyperbolic groups

Hyperbolic groups have all sorts of nice properties—they have linear isoperimetric inequalities, solvable word problem and conjugacy problem, are biautomatic, and so on. Two prime motivating examples of hyperbolic groups are fundamental groups of closed hyperbolic manifolds on the one hand, and free groups on the other.

Relaxing this definition a little, we remark that fundamental groups of cusped hyperbolic manifolds should still have many of the properties of hyperbolic groups, at least away from the cusps, where the characteristic properties of negative curvature break down a little.

This motivates (one) definition of a relatively hyperbolic group: a group is hyperbolic relative to a collection of subgroups \{H_1, \dots, H_k\} if G acts (geometrically) on some hyperbolic metric space X s.t. the quotient X / G is quasi-isometric to the union of k copies of [0,\infty) joined at 0.

Intuitively: each of the peripheral subgroups H_k corresponds to a cusp, or some region where hyperbolicity breaks down; under a quasi-isometry which sends the compact core of the manifold to a point, each cusp (each of these “bad regions”) should be quasi-isometric to a ray going out to infinity.

Various definitions

This definition was formulated by Gromov in his seminal 1987 monograph on hyperbolic groups. There are many other definitions, coming from different motivations, many of which are equivalent. We describe them briefly here, dropping many technical adjectives (for full, careful statements, see e.g. section 3 of Hruska’s paper linked just above.)

Dynamical reformulations

First there is a slight reformulation of this by Bowditch—G is hyperbolic relative to its maximal parabolic subgroups* \mathbb{P} if it acts on a hyperbolic metric space X, and the action is co-compact away from some equivariant collection of horoballs (in X) centered at the parabolic points of G.

*(Technically there can be infinitely many of these, but for the purposes of the definition, and for arguments, it suffices to take a set of representatives of conjugacy classes of maximal parabolics, which is [more often] finite.)

“Parabolic” here is defined in dynamical terms. We start with a dynamical axiomatization of Kleinian group actions on their limit sets: a (nonelementary) convergence group action is an action of a group G on a compact metric space with at least three points* s.t. the induced action of G on the space of distinct triples is properly discontinuous.

*(there are also elementary convergence group actions, which are the analogous objects when |M| \leq 2, but we omit them here in the interest of brevity; see e.g. Hruska.)

Given a convergence group action, a loxodromic element is one which has infinite order and exactly two fixed points in M, and a subgroup P \leq G is parabolic if it is infinite and contains no loxodromic element. A parabolic subgroup P has a unique fixed point p \in M, which we call a parabolic point; stabilizers of parabolic points are maximal parabolic subgroups. A parabolic point p is bounded if its stabilizer acts cocompactly on M - \{p\}.

Finally, a point \xi \in M is a conical limit point if it exhibits a sort of generalized north-south dynamics (again, for the exact formulation, see e.g. Hruska), and a convergence group action is geometrically finite if every point of M is ether a conical limit point or a bounded parabolic point.

If that was too many definitions in a row: think of the case of Kleinian groups acting on their limit sets. To see that this axiomatization is a useful generalisation, we may point to e.g. a result of Tukia that shows every properly discontinuous action of a group G on a proper hyperbolic metric space induces a convergence group action on the boundary at infinity.

The theory of geometrically finite convergence group actions allows us to make another definition of relatively hyperbolic subgroups, proposed by Bowditch and worked out by Yaman: G is hyperbolic relative to a collection of subgroups \mathbb{P} if it admits a geometrically finite convergence group action on a compact metric space M, with \mathbb{P} as [a set of representatives of conjugacy classes of] maximal parabolics.

In fact, we can take M to be (i.e. M is G-equivariantly homeomorphic to) \partial_\infty X for some hyperbolic metric space X on which G acts on properly; even more specifically, we may take that X to be the Cayley graph of G with combinatorial horoballs attached over the peripheral cosets (= cosets of the peripheral subgroups.)

These combinatorial horoballs are graphs (or in some cases 2-dimensional simplicial complexes) whose natural simplicial metrics combinatorially mimic the geometry of actual horoballs in negatively-curved spaces.

In one construction, formulated by Bowditch, a combinatorial horoball over \Gamma is the graph with vertex set P \times \mathbb{Z}_{\geq 0} and the “obvious” horizontal and vertical edges. The vertical edges are all assigned length 1, whereas the horizontal edges at level P \times \{n\} are assigned length 2^{-n}. This has the effect of making the most efficient path between two points distance n apart in the same peripheral coset a horizontal path at level \sim \log n, bookended by vertical ascent to / descent from that level.

A slightly different construction is given by Groves and Manning: their combinatorial horoballs have the same vertex set and vertical edges, but the horizontal edges at level k are different: such an edge exists between any (v,k) and (w,k) whenever 0 < d_\Gamma(v,w) \leq 2^k, and all of these edges have length 1. There are also (horizontal and vertical) 2-cells attached, although these are ignored when regarding the combinatorial horoball as a metric space.

This last definition might be thought of a dynamical reformulation / abstraction / deconstruction of Gromov’s original definition.

Electrifying and fine alternatives

A different definition was proposed by Farb for finitely-generated groups, and generalised to non-f.g. groups by Bowditch and Osin: a group G is hyperbolic relative to a collection of subgroups \mathbb{P} = \{P_1, \dots, P_n\} iff the electrified Cayley graph, formed by taking a Cayley graph by adding to the Cayley graph a vertex (“cone point”) for each left coset gP_i and edges of length 1/2 from this new vertex to each element of gP_i, is Gromov-hyperbolic, and exhibits Bounded Coset Penetration (BCP).

This definition is motivated more directly by the structure of a free product acting on its Bass-Serre tree, where geodesics pass through vertices in very controlled ways.

The BCP condition aims to mimic this, in a quasi sense: in short (the full statement is rather technical), it gives control, up to bounded error, over quasi-geodesics which penetrate (pass through) cosets (cusp neighborhoods.) Given such a quasi-geodesic, call the vertex immediately preceding the cone point the entering vertex, and the one immediately following the cone point the exiting vertex. BCP then stipulates that for any two quasigeodesics \gamma, \gamma' which start and end at (essentially) the same point,

  1. if \gamma and \gamma' penetrate a coset gP, then the entering vertices of \gamma and \gamma' are bounded distance apart in the [unelectrified] Cayley graph (with bound depending only on the quasi-geodesic constants), as are the exiting vertices;
  2. if \gamma penetrates gP but not \gamma' does not, then the entering and exiting vertices of \gamma are bounded distance apart.

(In the language of cusps: if the quasi-geodesics both go through a cusp neighborhood, then they stay close near where they enter and exit; if one goes through a cusp, but not the other, then the former cannot stay in the cusp for very long.)

An abstraction of Farb’s approach was proposed and explored by Bowditch: call a graph is called fine if each of its edges is contained in finitely many cycles of length n for each n. Fine graphs capture, in graph-theoretic terms, the BCP, although their equivalence can be, not to put too fine a point on it, subtle.

Fine graphs give us the following (fifth) definition of relative hyperbolicity: G is hyperbolic relative to a collection of subgroups \mathbb{P} if it acts acts (properly discontinuously and co-compactly) on a fine Gromov-hyperbolic graph, with \mathbb{P} a set of representatives of the conjugacy classes of infinite vertex stabilizers.

Bowditch gives an explicit construction of this graph: starting with a hyperbolic space on which G acts (e.g. the Cayley graph augmented with combinatorial horoballs, as described above), and form a graph K with a vertex for each horoball (of fixed level t), and an edge between two vertices if the corresponding horoballs are \leq 2t apart.

Via relative Dehn functions

Osin gives a different (sixth) definition in terms of relative Dehn (isoperimetric) functions: G is hyperbolic relative to \mathbb{P} = \{P_1, \dots, P_n\} if it has a finite relative presentation, and the relative Dehn function of G is well-defined and linear for some (and hence every) finite relative presentation.

Here a relative presentation is a set S which together with the peripheral subgroups generates G and a set of “relators” whose normal closure K is the kernel of F(S) * (* \mathcal{P}) \to G, and a relative Dehn function is a Dehn function for the relative presentation, with conjugating elements for the relators taken from (for a less terse / cryptic definition, again see e.g. Hruska, or Osin.)

The motivating model geometry (according to Hruska—I’m not sure I see it at the moment) is apparently still essentially that of a free product acting on its Bass-Serre tree.

Basically a tree grading

Considering what geodesics in a relatively hyperbolic group can look like—they essentially run from cusp (peripheral coset) to cusp (peripheral coset) along more-or-less hyperbolic geodesics (see below)—motivates (or, actually, yields, after some proof) a different (seventh!) definition, given by Druțu-Sapir: a group G is hyperbolic relative to \mathbb{P} = \{P_1, \dots, P_n\} if all of its asymptotic cones are tree-graded, with the pieces being left cosets of the \mathbb{P}.

Or, less precisely but without using the words “asymptotic cone”: relatively hyperbolic groups look coarsely like tree-graded spaces, with the peripheral cosets being the pieces.

One may note the similarities between this picture and the geometry of a CAT(0) space with isolated flats, and indeed these were an important motivation for Druțu-Sapir.

Equivalence of notions

All of the above definitions are equivalent for finitely-generated groups, and almost all of them—except the tree-graded one, whose definition requires finite generation—are equivalent for countable groups.

Nevertheless, as noted above, they have disparate motivating origins, which hints at the possible richness of the theory and of techniques which can be applied to study relatively hyperbolic groups.


As pointed above, fundamental groups of punctured hyperbolic surfaces are prime motivating examples of relatively hyperbolic groups. These are hyperbolic relative to their cusp groups (which are all infinite cyclic groups.)

Free products, relative to their free factors, are another prime motivating example. Indeed Gromov originally described his formulation of relative hyperbolicity, in his landmark paper on hyperbolic groups, as “a hyperbolic version of small cancellation theory over free products by adopting geometric language of manifolds with cusps”.

CAT(0) groups acting on spaces with isolated flats are hyperbolic relative to the stabilizers of maximal flats are a third important class of examples, as noted above in the description of the Druțu-Sapir definition based on tree-graded spaces.

Behrstock-Hagen, building on work on Hruska-Kleiner, have a criterion, in terms of the simplicial boundary, for when cubulated groups are hyperbolic relative to specified families of subgroups. In particular, not all CAT(0) groups, or even all cubulated groups, can be relatively hyperbolic—e.g. right-angled Artin groups (RAAGs) are not.

The non-examples are just as important as the examples, in terms of pointing out what the theory is good for and what its limitations are. For instance, higher-rank free abelian groups (e.g. \mathbb{Z}^2) are not hyperbolic relative to any finite collection of their subgroups (e.g. \{\mathbb{Z}\})—the electrified Cayley graph here is hyperbolic, but not fine, i.e. the BCP is not satisfied. Indeed, in some sense, there are “too many bad regions” which are “not sufficiently separated”, and so the theory of relative hyperbolicity does not help here.

Mapping class groups are not hyperbolic relative to any collection of subgroups, by a result of Behrstock–Druțu–Mosher, except in sporadic cases where they are virtually free; the same authors used similar arguments to show that many other classes of groups, including outer automorphism groups of free groups, lattices in higher-rank Lie groups, and fundamental groups of graph manifolds, are not hyperbolic relative to any collection of their subgroups. The key notion in their arguments is that of thickness, which appears to capture, intuitively, the notion of flats or cusps clustering or interacting in a way which conspires against the slight weakening of negative curvature implied by relative hyperbolicity.

A different notion of weakened hyperbolicity, hierarchical hyperbolicity, inspired by the structure of the mapping class group as described by Masur-Minsky, does apply to many (though not all) of these examples: mapping class groups and RAAGs (indeed, all cubulated groups) are hierarchically hyperbolic, for instance.


Quite a few of these were formulated as “equivalent definitions” above, in particular,

Which makes one wonder a little—where is the line, if there is any, between “definition” and “property”?

Relative hyperbolicity and hyperbolicity

A group which is relatively hyperbolic to a collection of peripheral subgroups, each of which is (word-)hyperbolic, is itself hyperbolic.

Conversely, hyperbolic groups are hyperbolic relative to collections of quasiconvex malnormal subgroups with bounded coset intersection—e.g. the fundamental group of a once-punctured torus (which is \cong F_2 = \langle a, b \rangle) is hyperbolic, but also hyperbolic relative to the cusp group \langle [a,b] \rangle.

Broadly speaking, it seems possible to push through arguments that lead to analogues of properties of hyperbolic groups in many cases, but we don’t seem to have as many nice general results as we do in the hyperbolic case …

Quasiconvex subgroups and distortion

As an illustration of this broad principle: Hruska, in his paper linked to above, defined a notion of relatively quasiconvex subgroups of relatively hyperbolic groups, showed that these are also relatively hyperbolic, that the intersection of relatively quasiconvex subgroups is relatively quasiconvex, and that every undistorted subgroup of a finitely-generated relatively hyperbolic group is relatively quasiconvex.


What do geodesics here look like? The Druțu-Sapir definition provides some answers to this question: relatively hyperbolic groups are coarsely tree-graded, and so we can expect properties of geodesics in tree-graded spaces to hold coarsely for (Cayley graphs of) relatively hyperbolic groups, as described by Alex Sisto on his blog.

For instance: geodesics in tree-graded spaces are essentially unique, modulo what they do within the pieces; in a relatively hyperbolic group, this remains true (roughly speaking), up to some bounded error in where the geodesic enters and exits the pieces.

A more precise way of formulating this is in terms of projections \pi_P to each of the pieces P: given any point x in our space, \pi_P(x) is the unique point in P which every geodesic from x to P must go through. With these projections defined, we can say even more. In a tree-graded space, if \pi_P(x) \neq \pi_P(y), then any geodesic between two points must go through P. The appropriately coarsified version of this is true for relatively hyperbolic groups:

If the images of two points x and y under projection to a peripheral coset P are at least C apart (where C is some constant that depends only on the group and choice of peripheral subgroups), then any geodesic between x and goes from to near \pi_P(x), tracks P to \pi_P(y), and then goes to y from there. The geodesic may track several peripheral subsets, in turn, this way, going between them in an essentially unique (hyperbolic) way; additional structural results, again analogous to results for tree-graded spaces, tell us more about the order in which these peripheral subsets appear, and so on.

Geodesic flows?

Gromov defined—and Champetier clarified his definition of—a geodesic flow on any word-hyperbolic group. Mineyev gave a more general construction, using a homological bicombing—roughly speaking, a homology analogue of a map which associates to each pair of group elements g, h a geodesic (segment) between them, extended also in a sensible way to the Gromov boundary. Mineyev’s construction, in fact, more generally produces a flow on any hyperbolic simplicial complex, so it may be possible to apply it more or less directly to obtain some sort of geodesic flow on a relatively hyperbolic group, or at any rate on its cusped space.

Groves-Manning, using their combinatorial horoballs (they call the result of attaching these to the Cayley graph “the cusped space”),  construct an analogue of Mineyev’s homological bicombing for relatively hyperbolic groups. It may be possible to use this to obtain a geodesic flow on a relatively hyperbolic group, analogous to Mineyev’s original construction.

Both of these may in fact be possible, and one of them may have better / more natural properties—likely the latter (?): the former seems more naturally a flow on the cusped space, and may or may not descend in a reasonable way to the group / original Cayley graph.

(The existing literature does not seem to have anything explicitly / directly addressing either of these possibilities.)


Groups and graphs (and computation)

Graphs in group theory

Given a group G, or more specifically a finite set S of generators for G, one can form the Cayley graph \mathrm{Cay}(G,S) by taking the vertex set to be the set of group elements, and adding an edge vw iff v = ws for some s \in S.

One can then attempt to study algebraic properties of the group by looking at properties of (one of) its Cayley graph(s). In the case of an infinite group G, one can show that any two Cayley graphs, i.e. the Cayley graphs associated to any two sets of generators, have the same coarse geometry (in the sense of quasi-isometry—see the post on geometric group theory), so that the coarse geometric properties of any Cayley graph can be considered to be intrinsic geometric properties of the group, rather than properties of any particular presentation thereof. Indeed, this is the key insight that drives geometric group theory.

One particularly striking example where such geometric methods seem to yield great insight with much less pain than directly algebraic ones is in the case of free groups and their automorphism groups, which may be thought of geometrically as fundamental groups of bouquets of circles. This is equivalent to the Cayley graph viewpoint: the Cayley graph is the universal cover of the bouquet of circles. Automorphisms of the free group then correspond to homotopy equivalences of the bouquet of circles, and arguments about the automorphism group can be conducted in terms of these homotopy equivalences, leading to such results as classifications and characterizations of these automorphisms analogous to the classification of mapping classes of surfaces.

The same arguments apply to finite groups, although in this case the graphs are finite, and so quasi-equivalent to trivial graphs (i.e. the graph with one vertex and no edges), and geometric group theory has nothing more to say. From this point of view, finite groups are, in a very precise sense, virtually trivial. I do wonder though: can the theory of finite graphs, itself a rather rich area of combinatorics, be fruitfully used to say / argue interesting things about finite groups?

Graphs also appear in group theory in the notion of graphs of groups which is fundamental to the Bass-Serre theory of groups acting on simplicial trees. Here the graph acts essentially as a book-keeping mechanism to record the structure of the group as built up from vertex and edge stabilisers; these graphs have also been described as “one-dimensional analogues of orbifolds”. Group actions on trees appear in a variety of settings, for fundamental reasons I do not yet fully understand, and so this notion takes on some importance.

Groups in graph theory

Conversely, group theory can play a role in elucidating the structure theory of graphs. For instance, group theory played a significant role in Laci Babai’s recent breakthrough on graph isomorphism. Very roughly speaking, Babai’s algorithm relies on the heuristic that any isomorphism between graphs has to preserve degree. Thus, given a highly irregular graph, a brute-force search could actually be (sufficiently) efficient, since the number of possible isomorphisms is small relative to the number of vertices.

The problem, then, is if we have highly regular graphs. One way around this is to consider an equitable partition; but one may not always have one of these either. However, Babai proved that the only obstacle to the existence of such a partition is the existence of a large Johnson graph, and then used subtle structural results, of an essentially algebraic nature, on Johnson groups and how they sit inside larger (finite) permutation groups to obtain, constructively, either an isomorphism, or a certificate of non-isomorphism (a provably empty automorphism group.)

In the language of Babai’s preprint: “We shall certify both the presence and the absence of local symmetry. … If we don’t find local symmetry, we infer global irregularity; if we do find local symmetry, we either build it up to global symmetry (find automorphisms) or use any obstacles to this attempt to infer local irregularity which we then synthesize to global irregularity. Building up symmetry means approximating the automorphism group from below; when we discover irregularity,we turn that into an effective upper bound on the automorphism group . When the lower and the upper bounds on the automorphism group meet, we have found the automorphism group and solved the isomorphism problem.”

Here the group theory plays a key role in introducing more rigidity / structure to an otherwise fairly flexible and fluid structure. The structural theory of finite groups is used in a fairly strong way: indeed, the proof relies on the classification of finite simple groups, although the strict logical dependence is only on one of its corollaries (Schreier’s statement that the outer automorphism group of a finite simple group is solvable.)

A concrete application to coding theory

Group theory and graph theory both come into play in e.g. the construction and study of codes from Cayley graphs and Schreier graphs: the algebraic structure introduced by the group-theoretic construction of these graphs is very helpful in proving key desired properties of these codes.

Questions, Reading Notes

Cryptosystems using free group automorphisms

Moldenhauer has proposed proposed cryptographic protocols which make use of \mathrm{Aut}(F_n). Here we describe them, for the fun of it, and because a potential closer analysis of their security suggests potentially interesting problems (or perhaps exercises—I don’t know enough / haven’t spent enough time to be able to judge) regarding automorphisms of free groups.

One-time pads of free group automorphisms

Choose a free group F_q, a key-space (\phi_n)_{n=1}^N consisting of some large number N (say, 2^{128}) of automorphisms of F_q, and a linear congruence generator h: \mathbb{Z}/N \to \mathbb{Z}/N with maximal period length.

To communicate securely, Alice and Bob privately agree on a free subgroup F_U with rank equal to the alphabet size, a (minimal Nielsen-reduced freely-reduced) free generating set U, and a starting seed for the linear congruence generator.

To securely send a message m = m_1 \cdots m_z, Alice generates an equally-long string of congruences k_1 \cdots k_z using h, and sends the ciphertext \phi_{k_1}(m_1) \cdots \phi_{k_z}(m_z) as an unreduced word in F_q, where we implicitly identify letters in the alphabet with corresponding generators of F_U from S. She also sends z.

To decrypt the ciphertext, Bob calculates k_1 \cdots k_z using h and then \phi_{k_r}(u) for all 1 \leq r \leq z and u \in U, and uses this to chunk the ciphertext into words corresponding to individual letters, which are then decrypted using the corresponding \phi_{k_r}^{-1}.

This effectively uses the ginormous size of \mathrm{Aut}(F_n) and its highly noncommutative nature (or, more precisely, the difficulty of guessing \phi_{k_1}, \dots, \phi_{k_z} given h and \phi_{k_1}(m_1) \cdots \phi_{k_z}(m_z), but not k_1) for cryptographic security, although the protocol itself (a one-time pad) is not terribly sophisticated.

Vague hyperbolic language aside, the security of this system really rests on how the automorphisms \phi_i are randomly chosen. Moldenhauer proposes a system which uses binary strings, of varying lengths (details unspecified here) and maps these to Whitehead automorphisms in some systematically random way.

Question: Is the proposed method of  generating the automorphisms secure, or is it vulnerable to heuristic or other attacks?

Public-key encryption with free group automorphisms

Moldenhauer also suggests a public-key protocol: here the public parameters consist of the free group F_q, a freely reduced word w \neq 1 in the free group, and an automorphism f: F \to F of infinite order.

Alice and Bob pick private keys a, b \in \mathbb{Z}_{>0}, and publish as their public keys f^a(w), f^b(w).

To securely send a message m to Bob, Alice computes the freely-reduced elements m \cdot f^a(f^b(w)) =: c and send this to Bob as the ciphertext.

Bob decrypts the message by computes c_1 \cdot (f^b(f^a(w)))^{-1} = m \cdot f^{a+b}(w) f^{-(b+a)}(c) = m.

Security here is based on the difficulty of determining how much cancellation happens. Presumably, though, not all of the plaintext is cancelled most of the time, so there will likely be some substantial chunk of plaintext just hanging out in front, which seems … not optimal.

Slightly more secure might be to have Alice send f^a(m), have Bob not immediately decrypt, but instead compute f^{b+a}(m) and send this back to Alice, and then have Alice compute f^{-a}(f^{b+a}(m)) = f^b(m). Now Bob decrypts the message by computing m =  f^{-b}(f^b(m)).

We can modify this further for use as a signature / authentication protocol: as a signature to Bob, Alice sends f^a(f^b(w)) = f^{a+b}(w) to Bob; Bob computes f^{-b}(f^{a+b}(w)) and verifies that this matches with Alice’s public key f^a(w).

The security of these modified protocols are based on the difficulty of efficiently of deducing and b or m, given only f and the images f^a(m), f^b(m) and f^{a+b}(m); i.e. the difficulty of the (analogue of the) discrete logarithm in generic infinite cyclic subgroups of the free group.

Questions: Do generic infinite-order automorphisms of the free group actually have good properties from the standpoint of these cryptographic applications? How much cancellation can we typically expect in freely reducing m \cdot f^{a+b}(w)? Moldenhauer’s arXiv preprint doesn’t appear to address these questions, although her thesis might.

Also: if Eve observes a lot of the Alice’s comunications secured using this last protocol, she will see a lot of pairs (x, f^a(x)), since f^{a+b}(m) = f^a(f^b(m)), and both f^{a+b}(m) and f^b(m) are exchanged in the protocol. How many of these would Eve need to collect before she can try to effectively reconstruct what f^a is? How much will the cancellation / free reduction created by concatenating signatures (or other material) to the end of messages alleviate this?

Examples, Snippets

A virtually free RACG

(Thanks to Harry Richman for bringing this example up.)

The free group \langle a, b \rangle and the right-angled Coxeter group \rangle a, b, c, d | a^2, b^2, c^2, d^2 \rangle have the same Cayley graph—the infinite 4-valent tree—; hence they are quasi-isometric. We note here that quasi-isometries are not required to group homomorphisms, and indeed the one we have here—an “identity map” of sorts (in an admittedly very loose and imprecise sense) between Cayley graphs—is not.

By results of Stallings and Dunwoody involving ends of groups ad accessibility (although Chapter 20 of Drutu-Kapovich’s draft manuscript seems to be the only reference I can find for this online), free groups are quasi-isometrically rigid, and hence, since finite-index subgroups of free groups are free, the RACG is virtually free (!)

With a little more thought we can explicitly exhibit a finite-index free subgroup: the subgroup of the RACG generated by ab, ac and ad is a nonabelian free group on three generators; we can verify it has finite index via a covering space argument.

Now I wonder if there is a finite-index F_2-subgroup of the RACG as well, so that the commensurability need not involve passing to finite-index subgroups on both sides. It seems like there shouldn’t be, although I can’t quite prove this yet.



Geometric group theory (II)

Groups from geometry

Besides groups which have geometry, there are also groups with various geometric origins, and these make up another broad vein of the federated entity that is geometric group theory. Often these geometric origins allow us to give the groups themselves a certain geometry; but they also give us directly geometric tools for studying these groups.

The first examples we might think of include surface groups and hyperbolic 3-manifold groups, and more generally fundamental groups of geometric objects. And then we start thinking more …

Mapping class groups

The mapping class group Mod(S) of a surface S of finite type is the group of all homeomorphisms from the surface to itself, mod homotopy. These homeomorphisms are required to fix the boundary pointwise, and preserve the set of punctures setwise (i.e. mapping classes may permute punctures.)

The notation originates with Fricke, who called the “automorphic modular group”, by analogy with the classical modular group \mathrm{SL}_2\mathbb{Z} which indeed is the mapping class group of the torus.

Mapping classes represent essential symmetries of the topological surface. Mapping class groups can be computed, in general, using the Alexander method—see e.g. Section 2.3 of Farb and Margalit’s Primer.

A large class of easily visualizable mapping classes is given by Dehn twists. There are many mapping classes which are not Dehn twists, but it turns out that mapping class groups are generated by a finite number of (judiciously-chosen) Dehn twists.

One proof of this proceeds by looking at the action of the mapping class group on a (slightly-modified) curve complex. The mapping class group acts on Teichmüller space by changing the marking—the mapping class \varphi sends the point [(S,h)] to [(S, h \circ \varphi)]—and also on the curve complex and its relatives, by extending the obvious action on the 1-skeleton (curves go to curves, and we can check that adjacency relations in the curve complex are preserved.)

Looking at these two group actions also gives us proofs that mapping class groups are finitely-presented, residually finite, satisfy a subgroup alternative, and so on.

Note that these two actions complement each other, in the sense that Teichmüller space is compact(ifiable) but not (coarsely) hyperbolic, whereas the curve complex is not locally-compact but is coarsely hyperbolic. Playing these off against each other is key in one proof of the subgroup alternative, for instance.

Outer automorphism groups of free groups

Given any group \Gamma in general, the outer automorphism group \mathrm{Out}(\Gamma) consists of all the automorphisms of \Gamma, modulo inner automorphisms.

Outer automorphism groups \mathrm{Out}(F_n) of (finitely-generated, non-abelian) free groups F_n have a particularly geometric flavour, since they may be thought of in terms of their action on a graph with fundamental group F_n, i.e. (or e.g.) a bouquet of n circles.

This leads us to the construction of Culler-Vogtmann outer space CV_n, a space of marked metric graphs, whose points are specified by pairs [(X, h)] where X is a metric graph with fundamental group (isomorphic to) F_n and h: F_n \to \pi_1(X) is an isomorphism of fundamental groups, and the square braces indicate that we are quotienting out by homotopy equivalences.

This appears very analogous to the definition of Teichmüller space, and indeed there is an analogy between Teichmüller theory and the theory of \mathrm{Out}(F_n) which, although somewhat informal and imperfect, has driven and illuminated much of the development of the latter.

Similar perspectives may be brought to bear on the study of \mathrm{Out}(\Gamma) wherever \Gamma is naturally the fundamental group of some nice class of geometric or topological objects, e.g. if \Gamma is a RAAG (which is the fundamental group of its associated Salvetti complexes.)

Coxeter groups and Artin groups

(Most of what appears in this section was taken from either Ruth Charney’s survey notes, or these notes of Luis Paris.)

Coxeter groups are groups which generalize discrete reflection groups. The finite Coxeter groups are precisely the finite Euclidean reflection groups, and indeed any Coxeter group may be realized as a discrete reflection group

Coxeter groups have presentations of the form \left\langle s_1, \dots, s_n | s_i = s_i^{-1}, s_i s_j s_i \cdots = s_j s_i s_j \cdots \right\rangle, where relations of the second sort equal numbers of generators (alternating between s_i and s_j) on both sides. The information needed to write this representation (or a representation for the corresponding Artin group, see below) may be encoded in the form of a finite graph, known as Dynkin diagrams.

Artin groups are also known as generalized braid groups, a braid group being (one way to define it being) the mapping class group of a disk with finitely many punctures. Braid groups have presentations of the form \left\langle \tau_1, \dots, \tau_n | \tau_i \tau_j = \tau_j \tau_i, \tau_i \tau_{i+1} \tau_i = \tau_{i+1} \tau_i \tau_{i+1} \right\rangle. More generalized Artin groups are given by presentations of the form \left\langle s_1, \dots, s_n | s_i s_j s_i \cdots = s_j s_i s_j \cdots \right\rangle, where the relations are of the same form as those which appear above in presentations for Coxeter groups.

We note that these presentations are very similar—the only difference being the involutive relations for the generators in a Coxeter group presentation—, and indeed to every Coxeter group there is an associated Artin group (and vice versa) obtained by removing (or adding, resp.) those involutive relations.

Proceeding along this line of thought leads to a geometric interpretation of Artin groups: whereas a Coxeter group is a group of reflections in some finite hyperplane arrangement, the corresponding Artin group A is the fundamental group of the corresponding complexified hyperplane arrangement Y_W modulo the action of the Coxeter group. Moreover, at least for W finite, Y_W / W is aspherical, so it is in fact a K(pi, 1); the same is conjectured to be true for more general W,

Artin groups associated to finite Coxeter groups (called spherical Artin groups) sharmany properties of braid groups. Non-spherical Artin groups (i.e. those whose associated Coxeter groups are infinite) are much less well-understood, but are conjectured to share many of the same good algorithmic / geometric properties of their spherical cousins.

Lie groups and lattices

A different class of examples comes from remarking that fundamental groups of closed manifolds are lattices in Lie groups, and asking about more general lattices in Lie groups.

This allows us to leverage (algebraic) tools from Lie theory to use alongside our geometric tools to study our examples.

[[ future addition: examples in lower rank / rigidity and superrigidity results in higher rank ]]

Dynamical perspectives: Property (T) and amenability

A third major vein of the not-quite-a-single-subject involves using dynamics to study group actions and what they say about groups. Here (at least) two things stand out.

One is Kazhdan’s property (T): a group has (T) if any unitary representation of a group which has an almost-invariant vector has an invariant vector.  Equivalently, [[long list here]]

[[ future addition: equivalent characterizations of (T) and properties of (T) groups ]]

Kazhdan proved that lattices in Lie groups have (T) iff the ambient Lie groups have (T). Simple Lie groups of higher rank have (T), and thus for n \geq 3, \mathrm{SL}_n\mathbb{Z} has (T). His motivation for all of this was apparently to demonstrate the finite generation of these lattices (it can be shown that any discrete countable group with (T) is finitely-generated.)

It seems a bit of a crazy detour, but perhaps entirely reasonable if one is well-versed and comfortable in the language of unitary representations.

On the other hand, there is amenability: [[ future addition: (at least part of) long list of equivalent definitions ]].

Amenable groups are, from a dynamical perspective at least, similar to abelian groups—many dynamical arguments that work for abelian groups can be made to work, in some form, for amenable groups.

[[ future addition: some  properties of amenable groups ]]

Amenable groups with property (T) are compact, and in some sense amenability and (T) are opposite: amenability makes almost-invariant vectors easy to find, whereas (T) makes them hard to find.

Annotated Definitions

Proper discontinuity

(Credit where due: this is a summary things from MathOverflow 55726 and Math Stack Exchange 1082834.)

Let G be a locally-compact group acting on a locally-compact Hausdorff topological space X (these additional conditions on G and X being certainly satisfied if e.g. G is finitely-generated and X is a proper metric space.)

We say the action is properly discontinuous if for any compact set K \subset X, K \cap gK \neq \varnothing for at most finitely many g \in G. (cf. the action of \mathbb{Z}^n on \mathbb{R}^n by translations, which is discontinuous: for any [sufficiently small] compact set KK \cap gK = \varnothing for every g \in \mathbb{Z}^n; here we allow finite overlap even at arbitrarily small resolution: this allows for finite stabilizers, parabolic elements, etc.)

In particular this implies that every point x \in X has an open neighborhood U s.t. U \cap gU \neq \varnothing for at most finitely many g\ in G. (Take U to sit inside some compact set K.)

Note that the second condition is strictly weaker than the first: consider the \mathbb{Z}-action on the punctured plane given by n \cdot (x,y) = (2^n x, 2^{-n}y). This satisfies the second condition, but not the first—take K to be any closed annulus centered about the origin, for instance.

The main point of imposing these mildly technical conditions when we study “geometric” group actions is this:

This second condition is equivalent to the quotient map X \to X/G being a covering map. The first condition (the definition) serves to ensure that the map G \times X \to X \times X given by (g,x) \mapsto (x,gx) is proper, and hence that the quotient X/G is Hausdorff.


Geometric group theory (I)

Geometric group theory is, really, not so much a single coherent subfield as a somewhat disparate collection of things united by a single underlying philosophy—that the study of group actions on spaces can yield information about the groups (and also about the spaces, although that is perhaps less the remit of geometric group theory than of other subfield/s.)

This is a a quite natural way of looking at groups if we recall that historically, groups were first formulated as abstractions to capture the algebraic structures of sets of automorphisms—symmetry groups, Galois groups, matrix groups, monodromy groups, and the like.

A close cousin of this philosophy—or perhaps another way of articulating it—is the idea that groups themselves can be viewed as geometric objects (see below), and that this perspective can yield fresh insights on their algebraic and algorithmic properties.

To make these philosophies effective we need some assumptions on the group G, the space X, and the group action G \curvearrowright X. The following are by no means the only ones that can be made to work, but are the most common:

  • To get a sensible notion of geometry, we assume that  X is a geodesic metric space, i.e. X is equipped with a metric d, and between any two points x, y \in X there is some rectifiable path whose length is equal to d(x,y).
    (Note that we do not assume X is Riemannian or Finsler; in general we define the length of a path \gamma as \sup \sum d(x_i, x_{i+1}), where the sup is taken over all finite subdivisions of \gamma.)
    Sometimes this can be weakened to just assuming that X is a length space, i.e. we do not necessarily have paths which achieve d(x,y), but d(x,y) is still the infimum of lengths of rectifiable paths between x and y.
  • To obtain a notion of groups as geometric objects themselves, a word metric is usually used, and in order to define this G is assumed to be finitely-generated
  • In order for the group action to play nicely with the geometry, assumptions are usually made that are sometimes grouped under the umbrella term of “geometric action”: this almost always involves the action being by isometries (so that it preserves the geometry) and properly discontinuous (so that the quotient is still Hausdorff), and often but not always the action being co-compact.
    Below all of our group actions will be isometric and properly discontinuous.


The definition of a word metric starts with the choice of a generating set. This is a somewhat arbitrary choice, and thus the resulting metric is not quite an intrinsic object—a rather unsatisfying state of affairs. We would, after all, like to say things about the geometry of a group and try to relate those things to the algebraic and other properties of the group, rather than a specific presentation of the group.

To remedy this, we would like to declare that word metrics for the same group coming from different choices of generating sets are really equivalent. As it turns out, we can reach this identification geometrically by squinting a little, or slightly more precisely by taking coarse bi-Lipschitz equivalence, or, even more precisely, by declaring two metrics d_1, d_2 on \Gamma to be equivalent if there is a quasi-isometry f:(\Gamma, d_1) \to (\Gamma, d_2), i.e. a map satisfying \frac 1k d_1(x,y) - c \leq d_2(f(x), f(y)) \leq k d_1(x,y)+c for all x, y \in \Gamma.

One can show that quasi-isometry is an equivalence relation—the identity map is a quasi-isometry; compositions of quasi-isometries are quasi-isometries; there is a notion of a [coarse] quasi-inverse—and, then, that choosing a different generating set produces a equivalent (quasi-isometric) word metric.

The most common way of proving that two spaces are quasi-isometric is to produce an explicit quasi-isometry.

Using the orbit map as the explicit quasi-isometry (this takes some argument, but most of it can be fruitfully sketched in a diagram) leads to the Milnor-Švarc lemma, which states that any group \Gamma acting co-compactly on a geodesic metric space X, is quasi-isometric to (in the sense that some—and hence any—Cayley graph of \Gamma with the associated word metric is quasi-isometric to X.)

As a corollary, we may deduce that any two groups which act cocompactly on the same geodesic metric space are quasi-isometric to each other.

Quasi-isometric rigidity

Two relatively simple applications of the Milnor-Švarc lemma yield that finite subgroups and quotients of finitely-generated groups G are always quasi-isometric to G itself: for the former case we use the induced action of the subgroup on G; in the latter we case we consider the action of G on the quotient group. Thus commensurable finitely-generated groups are always quasi-isometric.

(This is also why geometric group theory is not the tool of choice for finite group theory—from this geometric point of view, finite groups are virtually trivial.)

Quasi-isometric groups are not always commensurable, but to produce (and prove) examples of this actually takes some work. Large families of familiar groups exhibit quasi-isometric rigidity, a phenomenon which may be loosely described as “quasi-isometry implies commensurability”, or sometimes just slightly more carefully as “quasi-isometry implies commensurability to another group in the same class.”

For instance, all hyperbolic surface groups—i.e. fundamental groups of closed oriented surfaces of genus at least 2—are quasi-isometric by Milnor-Švarc; since given any two such surfaces we can find a surface covering both surfaces with finite covering degree, they are also all (abstractly) commensurable.

Similarly, free groups are quasi-isometrically rigid, as are free abelian groups, nilpotent groups (although the most straightforward proof of this seems to use a beautiful but huge hammer), and so on.

One relatively easily-described example of pairs of quasi-isometric groups which are not abstractly commensurable to each other comes from fundamental groups  \pi_1(T^1\Sigma) of unit tangent bundles of hyperbolic surfaces \Sigma, which are uniform lattices in \widetilde{\mathrm{PSL}_2\mathbb{R}}. These are quasi-isometric to central extensions \pi_1(\Sigma) \times \mathbb{Z}, but by the structure of \widetilde{\mathrm{PSL}_2\mathbb{R}} are not commensurable to these central extensions.

A broader class of examples comes from topological, geometric, dynamical or algebraic properties not preserved by quasi-isometry, such as solvability, residual finiteness, or property (T).

Quasi-isometric invariants

This last paragraph leads naturally to the question of what properties or invariants of a group, if any, are invariant under quasi-isometry.

These invariants are sought not just in order to produce examples of quasi-isometric rigidity, but also for the allied but broader reason that they point to the limits of the tools of geometric group theory. This is a toolkit which, in general, will only distinguish things up to quasi-isometry, and isn’t of much use in trying to tell apart two groups which are quasi-isometric to each other (see, e.g., the earlier comment about finite groups.)

There is a long list of such invariants and properties: they include both large-scale / coarse geometrical, topological and dynamical properties such as word-hyperbolicity (see below), the ends of a group, and amenability, as well as more quantitative features such as growth, isoperimetric, and divergence functions, and cohomological dimension.

Groups with geometry: the role of nonpositive curvature

Groups—or, technically, quasi-isometric equivalence classes of Cayley graphs with word metrics—may considered as geometric objects in their own right, an idea first systematically articulated by Gromov.

Here cocompact group actions play an important role, by allowing us to invoke Milnor-Švarc to go between any other space our group may naturally act on (e.g. hyperbolic space, or a metric tree) and its the Cayley graph.

The richest results have been obtained where we have some notion of nonpositive, or even better, negative, curvature.

Hyperbolic groups

The prototypical case is that of Gromov-hyperbolic (or word hyperbolic, or \delta-hyperbolic, or sometimes just hyperbolic) groups.

A geodesic metric space X is defined to be \delta-hyperbolic if geodesic triangles in are \delta-slim, i.e. any point on any side of the triangle is within \delta of the (union of the) other two sides. A finitely-generated group G is \delta-hyperbolic if it acts cocompactly on some \delta-hyperbolic space. We may verify that hyperbolicity—though not the hyperbolicity constant—is a quasi-isometry invariant, and then, by Milnor-Švarc, this is equivalent to any Cayley graph for G being Gromov-hyperbolic.

This simple metric geometry notion somehow captures the coarse essence of negative curvature, as we shall see briefly below.

A key property of \delta-hyperbolic spaces is quasigeodesic stability, sometimes referred to as the Morse Lemma: any quasigeodesic segment (the image of a geodesic segment under a quasi-isometry) is uniformly close, with constants depending only on the quasi-isometry (k,c) and the hyperbolicity constant \delta, to an actual geodesic with the same endpoints, and vice versa.

Algorithmic and finiteness properties

Using the \delta-hyperbolicity condition, the triangle inequality, and quasigeodesic stability, we may show that k-local geodesics (i.e. rectifiable paths whose restrictions to segments of length at most k are geodesic) are global geodesics for any k > 8\delta.

This in turn gives rise to Dehn presentations for \delta-hyperbolic groups, which allow us to solve the word problem for these groups in “linear” time (or more precisely, in linearly many steps in the length of the word, although each step may itself take non-constant polynomial time.)

Hyperbolic groups also have good algorithmic properties beyond having solvable word problem: for instance, it may be shown that conjugate elements in a hyperbolic group are either representable by short words, or are conjugate by short words, and hence hyperbolic groups have solvable conjugacy problem; they are also geodesically biautomatic.

Somewhat surprisingly, the existence of Dehn presentations may be used to characterize hyperbolic groups—the reverse implication proceeds via the linear isoperimetric inequality (LIP) condition.

Dehn presentations can also be used to show that hyperbolic groups have finitely many conjugacy classes of finite-order elements; indeed, using the existence of coarse barycenters, we may show that hyperbolic groups have finitely many conjugacy classes of finite-order subgroups.

Negative curvature, coarsified

Besides coarse barycenters,  \delta_hyperbolic spaces admit appropriately coarsified versions of many negative-curvature features such as coarse projections, coarsely well-defined midpoints, and so on.

Quasigeodesic stability also allows us to construct boundaries for \delta-hyperbolic spaces and groups, which are in the first instance sets of directions at infinity. These sets may be naturally endowed with topologies; the resulting contraptions are known as Gromov boundaries, and are analogous to / generalisations of the boundary sphere of hyperbolic space.

They can be used to provide analogous proofs for results as the Tits alternative for subgroups of \delta-hyperbolic groups (cf. the Tits alternative for subgroups of \mathrm{SL}_2\mathbb{R}): isometries of \delta-hyperbolic spaces can be classified as elliptic, parabolic, or hyperbolic, in terms of translation lengths; the presence of two hyperbolic elements with disjoint fixed point sets allows us to play ping-pong; otherwise we may argue to obtain a virtually nilpotent group.

CAT(0) groups

If Gromov hyperbolicity is the appropriate notion of coarse negative curvature, it is less clear what makes for a good notion of coarse nonpositive curvature. One approach—which indeed applies more generally in the form of CAT(k) geometry—involves comparing geodesic triangles to geodesic triangles in corresponding model spaces (Riemannian manifolds) of constant sectional curvature. Specifically, CAT(0) spaces are spaces in which geodesic triangles are no fatter than corresponding comparison triangles in Euclidean space, and a CAT(0) group is one which acts co-compactly on some CAT(0) space.

Comparison geometry has a long and fine tradition, and CAT(0) spaces share many features of nonpositively curved spaces. Unfortunately, unlike hyperbolicity, the CAT(0) condition is not a quasi-isometry invariant, and in this sense being CAT(0) is rather less intrinsic of a property for a group.

CAT(0) spaces do still have nice properties: they are uniquely geodesic and contractible, for instance. CAT(0) groups have solvable word problem, and there is a fair amount we can say about isometries of CAT(0) spaces, aided by tools such as nearest-point retraction onto convex subspaces and well-defined barycenters. (That we do not see the word “coarse” here is one indication that this is in some ways a slightly different flavour of geometry compared to the Gromov-hyperbolic kind.) Analogous to the classification of isometries for hyperbolic spaces, we may classify isometries of CAT(0) spaces as elliptic, hyperbolic, or parabolic, depending on translation length (and whether it is achieved.)

We may define visual boundaries and ideal boundaries (also known as Tits boundaries) for CAT(0) groups in much the same way as we defined  for hyperbolic groups. However, here they are no longer quasi-isometry invariants; the homeomorphism type of the boundary we end up with in general could depend on the particular presentation we start with. This is not at all ideal, and there are any number of ways to remedy the situation; roughly speaking, the divergence between the various boundaries increases with how much flatness / zero curvature we see in our CAT(0) space.

In general, the pathologies that appear in CAT(0) geometry as compared to hyperbolic geometry are attributable to flat subspaces: quasi-flats—i.e. the images of flat subspaces under quasi-isometries—can be rather wild (here’s an exercise which illustrates one aspect of this: [singly-infinite, unbounded] logarithmic spirals are quasi-geodesics in the Euclidean plane.)

Indeed, the Flat Plane Theorem states that any proper co-compact CAT(0) space X (“co-compact” here meaning “admitting a co-compact geometric action by some group G“) is \delta-hyperbolic iff it does not contain a flat plane.

There is, however a fair amount we can say about flat subspaces in CAT(0) spaces and how they are reflected in the structure of CAT(0) groups. Starting from the Alexandrov Lemma (in some sense, a particular combination of the CAT(0) inequality, the triangle inequality and the law of cosines), we may use angle conditions to find flat triangles, flat quadrilaterals, and flat strips with product decompositions. Using the product decomposition theorem and other properties of hyperbolic isometries, we may establish the Flat Torus Theorem, which states (roughly speaking, in one formulation) that if a free abelian group of rank n acts co-compactly on a CAT(0) space X, then we can find a flat n-torus in the quotient of X by that action.

Cubulated groups

What if our groups acted not just on any old CAT(0) space, but on very particular CAT(0) spaces, which had tractable combinatorial structures on them? What if, even better, these particular CAT(0) spaces could be obtained in some natural way?

Affirmative answers to these questions motivate the study of cubulated groups—groups which act cocompactly on CAT(0) cube complexes, which are somewhat analogous to simplicial complexes, but with geometry: they are formed by gluing Euclidean cubes along faces by isometries.

Not all CAT(0) groups are cubulated, but constructing an explicit example is a non-trivial exercise (here’s one.)

Cube complexes have hyperplanes, which are (intuition shamelessly stolen from Teddy Einstein) roughly analogous to geodesics; the purely combinatorial structure of the hyperplanes can be used to gain additional insight into the structure of cubulated groups.

Special cube complexes, RAAGs, and Agol’s theorem

The analogy between hyperplanes and geodesics is closer in the case of special cube complexes, which are cube complexes whose hyperplanes do not exhibit any of the four hyperplane pathologies.

Equivalently, though less transparently, special cube complexes are those whose fundamental groups embed in a very special class of groups, the right-angled Artin groups, or RAAGs (these are Artin groups—see below—with the additional stipulation that all relations are commutators.)

In 2012, Ian Agol proved, building on work of Wise and Groves-Manning, that every cubulated hyperbolic 3-manifold group is special, or in other words embeds in a RAAG. In combination with Kahn-Markovic’s resolution of the surface subgroup conjecture and the Haglund-Wise-Sageev construction for cubulating hyperbolic 3-manifold groups, this proved the Virtually Haken Conjecture, and with further work of Agol and Wise involving special cube complex fundamental groups, the Virtually Fibered Conjecture. That proof, in its entirety, involved an extraordinary sweep of ideas, and in part demonstrates the mathematical value of geometric group theory ideas in general and cubulation in particular.

Relative and hierarchical hyperbolicity

There are various other related notions of “broken”, “damaged” or “restricted” negative curvature (none of these being technical terms here) floating around, inspired by various particular examples.

For instance, relative hyperbolicity generalises fundamental group of hyperbolic manifolds with geodesic boundary—these spaces are, roughly speaking, hyperbolic away from certain bad regions—, and hierarchical hyperbolicity is inspired by the refinement of coarse negative curvature seen in the mapping class group following Masur-Minsky theory, where regions of the space may be reasonably modeled by projections to families (“hierarchies”) of Gromov-hyperbolic spaces.