Articles, Overview / Outlines

Knot theory

Knots (links) are embeddings of the circle (some finite number of circles) into the 3-sphere, modulo ambient isotopy. Knot theory is thus the study of the essentially different ways we can embed a circle into a 3-sphere—a kind of deformation theory for such maps, if you will. From this perspective, it is perhaps not too surprising that knot theory has developed connections to many areas of pure and applied mathematics.

For example, knot complements and Dehn surgery are important sources of examples for 3-manifolds. Knot complements are intimately related to knots themselves, by the Gordon-Luecke theorem; Dehn surgery not only has the virtue of being explicitly computable, it is also a fairly generic source of 3-manifolds, by the Lickorish-Wallace theorem.

Or, for example, knot theory shows up in physics in various ways. Indeed, knot theory and physics have been tangled together from fairly early on: most prominently, Lord Kelvin’s theory of “vortex atoms” posited that atoms can be modeled as knots of aether. While this turned out to be wrong from a physical standpoint, it did inspire Tait to start classifying knots and spur the mathematical development of knot theory; and it may yet find a spiritual successor in the “anyons” of topological quantum computing.

Yet another early connection can be seen in the linking number, one of the first knot (or link) invariants, which Gauss came up with in his study of a question from electrodynamics: how much work is done on a magnetic pole moving along a closed loop in the presence of a loop of current? Using the Ampère and Biot–Savart laws, he found the answer could be described in terms of how many times one of these loops winds around the other, or in other words the linking number of the two loops (see e.g. pp. 2-4 here.)
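To make this concrete, here is a rough numerical sketch (my own illustration, not from the original text) of the Gauss linking integral, evaluated for an explicitly parametrized Hopf link; the parametrizations and discretization size are arbitrary choices:

```python
import numpy as np

# Gauss linking integral, discretized by the midpoint rule:
#   Lk = (1/4π) ∮∮ (r1 - r2) · (dr1 × dr2) / |r1 - r2|³
# Hopf link: circle 1 is the unit circle in the xy-plane; circle 2 is the
# unit circle in the xz-plane centred at (1, 0, 0). They link exactly once.
N = 400
t = 2 * np.pi * (np.arange(N) + 0.5) / N
dt = 2 * np.pi / N
r1 = np.stack([np.cos(t), np.sin(t), np.zeros(N)], axis=1)
dr1 = np.stack([-np.sin(t), np.cos(t), np.zeros(N)], axis=1) * dt
r2 = np.stack([1 + np.cos(t), np.zeros(N), np.sin(t)], axis=1)
dr2 = np.stack([-np.sin(t), np.zeros(N), np.cos(t)], axis=1) * dt

diff = r1[:, None, :] - r2[None, :, :]          # r1_i - r2_j, shape (N, N, 3)
cross = np.cross(dr1[:, None, :], dr2[None, :, :])
integrand = (diff * cross).sum(axis=2) / np.linalg.norm(diff, axis=2) ** 3
lk = integrand.sum() / (4 * np.pi)
print(round(abs(lk), 6))  # ≈ 1.0: the linking number, up to orientation
```

The double sum is just the midpoint-rule discretization of the double contour integral; since the integrand is smooth and periodic and the two curves stay at distance at least 1 apart, the discretization converges very quickly to an integer.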

Knot invariants

More generally, knot (or link) invariants are mathematical objects—numbers, polynomials, homology groups, whatever have you—attached to knots (or links—I’ll stop writing this, but you should imagine it inserted after each instance of “knot” below.) Equivalent (i.e. ambient-isotopic) knots should be assigned the same value—hence the name “invariant”—although certain pairs of non-equivalent knots may also be assigned equal values.

These are useful, primarily, for distinguishing knots—for providing a verifiable certificate that two knots are indeed inequivalent, as it were.

It is straightforward enough, in principle, to prove that two knots are equivalent: one simply exhibits an ambient isotopy taking one to the other. Indeed, this ambient isotopy can always be described within an explicitly-described standard combinatorial model—using a class of standard local “moves”, the Reidemeister moves, on knot diagrams, i.e. projections of the knot to a plane, decorated with additional information about which strand crosses “over” the other at each double point.

(Completely tangential aside: knot diagrams can be an endless source of combinatorial fun—see e.g. this REU paper on games on shadows of knots.)

It is rather harder to prove that two knots are inequivalent this way: failure to exhibit a suitable sequence of Reidemeister moves doesn’t prove that an ambient isotopy can’t exist—perhaps with more ingenuity one could in fact find the requisite Reidemeister moves? Hence the utility of knot invariants: if we can show that two knots take on different values for a certain knot invariant, that definitively shows that they are inequivalent.

Some examples of knot invariants include:

  • the crossing number, which can be defined as the minimum number of crossings (double points) which any knot diagram of a given knot must have;
  • the unknotting number, which can be defined as the least number of times one needs to pass a knot through itself (or, equivalently, the least number of crossings that need to be switched) to get to an unknot;
  • the Seifert genus, the minimal genus of any connected oriented surface whose boundary is the knot.

These invariants are intuitive and natural measures of the “complexity” of a knot, but are notoriously hard to compute and study. For instance, it is conjectured, but still not known, that the crossing number is additive under taking connect-sums, and it took a surprising amount of work (an Inventiones paper!) to show that composite knots have unknotting number at least 2. In some sense, an understanding of these invariants can be seen as a primary aim, rather than a tool, of knot theory.

More computable, but in some ways less intuitive and more mysterious, are invariants of a more algebraic nature: things such as the signature or Arf invariant coming from naturally-associated quadratic forms, or various knot polynomials which package this and other knot data into their coefficients. The theory behind, and extending from, these knot polynomials in particular has driven much of modern knot theory, and we touch on some of these developments below.

The turn to combinatorics (and algebra)

Starting with Conway’s work on the Alexander polynomial in the 1960s, it has been realized that many of these polynomials can be defined and computed combinatorially, using skein relations; building on this, and introducing language and ideas from statistical mechanics, Kauffman formulated state-sum models for computing the Alexander and Jones polynomials in the 1980s.

This was not the first time physics had popped up here, either: models from statistical mechanics motivated the von Neumann algebras and braid representations which were originally used to define the Jones polynomial. This was in many ways a pivotal development in the theory: to borrow The Unapologetic Mathematician’s words:

Jones was studying a certain kind of algebra when he realized that the defining relations for these algebras were very much like those of the braid groups. In fact, he was quickly able to use this similarity to assign a Laurent polynomial … to every knot diagram that didn’t change when two diagrams differed by a Reidemeister move. That is, it was a new invariant of knots.

The Jones polynomial came out of nowhere, from the perspective of the day’s knot theorists. And it set the whole field on its ear. From my perspective looking back, there’s a huge schism in knot theory between those who primarily study the geometry and the “classical topology” of the situation and those who primarily study the algebra, combinatorics, and the rising field of “quantum topology”. To be sure there are bridges between the two … But the upshot was that the Jones polynomial showed a whole new way of looking at knots and invariants.

Extending the skein relation approach to build invariants on a larger class of knots with singular points leads us to the theory of the much more powerful, and possibly fundamental, finite-type (Vassiliev) invariants, which have subsequently been studied using such tools as chord diagrams and the Kontsevich integral.

Knot homology theories

Another, related direction in which knot polynomials have taken off is towards homological algebra, in a development sometimes described as “categorification”. Categorifying a knot polynomial involves building a homology whose Euler characteristic—in an appropriate sense, involving some sort of alternating sum of data from the homology groups—recovers the knot polynomial.

The homology groups of a topological space contain all the information carried by the Euler characteristic, and then some; in some sense, they are the more fundamental invariant, of which the Euler characteristic is just a “decategorified shadow”. Similarly, a knot homology theory should be, in principle at least, more powerful and fundamental than the polynomial it categorifies.

The two main (flavors of) knot homology are Khovanov homology and knot Floer homology. Khovanov homology, developed in the late 1990s by Mikhail Khovanov, is related to ideas in representation theory, and categorifies the Jones polynomial. Knot Floer homology, developed by Ozsváth and Szabó and independently by Rasmussen, both in the early 2000s, is based on Heegaard Floer theory, “a symplectic geometric replacement for gauge theory”, and categorifies the Alexander polynomial.

There are connections between these homology theories and Lie algebras, symplectic geometry, 3- and 4-dimensional topology, physics, and so on, about which I am entirely ignorant at the moment …

Quantum things (also, strings)

Quantum mechanics has already made a brief cameo above. Indeed, as hinted but not spelt out above, many pieces of mathematics which appear in quantum mechanics and related physical theories can be leveraged to create knot invariants—e.g. the Jones polynomial can be recovered from Chern-Simons theory, and the Kontsevich integral is inspired by perturbative Chern-Simons theory.

The connections run much deeper—it appears that knots and braids provide, or are an integral part of, some of the most compelling mathematical models for quantum phenomena, and even make aspects of their appearance felt in string theory. Now, as before, knot theory remains intertwined with physics, although now mediated through the elaborate structure of representation theory in its various guises.


This Corner of the Field

Over the next four months or so*, I will be writing a series of blogposts which outline, in broad, descriptive terms, the area/s of mathematics I am most interested in (a large swathe of low-dimensional topology and its environs, for the most part), some sets of tools which are useful in these areas, and how they relate to one another, to the rest of mathematics, and to other bits of human inquiry.

The purpose of (writing) these posts is mostly to seek an at least mildly coherent answer to the question, “so what is it you do?” and the often-unasked but—at least to me—entirely natural follow-up, “but why?”

In order to do this I will attempt to describe just what each of these bits of mathematics is about, how it came about, why its originators found it interesting, and why I find it interesting. Along the way, as our dear doctoral chair would advise, there should be plenty of concrete illustrative examples. In some cases, I may also describe current directions of exploration and active research.

*A rough deadline … or more of a broad scheduling guideline. To some extent, I’m inclined to treat [large parts of] this as an ongoing project, with posts—particularly ones covering areas I am particularly focused on / interested in—subject to active and ongoing revision.

An outline of the ground to be covered, to be updated with links and edits as I go along:


A functional analysis primer

(Again, no proofs–see e.g. Stein and Shakarchi’s Real Analysis and Functional Analysis.)

Function spaces

Functional analysis deals with spaces of functions. Typically these are infinite-dimensional vector spaces with some sort of norm (Banach spaces), or even better, inner product (Hilbert spaces), and the additional structure has good analytic properties, i.e. the [induced] norm is complete.

If we do not assume completeness, we get pre-Banach or pre-Hilbert spaces; these can be completed to Banach or Hilbert spaces, and the completion is unique up to isomorphism.

The prototypical infinite-dimensional Hilbert space is the space of square-summable sequences \ell^2(\mathbb{Z}), with the inner product \langle (a_n), (b_n) \rangle = \sum_n a_n \overline{b_n}; in fact, this is universal: every infinite-dimensional separable Hilbert space (for instance, any L^2(X,\mu) for reasonable (X, \mu), with the inner product \langle f, g \rangle = \int_X f \bar{g} \,d\mu) is unitarily equivalent to it. For L^2([-\pi, \pi]), the unitary equivalence is given by the Fourier series.

It is less clear what a universal Banach space should be (though, by a theorem of Banach and Mazur, C([0,1]) contains an isometric copy of every separable Banach space); the L^p spaces (with 1 \leq p \leq \infty) are good prototypical examples.

Many other function spaces of interest are also Banach spaces.

Dual spaces

Linear maps between Banach spaces are also called linear operators. Given a linear operator T, we define its operator norm (or sup norm) by \|T\| = \sup_{\|f\| = 1} \|T(f)\|. Operators with finite operator norm are called bounded.
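As a hedged finite-dimensional sketch (matrices standing in for operators; the specific random matrix and sample count are my choices), the sup defining the operator norm can be estimated by sampling unit vectors, and compared against the largest singular value:

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((5, 5))          # a bounded operator on R^5

# Estimate ||T|| = sup over unit vectors f of ||T f|| by random sampling.
v = rng.standard_normal((5, 20000))
v /= np.linalg.norm(v, axis=0)           # normalize columns to the unit sphere
sup_estimate = np.linalg.norm(T @ v, axis=0).max()

# For matrices, the operator norm equals the largest singular value.
exact = np.linalg.svd(T, compute_uv=False)[0]
assert sup_estimate <= exact + 1e-12     # sampling never exceeds the true sup
assert sup_estimate > 0.95 * exact       # and gets close to it
```

In infinite dimensions no finite sample exhausts the unit sphere, of course; this is only meant to make the definition concrete.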

Given any Banach space X, we can form the dual space X^* of all bounded linear functionals (i.e. bounded linear maps from X to the scalar field.) This is a Banach space with the norm \|\ell\| = \sup_{\|f\| = 1} |\ell(f)|.

Note that a linear functional (more generally, any linear map between Banach spaces) is bounded iff it is continuous.

Some examples

The dual space of L^p(X) for 1 \leq p < \infty is L^q(X), where q is the dual exponent which satisfies \frac 1p + \frac 1q = 1 (for p = 1, this requires the underlying measure to be \sigma-finite.) Note (L^\infty)^* \supsetneq L^1—e.g. it contains the Dirac delta functionals, which are not representable by L^1-functions.

The dual space of the space of continuous functions C(X), for X compact Hausdorff, is the space of finite signed (regular) Borel measures on X, by the Riesz–Markov representation theorem.

Hilbert spaces are self-dual, by the Riesz representation theorem. More precisely, given any continuous linear functional \ell on a Hilbert space \mathcal{H}, there exists a unique g = g(\ell) \in \mathcal{H} s.t. \ell(f) = \langle f, g \rangle for all f \in \mathcal{H}, with \|\ell\| = \|g\|, and this gives us an identification \mathcal{H}^* \to \mathcal{H} defined by \ell \mapsto g(\ell).

Building linear functionals

The Hahn-Banach theorem states that any linear functional defined on a linear subspace V_0 \subset V and bounded above on V_0 by a sublinear function p (defined on all of V) can be extended to a linear functional on the entire space V which is still bounded above by p.

This allows us to define linear functionals by specifying values on some subspace, and then extending them using the general theorem.

Some applications / consequences:

  • convex subsets can be separated from points in the complement of their closures
  • the natural injection V \to V^{**} into the double dual is isometric
  • (L^\infty)^* \supsetneq L^1: the construction of the Dirac delta functionals uses the Hahn-Banach theorem

Topologies weak and strong

Note the closed unit ball in the norm topology is compact iff the space is finite-dimensional. In the infinite-dimensional case, there is no single “good” or “canonical” topology we could put on the space. There is the norm topology, which is in some sense natural if we are given a norm to start with; however it does not have the good properties that it does in the finite-dimensional case, whereas there are in fact some other topologies in which e.g. the unit ball is compact:

A sequence of points (x_n) in a Banach space \mathcal{B} is said to converge weakly to a point x \in \mathcal{B} if \ell(x_n) \to \ell(x) for every \ell \in \mathcal{B}^*. (If our Banach space is in fact a Hilbert space, then in view of the Riesz representation theorem we may re-express this condition as \langle x_n,y \rangle \to \langle x,y \rangle for every y \in \mathcal{B}.) The weak topology on \mathcal{B} is the (coarsest) topology for which sequences (or, more generally, nets) converge in the sense of weak convergence, or equivalently the coarsest topology in which bounded linear functionals remain continuous.
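A standard illustration, sketched numerically here (the truncation dimension and the test sequence are arbitrary choices of mine): in \ell^2 the standard basis vectors converge weakly to 0 but not in norm.

```python
import numpy as np

# In ℓ², the standard basis vectors e_n converge weakly to 0, since
# ⟨e_n, y⟩ = y_n → 0 for every fixed square-summable y; but ||e_n|| = 1
# for every n, so there is no norm convergence. (Finite truncation of ℓ².)
N = 10_000
y = 1.0 / (1.0 + np.arange(N))           # a fixed square-summable sequence

def e(n):
    v = np.zeros(N)
    v[n] = 1.0
    return v

pairings = [abs(np.dot(e(n), y)) for n in (10, 100, 1000)]
norms = [np.linalg.norm(e(n)) for n in (10, 100, 1000)]
assert pairings[0] > pairings[1] > pairings[2]   # ⟨e_n, y⟩ shrinking toward 0
assert all(nrm == 1.0 for nrm in norms)          # yet each e_n has norm 1
```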

The closed unit ball in \mathcal{B} is weakly compact iff \mathcal{B} is reflexive (this is the case if e.g. \mathcal{B} is a Hilbert space, or one of the L^p spaces with 1 < p < \infty.)

The weak* topology on the dual space \mathcal{B}^* is the coarsest topology such that the evaluation maps \varphi \mapsto \varphi(x) from \mathcal{B}^* to the base field remain continuous (for all x \in \mathcal{B}.) It coincides with the topology of pointwise convergence of linear functionals.

The (sequential) Banach-Alaoglu theorem states that the closed unit ball of the dual space of a (separable) normed vector space is (sequentially) compact in the weak* topology. (The general version uses the Tychonoff theorem, hence Choice; the sequential version for separable spaces can be proved by a diagonal argument.)

There are many other possible topologies, especially on the dual space, but that is a topic for another day.

Baire categories in Banach spaces

Baire’s two categories form a dichotomy of “size” based purely on topology, which is “in some sense a combination of [countability and density]” (quote adapted from Baire, as translated in Stein.)

First category (“meagre”) sets are countable unions of nowhere dense sets (i.e. sets whose closures have empty interior, such as discrete sets, or the Cantor set.) Complements of first category sets are generic. Anything which is not first category is second category.

Baire’s category theorem states that any complete metric space X is of the second category (“the continuum is of the second category.”) One corollary of this is that generic sets are dense in a complete metric space (but note that there are also first category sets in [0,1] of full measure–e.g. countable unions of [increasingly] fat Cantor sets.)

Second category gives wriggle room

That wriggle room (usually in the form of “if we write a second category set as a countable union of closed sets, at least one of them contains an open ball”) allows us to prove a bunch of useful analytic results for Banach spaces, which mostly extend our intuition for what happens in finite-dimensional spaces to the infinite-dimensional case:

  1. The uniform boundedness principle states that any set of continuous linear functionals on a Banach space \mathcal{B} which is pointwise bounded on some second category X \subset \mathcal{B} is uniformly bounded (i.e. bounded in the sup norm.)
  2. The open mapping theorem states that surjective continuous linear maps between Banach spaces are open mappings.
  3. The closed graph theorem states that linear maps between Banach spaces whose graphs are closed are continuous.

Hilbert space structures

The inner product that comes with a Hilbert space \mathcal{H} enables us to talk about orthogonal elements. We say that a (possibly infinite) tuple of pairwise-orthonormal vectors is an orthonormal basis for \mathcal{H} if its linear span is dense in \mathcal{H}: it need not span the whole space.

e.g. \left( e^{inx}/\sqrt{2\pi} \right)_{n=-\infty}^\infty is an orthonormal basis of L^2([-\pi, \pi]), by the theory of Fourier series.
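As a discrete numerical analogue (my own sketch; the test function is an arbitrary choice), Parseval's identity for the DFT mirrors the unitarity of the Fourier series map from L^2([-\pi,\pi]) to \ell^2:

```python
import numpy as np

# Parseval for the DFT: the ℓ²-norm² of the Fourier coefficients equals
# the (normalised) L²-norm² of the function, sampled on a uniform grid.
N = 2048
x = np.linspace(-np.pi, np.pi, N, endpoint=False)
f = np.exp(np.cos(x)) * np.sin(3 * x)    # a smooth 2π-periodic test function
c = np.fft.fft(f) / N                    # discrete Fourier coefficients
lhs = np.sum(np.abs(c) ** 2)             # ℓ² norm² of the coefficient sequence
rhs = np.mean(np.abs(f) ** 2)            # discretisation of (1/2π) ∫ |f|² dx
assert abs(lhs - rhs) < 1e-10            # equal up to floating-point error
```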

Moreover, whenever we have a (topologically) closed subspace \mathcal{S} \subset \mathcal{H}, there is a well-defined notion of orthogonal projection onto \mathcal{S}, and thus (by subtracting the orthogonal projection) a well-defined orthogonal complement \mathcal{S}^\perp; these behave as we would expect from the case of finite-dimensional Hilbert spaces.


The inner product structure of a Hilbert space allows us to define these fun things called adjoints, which should be familiar from linear algebra: the adjoint of a linear operator T: \mathcal{H} \to \mathcal{H} is a linear operator T^*: \mathcal{H} \to \mathcal{H} satisfying \langle Tf, g \rangle = \langle f, T^*g \rangle for every f, g \in \mathcal{H}.

The construction of this adjoint goes through the Riesz representation theorem (see above). With more care, adjoints may be defined for operators between arbitrary pairs of Hilbert spaces (or even Banach spaces.) These adjoints really go between the dual spaces–in the Hilbert space case the distinction was blurred since Hilbert spaces are self-dual. For unbounded operators, without further assumptions (e.g. a densely-defined original operator), the adjoint may not be defined on the whole space and may not be unique.

Compact operators

A linear operator T: \mathcal{H} \to \mathcal{H} is compact if the image of the closed unit ball in \mathcal{H} under T is pre-compact (i.e. has [sequentially] compact closure.) Note compact operators are automatically bounded. “It turns out that dealing with compact operators provides us with the closest analogy to the usual theorems of (finite-dimensional) linear algebra.”

Some useful properties:

  • Pre- or post-composing a compact operator with a bounded operator yields a compact operator.
  • Limits of compact operators (in the sup norm) are compact.
  • Conversely, every compact operator is the limit of finite-rank operators (i.e. operators with finite-dimensional range.)
  • Compactness is preserved under taking adjoints.

Some useful examples:

  • Diagonalizable operators with eigenvalues |\lambda_k| \to 0
  • Hilbert-Schmidt operators
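A hedged numerical sketch tying these together (the kernel and grid size are my choices): a discretized integral operator with a smooth kernel is Hilbert-Schmidt, and truncating its SVD exhibits it as an operator-norm limit of finite-rank operators.

```python
import numpy as np

# A discretised Hilbert-Schmidt (integral) operator on L²([0,1]):
# (Tφ)(x) = ∫ k(x,y) φ(y) dy with kernel k(x,y) = e^{-|x-y|};
# the factor 1/N plays the role of the measure dy on a midpoint grid.
N = 200
x = (np.arange(N) + 0.5) / N
K = np.exp(-np.abs(x[:, None] - x[None, :])) / N

# Best rank-r approximations via truncated SVD; by Eckart-Young, the
# operator-norm error of the rank-r truncation is the singular value s[r].
U, s, Vt = np.linalg.svd(K)
errors = [np.linalg.norm(K - (U[:, :r] * s[:r]) @ Vt[:r], 2) for r in (1, 5, 20)]
assert errors[0] > errors[1] > errors[2] > 0   # finite ranks converge to K
assert abs(errors[2] - s[20]) < 1e-12          # error = next singular value
```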

The Spectral Theorem for compact operators states that any compact symmetric operator T: \mathcal{H} \to \mathcal{H} admits an orthonormal basis of \mathcal{H} consisting of eigenvectors, with the largest eigenvalue in absolute value equal to \|T\|.
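In finite dimensions this is the familiar eigendecomposition of a symmetric matrix; a quick numerical check (an illustration of mine, with an arbitrary random matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
A = (A + A.T) / 2                        # a symmetric (self-adjoint) operator

eigvals, Q = np.linalg.eigh(A)           # columns of Q: orthonormal eigenvectors
assert np.allclose(Q @ np.diag(eigvals) @ Q.T, A)   # A = Q Λ Qᵀ
assert np.allclose(Q.T @ Q, np.eye(6))              # the basis is orthonormal
# The largest eigenvalue in absolute value equals the operator norm ||A||:
assert np.isclose(np.abs(eigvals).max(), np.linalg.norm(A, 2))
```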

Spectral Theorem for bounded operators

There is a more general spectral theorem for bounded self-adjoint operators: given any bounded symmetric operator T: \mathcal{H} \to \mathcal{H}, there exists a measure space X and a real-valued f \in L^\infty(X) (representing the spectrum) s.t. T is unitarily conjugate to the “multiplication by f” operator on L^2(X) given by \varphi \mapsto (x \mapsto f(x)\varphi(x)).

Alternatively this may be expressed in terms of a spectral resolution E_\lambda or projection-valued measure dE_\lambda, which allows us to write T = \int_{\sigma(T)} \lambda \, d E_\lambda.

In the case of compact operators the spectrum \sigma(T) is countable, with 0 as its only possible accumulation point (and the corresponding projection-valued measure a countable linear combination of atoms), and we recover the more specific statement above.


A measure theory primer

(No proofs here: for proofs and/or details see e.g. Stein and Shakarchi’s Real Analysis)

Step 1: Measures

A (signed) measure \mu on a given space X is a way of determining (a measure for, as it were) how large a set is, or, more precisely, a function from subsets of X to the (extended) reals. Measures are non-negative (signed measures need not be), zero on the empty set, and countably additive on disjoint sets.

It is, in general, not possible to assign a measure to every subset of an arbitrary X in a way consistent with these axioms (see: existence of non-measurable sets); hence to fully specify a measure space we need one more piece of data, the set of subsets of X which are measurable. These form a \sigma-algebra (a tribe, in French usage): they contain X, and are closed under taking complements and countable unions (and hence, by De Morgan’s laws, also closed under countable intersections.)

The word “countable” in all of the above is important! If we replace it with “finite” we obtain the weaker notion of Jordan content; if we replace it with “arbitrary” we get limp hogwash (any set is the disjoint union of its points; if we further assume some sort of translation-invariance, i.e. all singleton sets have the same measure, this implies either any infinite set has infinite measure, or every set has zero measure.)

Important examples include

  • the counting measure
  • the Lebesgue measure on \mathbb{R}^d is the complete translation-invariant measure on the σ-algebra containing the closed cubes with \mu([0, 1]^d) = 1
  • the Haar measure on a locally-compact topological group is a common generalization of the Lebesgue measure and the counting measure, with similar uniqueness properties

Construction of the Lebesgue measure

  1. Closed intervals \prod_{n=1}^d [a_n, b_n] are assigned measure \prod_{n=1}^d |b_n - a_n| (Note this follows from our normalization, translation-invariance, and countable additivity.)
  2. To an arbitrary set E we assign the (Lebesgue) exterior measure \mu_*(E) = \inf \sum_{j=1}^\infty |Q_j|, where the inf ranges over all countable coverings of E by closed cubes. Again—“countable” is key here.
  3. A set E is measurable if it can be approximated from the outside by open sets: for any \epsilon > 0, there is an open set \mathcal{O} \supset E (depending on \epsilon) s.t. \mu_*(\mathcal{O} \setminus E) < \epsilon.
  4. Define the measure of a measurable set to be its exterior measure.
  5. Check that the resulting measure is indeed a measure (i.e. do the book-keeping to verify that the result is countably additive on disjoint sets.)
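A small sketch of why “countable” matters (my illustration): any countable set, e.g. the rationals in [0,1], can be covered by intervals of total length at most \epsilon, so it has exterior measure 0.

```python
from fractions import Fraction

# Cover the k-th point of a countable set by an interval of length ε/2^k;
# the total length is then ε·(1 - 2^(-terms)) < ε, for ε as small as we like,
# so the exterior measure of the set is 0. Exact arithmetic via Fraction.
def cover_length(eps, terms):
    """Total length of the first `terms` intervals of an ε-cover."""
    return sum(Fraction(eps) / 2 ** k for k in range(1, terms + 1))

eps = Fraction(1, 1000)
assert cover_length(eps, terms=1000) < eps
```

With only finitely additive reasoning (Jordan content) this argument fails, since a dense countable set like the rationals has Jordan outer content 1 in [0,1].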

Carathéodory extension

A similar procedure can be applied more generally: given a space X on which we would like a measure, we start with some small reasonable family of subsets of X on which we can agree how to determine size / measure (in the case above, the closed cubes), and then attempt to extend this toddler measure (technically, a premeasure) to a measure on some larger \sigma-algebra of subsets. \epsilon-more precisely:

  1. The “small reasonable family” should be an algebra, i.e. non-empty, and closed under complements, finite unions and finite intersections.
  2. A premeasure assigns (non-negative extended) reals to sets in our algebra. Premeasures should be zero on the empty set and countably additive on disjoint sets.
  3. Given a premeasure \mu_0 on an algebra \mathcal{A}, we may form an exterior measure—a function \mu_* that assigns a (non-negative extended) real to any subset of X—by taking \mu_*(E) = \inf \sum_{j=1}^\infty \mu_0(E_j), where the inf ranges over all countable coverings of E by sets E_j \in \mathcal{A}.
  4. Axiomatically, exterior measures should be zero on the empty set, non-decreasing (if E_1 \subset E_2, then \mu_*(E_1) \leq \mu_*(E_2)), and countably subadditive.
  5. Now we come to a key idea of Carathéodory: whereas in the construction of the Lebesgue measure we leaned heavily on the open sets in \mathbb{R}^d, we can formulate a criterion for measurability which does not refer to any topology on X, by declaring that a set E is measurable if \mu_*(A) = \mu_*(E \cap A) + \mu_*(E^c \cap A) for every A \subset X—i.e. if E divides any part of the space up in a reasonable enough way, as seen by the exterior measure.
  6. It is then straightforward to check that the collection of all such sets forms a \sigma-algebra which contains \mathcal{A}, and our exterior measure restricted to this \sigma-algebra satisfies the axioms for a measure. By construction, this measure agrees with our premeasure on \mathcal{A}.

The Carathéodory extension theorem states that, starting with any premeasure \mu_0 on any algebra of sets in X, one can form a measure \mu extending \mu_0, in the sense above, by following the process above.

Moreover, if X is \sigma-finite, i.e. it is the union of countably many pieces of finite measure (according to \mu), then this extension is unique.


The Lebesgue–Radon–Nikodym decomposition

How do different measures on the same space relate? Lebesgue (for the real line) and Radon-Nikodym (in the general case) tell us that the relation is, in some way, as nice and controlled as it could be.

Given any \sigma-finite positive measure \mu on a measure space X, any \sigma-finite signed measure \nu on X may be decomposed as \nu = \nu_a + \nu_s, where \nu_a is absolutely continuous w.r.t. \mu (i.e. \mu(E) = 0 implies \nu_a(E) = 0) and \nu_s is mutually singular with \mu, i.e. the two measures are supported on disjoint sets.

Moreover the first piece may be written in the form d\nu_a = f \,d\mu for some extended \mu-integrable function f. (For notions of measurable functions and their integration, see below.)

Step 2: Maps

Measuring sets is all very well … but we also want our notion of measure to play nicely with maps between spaces, and this leads us to the idea of measurable functions. These are functions where measurable sets in the target space have measurable preimages in the domain space.

Note “measurable” here is with respect to the respective \sigma-algebras. In particular, if the target is a topological space, it is assumed, unless otherwise specified, to be equipped with the Borel \sigma-algebra, i.e. the smallest \sigma-algebra containing all of the open sets (which is smaller than the family of Lebesgue-measurable sets for \mathbb{R}^d, for instance.)

Littlewood’s three principles

In short: “weird things only ever happen in a vanishingly small period of time”:

  1. Every measurable set of finite measure is nearly a finite union of intervals: given such a set E, for any \epsilon > 0 there exists a finite union F of closed cubes s.t. \mu(E \Delta F) \leq \epsilon.
  2. Every measurable function is nearly continuous, i.e. for any \epsilon > 0, f|_{A_\epsilon} is continuous for some closed A_\epsilon with \mu(E \setminus A_\epsilon) < \epsilon (Lusin’s theorem.)
    Note that this states the restricted function is continuous as a function on A_\epsilon; it does not say that f itself is continuous at the points of A_\epsilon.
  3. Every convergent sequence of measurable functions (on a set E of finite measure) is nearly uniformly convergent, i.e. for any \epsilon > 0 it is uniformly convergent on some closed A_\epsilon with \mu(E \setminus A_\epsilon) < \epsilon (Egorov’s theorem.)

The above are formulated for Lebesgue measure on \mathbb{R}. More generally:

  1. applies in any measure space with a measure constructed by Carathéodory extension, with “closed cube” replaced by “element of \mathcal{A}“.
  2. requires the domain and target spaces to be topological spaces for continuity to make sense, and the result further requires the target to be second-countable, and the domain to be Hausdorff and equipped with a Radon measure.
  3. requires the target to be a metric space for the idea of uniform convergence to make sense, and also requires the target to be separable.

Step 3: Integrals

Once we have measurable functions it doesn’t take very long before somebody starts talking about trying to integrate them. Because measure theory is all about how big things are, and an integral is essentially (some global measure of) “how big a function is.” Okay, enough with this turbo vagueness already—Eli Stein does a much better job in his preface, anyway.

In the below we assume the maps are functions into the reals; more generally the target space may be any separable metric space without too much change to the statements …

Construction of the Lebesgue integral

  1. The characteristic function \chi_E of a measurable set E is declared to have integral equal to the measure of the set: \int_X \chi_E \,d\mu = \mu(E).
  2. Extend the definition of the integral to simple functions, i.e. linear combinations of characteristic functions, by linearity.
  3. Extend the integral to all non-negative measurable functions, by considering any such function f as an increasing (pointwise) limit of simple functions \varphi_n, and letting the integral of f be the limit of the integrals of \varphi_n.
  4. Extend the integral to all (measurable) functions by decomposing any such function f into positive and negative parts f = f_+ - f_-.

The same process works, more generally, for any \sigma-finite measure space.
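The steps above can be sketched numerically (my own illustration; the test function, level count, and grid are arbitrary choices):

```python
import numpy as np

# Lebesgue-style integration of f(x) = x² on [0,1]: approximate f from below
# by the simple function taking value k/n on the level set {k/n ≤ f < (k+1)/n},
# then sum (value) × (measure of level set). The grid discretises the measure:
# each cell carries measure 1/grid, so .mean() computes Σ value · measure.
def lebesgue_integral(f, n, grid=1_000_000):
    x = (np.arange(grid) + 0.5) / grid
    simple = np.floor(f(x) * n) / n      # the approximating simple function
    return simple.mean()

approx = lebesgue_integral(lambda x: x ** 2, n=10_000)
assert abs(approx - 1 / 3) < 1e-3        # ∫₀¹ x² dx = 1/3
```

Note the integration proceeds by slicing the range of f (into the levels k/n), in contrast to the Riemann integral's slicing of the domain.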

Fatou and friends

Very useful for proving results with the Lebesgue integral (starting with the fact that it is well-defined):

  1. Fatou’s lemma: for (f_n) a sequence of non-negative measurable functions, \int \liminf_{n \to \infty} f_n \,d\mu \leq \liminf_{n \to \infty} \int f_n \,d\mu.
  2. Monotone convergence: for (f_n) a sequence of non-negative measurable functions with f_n \nearrow f, \lim_{n \to \infty} \int f_n \,d\mu = \int f \,d\mu.
  3. Dominated convergence: for (f_n) a sequence of measurable functions with f_n \to f a.e. and |f_n| \leq g for some integrable g, we have \int |f-f_n| \,d\mu \to 0 and so \int f_n \,d\mu \to \int f\,d\mu as n \to \infty.
  4. Much approximation. Wow. The simple functions, step functions, and continuous functions of compact support are dense in the space of integrable functions.

Fubini’s theorem

Given two measures \mu_1 and \mu_2 on X_1 and X_2 (resp.), we can form a product measure \mu = \mu_1 \times \mu_2 by defining the premeasure (not measure!) \mu(A \times B) = \mu_1(A) \mu_2(B) for all \mu_1-measurable A and \mu_2-measurable B, and then extending this to a measure using Carathéodory extension.

Fubini’s theorem then tells us that, for \sigma-finite factors and \mu-integrable f, integrating against the product measure \mu is the same as integrating against each of the factor measures \mu_i in turn (in either order.)
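A numerical sketch (mine; the integrand is an arbitrary choice) of Fubini on [0,1]^2: discretizing each factor measure, the two iterated sums agree.

```python
import numpy as np

# Integrate f(x,y) = x·e^{-xy} over [0,1]² by iterated integration in both
# orders, using a midpoint discretisation of each factor measure.
N = 2000
x = (np.arange(N) + 0.5) / N
F = x[:, None] * np.exp(-x[:, None] * x[None, :])   # f sampled on the grid
int_dy_dx = F.mean(axis=1).mean()   # ∫ (∫ f dy) dx
int_dx_dy = F.mean(axis=0).mean()   # ∫ (∫ f dx) dy
assert abs(int_dy_dx - int_dx_dy) < 1e-12   # equal up to rounding
```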

Whither the Fundamental Theorem of Calculus

The Lebesgue differentiation theorem states that, for any locally integrable f on \mathbb{R}^d, \lim_{m(B) \to 0, B \ni x} \frac{1}{m(B)} \int_B f(y) \,dy = f(x) for almost every x (here B ranges over balls containing x.)

As a corollary: recalling the definition of the derivative, this says taking the Lebesgue integral of any integrable function f and then differentiating will recover the original function f almost everywhere—this is one direction of the Fundamental Theorem.

In the opposite direction: if F is absolutely continuous on [a, b], then F' exists a.e. and is integrable, and satisfies F(x) - F(a) = \int_a^x F'(y) \,dy for all x \in [a,b]. Absolute continuity may appear to be an additional hypothesis, but its necessity is clear when we observe that functions that arise as indefinite integrals (i.e. those of the form x \mapsto \int_a^x f(y) \,dy with f an integrable function) are themselves absolutely continuous.
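A quick numerical sketch of the first direction (my illustration; the function and grid size are arbitrary): forming the indefinite integral by cumulative sums and then differentiating recovers f.

```python
import numpy as np

# One direction of the FTC: build F(x) = ∫_0^x f(y) dy by cumulative sums,
# then differentiate F back with central differences and compare against f.
N = 100_000
h = 1 / N
x = (np.arange(N) + 0.5) / N
f = x ** 2                       # an integrable function on [0, 1]
F = np.cumsum(f) * h             # F(x) ≈ ∫_0^x f
f_back = np.gradient(F, h)       # dF/dx
# Agreement up to discretisation error (endpoints use one-sided differences):
assert np.max(np.abs(f_back[1:-1] - f[1:-1])) < 1e-4
```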

Overview / Outlines

(Some) Lie theory for geometric topology / dynamics

(following an outline by Wouter van Limbeek; filling in this outline is a work-in-progress. Sections 6 and 7 are particularly incomplete; sections 11 and 12—or 8 onwards—could [should?] be split off to form their own post.)

Here we are working with connected Lie groups.

1) Classes of Lie groups: abelian, nilpotent, solvable, semisimple

Abelian Lie groups are completely classified: these all have the form G \cong \mathbb{R}^k \times \mathbb{T}^e (where \mathbb{T}^e denotes the e-dimensional torus.)

Nilpotent groups are those with a terminating lower central series G_{(k+1)} = [G, G_{(k)}] (i.e. G_{(k)} is trivial for some finite k). Examples of nilpotent Lie groups: Heisenberg groups, unitriangular groups, the (3-by-3) Heisenberg group mod its center.
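To make the Heisenberg example concrete, here is a sketch of my own (sympy) verifying at the Lie algebra level that the 3-by-3 Heisenberg group is 2-step nilpotent: the first commutator ideal is the center, and the next bracket vanishes.

```python
# Heisenberg Lie algebra heis(3): spanned by X = E_12, Y = E_23, Z = E_13.
import sympy as sp

def E(i, j):
    m = sp.zeros(3, 3)
    m[i, j] = 1
    return m

X, Y, Z = E(0, 1), E(1, 2), E(0, 2)
bracket = lambda A, B: A * B - B * A

assert bracket(X, Y) == Z               # [g, g] = span(Z) = the center
assert bracket(X, Z) == sp.zeros(3, 3)  # [g, [g, g]] = 0 ...
assert bracket(Y, Z) == sp.zeros(3, 3)  # ... so heis(3) is 2-step nilpotent
```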

Solvable groups are those with a terminating derived series G^{(k+1)} = [G^{(k)}, G^{(k)}]. There is an equivalent formulation in terms of composition series—in this sense solvable groups are built up from abelian groups by successive extensions. Examples of solvable Lie groups:

  • \mathrm{Sol} = \mathbb{R}^2 \rtimes \mathbb{R}, where t \in \mathbb{R} acts on \mathbb{R}^2 as the matrix \left( \begin{array}{cc} e^t \\ & e^{-t} \end{array} \right).
  • \mathrm{Aff}^+(\mathbb{R}) \cong \mathbb{R} \rtimes \mathbb{R}_{> 0} (the identity component of the affine group of the line), where the action is by scalar multiplication.
  • The group of all invertible upper-triangular matrices.
  • \mathbb{R}^2 \rtimes \mathbb{R}, lifted from \mathbb{R}^2 \rtimes \mathrm{SO}(2), where the action is by rotation.

By definition, we have abelian \subset nilpotent \subset solvable.
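To illustrate the solvable (but not nilpotent) case from the list above, the following sketch of my own (sympy) computes the derived series of the algebra of upper-triangular 3-by-3 matrices and checks that it terminates:

```python
# Derived series of b = upper-triangular 3x3 matrices:
# b (dim 6) > [b,b] = strictly upper-triangular (dim 3) > span(E_13) (dim 1) > 0.
import itertools
import sympy as sp

def E(i, j):
    m = sp.zeros(3, 3)
    m[i, j] = 1
    return m

def derived(basis):
    """Basis for the span of all brackets of pairs of basis elements."""
    brackets = [A * B - B * A for A, B in itertools.combinations(basis, 2)]
    if not brackets:
        return []
    rows = sp.Matrix([list(M) for M in brackets])  # vectorize each bracket
    return [sp.Matrix(3, 3, list(r)) for r in rows.rowspace()]

b = [E(i, j) for i in range(3) for j in range(3) if i <= j]  # dim 6
d1 = derived(b)    # strictly upper-triangular matrices
d2 = derived(d1)   # the span of E_13
d3 = derived(d2)   # trivial: the series terminates, so b is solvable

assert [len(d) for d in (d1, d2, d3)] == [3, 1, 0]
```

(Note the lower central series of b does not terminate—[b, [b,b]] = [b,b]—so b is solvable but not nilpotent, and the inclusions above are strict.)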

Simplicity and semisimplicity

A Lie algebra \mathfrak{g} is simple if it is non-abelian and has no nonzero proper ideals (i.e. subspaces \mathfrak{h} \subset \mathfrak{g} with [\mathfrak{g}, \mathfrak{h}] \subset \mathfrak{h}.) \mathfrak{g} is semisimple if it is a direct sum of simple Lie algebras.

A Lie group G is (semi)simple if its Lie algebra \mathfrak{g} is (semi)simple.

Equivalently, G is simple if it has no nontrivial connected normal subgroups (what can go wrong if we don’t have that additional adjective?)

G is semisimple if its universal cover \tilde{G} is a direct product of simple Lie groups. (Note that things can go wrong if we do not take the universal cover: e.g. the quotient of \mathrm{SL}_2(\mathbb{R}) \times \mathrm{SL}_2(\mathbb{R}) by \mathbb{Z}/2\mathbb{Z}, acting diagonally as \pm(I, I), is semisimple [in the sense that its Lie algebra is semisimple], but not a product.)

Relation between these types and the adjoint representation.

Recall the adjoint representation \mathrm{Ad}: G \to \mathrm{GL}(\mathfrak{g}) given by sending a group element g to the matrix representing D_e c_g, where c_g denotes conjugation by g. Now

  • G is abelian iff Ad(G) is the trivial representation
  • G is nilpotent iff Ad(G) is unipotent
  • G is solvable iff Ad(G) is (simultaneously) triangularizable (over an algebraically-closed field)
  • G is semisimple iff Ad(G) is a semisimple representation (i.e. \mathfrak{g} may be written as a direct sum of irreducible Ad-invariant subspaces.)
  • G contains a lattice (see below) only if Ad(G) \subset \mathrm{SL}(\mathfrak{g}) (i.e. G is unimodular).
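As a concrete instance of the nilpotent entry of this dictionary, here is a sketch of my own (sympy) computing Ad(g) for the 3-by-3 Heisenberg group and checking it is unipotent:

```python
# Ad(g) for g in the 3x3 Heisenberg group, in the basis (E_12, E_23, E_13)
# of its Lie algebra; for matrix groups, Ad(g)X = g X g^{-1}.
import sympy as sp

a, b, c = sp.symbols('a b c')
g = sp.Matrix([[1, a, c], [0, 1, b], [0, 0, 1]])

def E(i, j):
    m = sp.zeros(3, 3)
    m[i, j] = 1
    return m

basis = [E(0, 1), E(1, 2), E(0, 2)]
coords = lambda M: [M[0, 1], M[1, 2], M[0, 2]]  # coordinates in this basis

# columns of Ad(g) are the images of the basis vectors
Ad = sp.Matrix([coords(g * X * g.inv()) for X in basis]).T.applyfunc(sp.expand)

N = Ad - sp.eye(3)
assert (N**2).applyfunc(sp.expand) == sp.zeros(3, 3)  # Ad(g) is unipotent
```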

2) Levi decomposition of Lie groups

Note that subgroups of solvable groups are solvable, and extensions of solvable groups by solvable groups are solvable. Putting these together, we obtain that any Lie group G has a unique maximal closed connected normal solvable subgroup R (called the solvable radical of G.)

The Levi decomposition (also called, in some contexts, the Levi-Malcev decomposition) of G is G = RS, where R is the solvable radical, S \subset G is semisimple, and R \cap S is discrete.

If G is simply-connected, then G = R \rtimes S.

3) Classification of compact Lie groups

Any compact connected solvable Lie group G is a torus. (Proof sketch: Induct on the length of the derived series. For n=1, G is abelian and we are done. Otherwise, we have the short exact sequence 1 \to [G,G] \to G \to \frac{G}{[G,G]} \to 1; the outer terms are tori T_1 and T_2, and we have a map T_2 \to \mathrm{Out}(T_1) = \mathrm{GL}(\dim T_1, \mathbb{Z}). Since this is a continuous homomorphism from a connected group to a discrete group, it must have trivial image; hence T_1 \subset G is central, and we can induct on \dim T_2 to produce a section \frac{G}{[G,G]} \to G.)

Compact connected semisimple Lie groups can also be classified, but this story involves much more algebra (representation theory, highest weights, etc.)

Using the Levi decomposition, we have the following:

Theorem. Any compact connected Lie group G is isomorphic to (\mathbb{T}^k \times G_1 \times \dots \times G_n) / F, where F is a finite group and G_1, \dots, G_n are simple.

4) When is a Lie group an algebraic group? When is a Lie group linear?

An algebraic group is a group with a compatible algebraic variety structure (i.e. multiplication and inversion are regular maps.) A linear group is (isomorphic to) a subgroup of GL(n,k) for some field k. Note GL(n,k) is algebraic, and thus linear groups defined by polynomial equations (but not all linear groups!) are algebraic.

A Lie group is algebraic if it is isomorphic to a linear algebraic group (but this is not an iff [?])

Many of the classical Lie groups are (linear) algebraic groups, but note that not all Lie groups are algebraic: e.g. \widetilde{\mathrm{SL}_2(\mathbb{R})} is not algebraic because its center is not algebraic (“is too large to be algebraic.”)

5) When is the exponential map a diffeomorphism?

We have an exponential map \exp: \mathfrak{g} \to G. Its derivative at 0, \exp_*: \mathfrak{g} \to \mathfrak{g}, is the identity map; \exp therefore restricts to a diffeomorphism from some neighborhood of 0 in \mathfrak{g} to a neighborhood of 1 in G.

If G is connected, simply-connected, and nilpotent, the exponential map exp is a (global) diffeomorphism (in fact, an analytic isomorphism of analytic manifolds, if G is linear algebraic.)
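For the 3-by-3 Heisenberg group (connected, simply-connected, nilpotent), this can be seen very concretely: the exponential and logarithm series terminate, so both are polynomial maps and invert each other globally. A sketch of my own (sympy):

```python
# exp and log are terminating polynomials on heis(3), hence global inverses.
import sympy as sp

a, b, c = sp.symbols('a b c')
X = sp.Matrix([[0, a, c], [0, 0, b], [0, 0, 0]])  # general element of heis(3)

assert X**3 == sp.zeros(3, 3)      # nilpotent: both series terminate

expX = sp.eye(3) + X + X**2 / 2    # exp(X), exactly
N = expX - sp.eye(3)
logexpX = N - N**2 / 2             # log(exp(X)), exactly

assert (logexpX - X).applyfunc(sp.expand) == sp.zeros(3, 3)  # log inverts exp
```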

6) Topology of Lie groups: fundamental group, homotopy type, cohomology …

Theorem. Any connected Lie group has abelian fundamental group.

Proof sketch: In fact this is true for any connected topological group, because the group structure forces things to be nice that way. Making this precise can be mildly annoying, though.

Theorem (Weyl). The fundamental group of a compact semisimple Lie group is finite.

Theorem. Any connected Lie group has trivial second homotopy and torsionfree third homotopy.

Proof sketch: Since any connected Lie group deformation retracts onto its maximal compact subgroup, WLOG we work only with compact groups.

From the long exact sequence of homotopy groups obtained from the path-loop fibration, \pi_k(G) \cong \pi_{k-1}(\Omega G). By Morse theory, \pi_1(\Omega G) = 0 and \pi_2(\Omega G) \cong \mathbb{Z}^t for some t; hence \pi_2(G) = 0 and \pi_3(G) is torsionfree.

Many more things can be said: see e.g. this survey.

Relation with the Lie algebra

7) Geometry of Lie groups: relation between geometry of an invariant metric and algebraic structure.

Note that a Lie group structure yields a natural left-invariant (or right-invariant) metric, given by propagating the inner product at the identity by group multiplication. With additional hypotheses, we can make this bi-invariant:

Theorem. Every compact Lie group G admits a bi-invariant metric.

Idea of proof: Use the natural Haar measure on G to average the left-invariant metric.

We can play this invariant metric and the group structure off against each other: e.g. for a bi-invariant metric, the geodesics of G through the identity coincide with the 1-parameter subgroups of G (and the remaining geodesics are their left-translates); as a corollary, we obtain that any such G is (geodesically) complete.

8) Selberg’s lemma

Lemma (Selberg 1960). A finitely generated linear group over a field of characteristic zero is virtually torsion-free.
(cf. Theorem (Malcev 1940). A finitely generated linear group is residually finite.)

Proof: Using number fields (local fields, see Cassels or Ratcliffe), or Platonov’s theorem, which seems to be a bunch of commutative algebra (see Bogdan Nica’s paper.)

9) Finite generation/presentation of lattices: Milnor-Svarc and Borel-Serre.

A lattice is a discrete subgroup \Lambda \leq G with G / \Lambda of finite volume (as measured by the natural Haar measure on G.) The prototypical example is \mathrm{SL}(2,\mathbb{Z}) \subset \mathrm{SL}(2, \mathbb{R}); \mathrm{SL}(2,\mathbb{Z}) is of course the mapping class group of the torus.
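As a concrete aside (a standard fact, checked here in a sketch of my own with numpy): \mathrm{SL}(2,\mathbb{Z}) is generated by S = \left( \begin{array}{cc} 0 & -1 \\ 1 & 0 \end{array} \right) and T = \left( \begin{array}{cc} 1 & 1 \\ 0 & 1 \end{array} \right), with S^4 = (ST)^6 = I:

```python
# Generators of SL(2, Z): S (order 4) and T (infinite order), with (ST)^6 = I.
import numpy as np

S = np.array([[0, -1], [1, 0]])
T = np.array([[1, 1], [0, 1]])
I = np.eye(2, dtype=int)

assert round(np.linalg.det(S)) == 1 and round(np.linalg.det(T)) == 1  # determinant 1
assert np.array_equal(np.linalg.matrix_power(S, 4), I)
assert np.array_equal(np.linalg.matrix_power(S @ T, 6), I)
```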


  1. Any lattice in a solvable group is co-compact (also called uniform.)
  2. (Borel, Harish-Chandra) Any noncompact semisimple group contains a lattice that is not co-compact (also called non-uniform.)

Theorem (Milnor-Svarc). If \Gamma acts on a proper geodesic metric space X properly discontinuously and cocompactly, then \Gamma is finitely-generated.

Proof: Since \Gamma \curvearrowright X cocompactly, the action has a fundamental domain Z with compact closure (so that X / \Gamma is compact). By the proper discontinuity of the action, the vertices of Z cannot accumulate. But compactness then implies Z has finitely many vertices, and hence finitely many sides. Since each algebraically independent generator would add a side to Z, this implies that \Gamma is finitely generated.

In fact, many non-uniform lattices are also finitely-generated.

The \mathbb{R}-rank of a Lie group G is the maximal dimension of an abelian subgroup of G consisting of semisimple elements simultaneously diagonalizable over \mathbb{R} (a \in G is semisimple if \mathrm{Ad}(a) is diagonalizable.) Geometrically, we may interpret the \mathbb{R}-rank as the dimension of a maximal flat in the symmetric space associated to G.

A locally-compact group G is said to have property (T) if for every continuous unitary representation \rho: G \to \mathcal{U}(\mathcal{H}) into some Hilbert space there exist \epsilon > 0 and compact L \subset G s.t. if \exists v \in \mathcal{H} with \|v\| = 1 s.t. \|\rho(l) v - v\| < \epsilon for all l \in L, then \exists v' \in \mathcal{H} with \|v'\| = 1 s.t. \rho(G) fixes v'. (i.e., the existence of an almost-invariant vector implies the existence of an invariant vector.)

Theorem (Kazhdan, 1968): If G is a simple Lie group of \mathbb{R}-rank \geq 2, and \Gamma \subset G is a lattice, then \Gamma has property (T), and hence is finitely-generated.

Alternative proof: every such lattice is arithmetic, then use a fundamental domain and argue [as before]

10) Levi-Malcev decomposition of lattices

The Levi-Malcev decomposition as applied to a lattice \Lambda \subset G = RS tells us that \Lambda = \Psi \Sigma where \Psi = \Lambda \cap R is a lattice in a solvable group and \Sigma = \Lambda \cap S is a lattice in a semisimple group: note that each of these intersections remains discrete, and has finite co-volume, for if not \Lambda itself would not have finite co-volume.

Hence the general scheme for understanding lattices in Lie groups: first understand lattices in solvable groups and in semisimple groups, then piece them together using the Levi-Malcev decomposition …

11) (Non-)arithmetic lattices. When is a lattice arithmetic?

Most generally, an arithmetic subgroup of a linear algebraic group G defined over a number field K is a subgroup Γ of G(K) that is commensurable with G(\mathcal{O}), where \mathcal{O} is the ring of integers of K.

Hence, more abstractly, a lattice \Gamma in a connected (solvable) Lie group G is said to be arithmetic if there exists a cocompact faithful representation i of G into G^*_{\mathbb{R}} (where G^* \subset \mathrm{GL}(n, \mathbb{C}) is an algebraic subgroup) with closed image, s.t. i(\Gamma) \cap G^*_{\mathbb{Z}} has finite index in both i(\Gamma) and G^*_{\mathbb{Z}}.

We have that

Theorem (Borel, Harish-Chandra). Arithmetic subgroups are lattices.


Theorem (Mostow). (4.34 in Raghunathan.) Let G be a simply-connected solvable Lie group and \Gamma \subset G a lattice. Then G admits a faithful representation into \mathrm{GL}(n, \mathbb{R}) which sends \Gamma into \mathrm{GL}(n, \mathbb{Z}).

Note, however, the counterexample on pp. 76-77 of Raghunathan.

Much stronger results are true in semisimple Lie groups:

Theorem (Margulis arithmeticity). Any irreducible lattice in a semisimple Lie group with no rank one factors is arithmetic.

(More precisely, see statement of Selberg’s conjectures in Section 7 of this article.)

Theorem (Margulis’ commensurator criterion.) Let \Gamma < G be an irreducible lattice (where G may have rank 1). Then \Gamma is arithmetic iff the commensurator of \Gamma is dense in G.

Recall the commensurator of \Gamma < G is the subgroup of G consisting of elements g s.t. \Gamma and g \Gamma g^{-1} are commensurable (i.e. their intersection has finite index in both.)

12) Rigidity of lattices

(i.e. when can you deform a lattice in the ambient Lie group?)

Theorem (Weil local rigidity). Let \Gamma be a finitely generated group, G a Lie group and \pi: \Gamma \to G be a homomorphism. Then \pi is locally rigid if H^1(\Gamma, \mathfrak{g}) = 0. Here \mathfrak{g} is the Lie algebra of G and \Gamma acts on \mathfrak{g} by \mathrm{Ad}_G \circ \pi.

Theorem (Mostow rigidity). Let Γ and Δ be discrete subgroups of the isometry group of \mathbb{H}^n (with n > 2) whose quotients \mathbb{H}^n/\Gamma and \mathbb{H}^n/\Delta have finite volume. If Γ and Δ are isomorphic as discrete groups, then they are conjugate in the isometry group.

i.e. lattices in hyperbolic isometry groups are pretty darn rigid. But then there’s even more. A lattice is called irreducible if no finite index subgroup is a product:

Theorem (Margulis superrigidity). Let Γ be an irreducible lattice in a connected semisimple Lie group G of rank at least 2, with trivial center, and without compact factors. Suppose k is a local field.

Then any homomorphism π of Γ into a noncompact k-simple group over k with Zariski dense image either has precompact image, or extends to a homomorphism of the ambient groups.

Remark. Margulis superrigidity implies Margulis arithmeticity (how?)

Margulis’ normal subgroup theorem. Let G be a connected semisimple Lie group of rank > 1 with finite center, and let \Gamma < G be an irreducible lattice.

If N \leq \Gamma is a normal subgroup of \Gamma, then either N lies in the center of G (and hence N is finite), or the quotient \Gamma / N is finite.

Or, in short: “many lattices in semisimple Lie groups are simple, up to finite error.”

Some comments here

Quasi-isometric rigidity of lattices

See survey by Benson Farb

Zariski tangent space to representation variety and cohomology.