
A measure theory primer

(No proofs here: for proofs and/or details see e.g. Stein and Shakarchi’s Real Analysis)

Step 1: Measures

A (signed) measure \mu on a given space X is a way of determining (a measure for, as it were) how large a set is, or, more precisely, a function from subsets of X to the (extended) reals. Measures are non-negative (signed measures need not be), zero on the empty set, and countably additive on disjoint sets.

It is, in general, not possible to assign a measure to every subset of an arbitrary X in a way consistent with these axioms (see: existence of non-measurable sets); hence to fully specify a measure space we need one more piece of data, the family of subsets of X which are measurable. These form a \sigma-algebra (also called a tribe): the family contains X, and is closed under taking complements and countable unions (and hence, by De Morgan’s laws, also closed under countable intersections.)

The word “countable” in all of the above is important! If we replace it with “finite” we obtain the weaker notion of Jordan content; if we replace it with “arbitrary” we get limp hogwash (any set is the disjoint union of its points; if we further assume some sort of translation-invariance, i.e. all singleton sets have the same measure, this implies either any infinite set has infinite measure, or every set has zero measure.)

Important examples include

  • the counting measure
  • the Lebesgue measure on \mathbb{R}^d is the complete translation-invariant measure on the \sigma-algebra containing the closed cubes with \mu([0, 1]^d) = 1
  • the Haar measure on a locally-compact topological group is a common generalization of the Lebesgue measure and the counting measure, with similar uniqueness properties
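
As a small illustration of countable additivity at work (a sketch; \mu denotes Lebesgue measure on \mathbb{R}, and the enumeration q_1, q_2, \dots of the rationals in [0,1] is notation introduced here): a single point sits inside closed intervals of arbitrarily small length, so it has measure zero, and hence so does any countable set.

```latex
\[
\mu(\{x\}) \leq \mu([x, x + \delta]) = \delta \ \text{ for every } \delta > 0
\quad \Longrightarrow \quad \mu(\{x\}) = 0,
\qquad
\mu(\mathbb{Q} \cap [0,1]) = \sum_{j=1}^\infty \mu(\{q_j\}) = 0 .
\]
```

The same bookkeeping is unavailable for [0,1] itself, which has measure 1 but is an uncountable union of points.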

Construction of the Lebesgue measure

  1. Closed intervals \prod_{n=1}^d [a_n, b_n] are assigned measure \prod_{n=1}^d |b_n - a_n|. (Note that this follows from our normalization, translation-invariance, and countable additivity.)
  2. To an arbitrary set E we assign the (Lebesgue) exterior measure \mu_*(E) = \inf \sum_{j=1}^\infty |Q_j|, where the inf ranges over all countable coverings of E by closed cubes. Again—“countable” is key here.
  3. A set E is measurable if it differs from some open set(s) by a set of arbitrarily small exterior measure (more precisely: for any \epsilon > 0, there is an open set \mathcal{O} \supset E, which depends on \epsilon, s.t. \mu_*(\mathcal{O} \setminus E) < \epsilon.)
  4. Define the measure of a measurable set to be its exterior measure.
  5. Check that the resulting measure is indeed a measure (i.e. do the book-keeping to verify that the result is countably additive on disjoint sets.)
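
To see the exterior measure in action, here is a sketch for the standard middle-thirds Cantor set C \subset [0,1] (our choice of example, not part of the construction above). After k rounds of removing middle thirds, C is covered by the 2^k remaining closed intervals of length 3^{-k}; the names C_k and I_{k,i} below are notation introduced here.

```latex
\[
C \subset C_k = \bigcup_{i=1}^{2^k} I_{k,i}, \quad |I_{k,i}| = 3^{-k}
\quad \Longrightarrow \quad
\mu_*(C) \leq \sum_{i=1}^{2^k} |I_{k,i}| = \left(\tfrac{2}{3}\right)^k \to 0 \quad (k \to \infty).
\]
```

So \mu_*(C) = 0 even though C is uncountable. Sets of exterior measure zero are always measurable, so C is a measurable set of measure zero.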

Carathéodory extension

A similar procedure can be applied more generally: given a space X on which we would like a measure, we start with some small reasonable family of subsets of X on which we can agree how to determine size / measure (in the case above, the closed cubes), and then attempt to extend this toddler measure (technically, a premeasure) to a measure on some larger \sigma-algebra of subsets. \epsilon-more precisely:

  1. The “small reasonable family” should be an algebra, i.e. non-empty, and closed under complements, finite unions and finite intersections.
  2. A premeasure assigns (non-negative extended) reals to sets in our algebra. Premeasures should be zero on the empty set and countably additive on disjoint sets.
  3. Given a premeasure \mu_0 on an algebra \mathcal{A}, we may form an exterior measure, i.e. a function \mu_* that assigns a (non-negative extended) real to any subset of X, by taking \mu_*(E) = \inf \sum_{j=1}^\infty \mu_0(E_j), where the inf ranges over all countable coverings (E_j) of E by sets in \mathcal{A}.
  4. Axiomatically, exterior measures should be zero on the empty set, non-decreasing (if E_1 \subset E_2, then \mu_*(E_1) \leq \mu_*(E_2)), and countably subadditive.
  5. Now we come to a key idea of Carathéodory: whereas in the construction of the Lebesgue measure we leaned heavily on the open sets in \mathbb{R}^d, we can formulate a criterion for measurability which does not refer to any topology on X, by declaring that a set E is measurable if \mu_*(A) = \mu_*(E \cap A) + \mu_*(E^c \cap A) for every A \subset X—i.e. if E divides any part of the space up in a reasonable enough way, as seen by the exterior measure.
  6. It is then straightforward to check that the set of all such sets forms a \sigma-algebra which contains \mathcal{A}, and our exterior measure restricted to this \sigma-algebra satisfies the axioms for a measure. By construction, this measure agrees with our premeasure on \mathcal{A}.

The Carathéodory extension theorem states that, starting with any premeasure \mu_0 on any algebra of sets in X, one can form a measure \mu extending \mu_0, in the sense above, by following the process above.

Moreover, if X is \sigma-finite, i.e. it is the union of countably many pieces of finite measure (according to \mu), then this extension is unique.
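
A toy example may help show that Carathéodory's measurability criterion really can exclude sets. The following is only a sketch on a hypothetical three-point space chosen here for illustration: take X = \{1,2,3\} with the trivial algebra \mathcal{A} = \{\emptyset, X\} and premeasure \mu_0(\emptyset) = 0, \mu_0(X) = 1. The only way to cover a non-empty set by elements of \mathcal{A} is to use X itself.

```latex
\[
\begin{aligned}
&\mu_*(\emptyset) = 0, \qquad \mu_*(E) = \mu_0(X) = 1 \quad \text{for every non-empty } E \subset X, \\
&E = \{1\},\ A = X: \qquad
\mu_*(A) = 1 \;\neq\; 2 = \mu_*(E \cap A) + \mu_*(E^c \cap A).
\end{aligned}
\]
```

So \{1\} fails the criterion, and the same computation with A = X rules out every E other than \emptyset and X. In this example the extension just recovers \mu_0 on \mathcal{A}.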


How do different measures on the same space relate? Lebesgue (for the real line) and Radon-Nikodym (in the general case) tell us that the relation is, in some way, as nice and controlled as it could be.

Given any \sigma-finite positive measure \mu on a measurable space X, any \sigma-finite (signed) measure \nu on X may be decomposed into a piece \nu_a absolutely continuous w.r.t. \mu (i.e. \nu_a(E) = 0 whenever \mu(E) = 0) and a piece \nu_s mutually singular with \mu, i.e. the two measures have disjoint supports.

Moreover the first piece may be written in the form d\nu_a = f \,d\mu for some extended \mu-integrable function f. (For notions of measurable functions and their integration, see below.)
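
For a concrete instance (a sketch with measures chosen here purely for illustration): let \mu be Lebesgue measure on \mathbb{R}, let \delta_0 be the point mass at 0, and let \nu be the sum of \delta_0 and the measure with density e^{-x^2}.

```latex
\[
\nu(E) = \underbrace{\int_E e^{-x^2} \,d\mu(x)}_{\nu_a(E)} + \underbrace{\delta_0(E)}_{\nu_s(E)},
\qquad d\nu_a = f \,d\mu \ \text{ with } f(x) = e^{-x^2}.
\]
```

Here \nu_a vanishes on every \mu-null set, while \nu_s lives on \{0\}, a set with \mu(\{0\}) = 0, and \mu lives on \mathbb{R} \setminus \{0\}, so \nu_s and \mu are mutually singular.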

Step 2: Maps

Measuring sets is all very well … but we also want our notion of measure to play nicely with maps between spaces, and this leads us to the idea of measurable functions. These are functions where measurable sets in the target space have measurable preimages in the domain space.

Note “measurable” here is with respect to the respective \sigma-algebras. In particular, if the target is a topological space, it is assumed, unless otherwise specified, to be equipped with the Borel \sigma-algebra, i.e. the smallest \sigma-algebra containing all of the open sets (which is smaller than the family of Lebesgue-measurable sets for \mathbb{R}^d, for instance.)
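
In practice one rarely checks every measurable set in the target. A standard observation, sketched here for f : X \to Y with \mathcal{M} the \sigma-algebra on X (notation introduced for this example), is that preimages commute with complements and countable unions:

```latex
\[
f^{-1}(B^c) = \left(f^{-1}(B)\right)^c, \qquad
f^{-1}\Big(\bigcup_{j=1}^\infty B_j\Big) = \bigcup_{j=1}^\infty f^{-1}(B_j).
\]
```

Hence \{B \subset Y : f^{-1}(B) \in \mathcal{M}\} is itself a \sigma-algebra, and it suffices to check preimages of a generating family. For a real-valued f with the Borel \sigma-algebra on the target, this means checking that \{x : f(x) > a\} is measurable for every real a.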

Littlewood’s three principles

In short: “weird things only ever happen in a vanishingly small period of time”:

  1. Every measurable set of finite measure is nearly a finite union of intervals: given such a set E, for any \epsilon > 0 there exists a finite union F of closed cubes s.t. \mu(E \Delta F) \leq \epsilon.
  2. Every measurable function is nearly continuous, i.e. for f measurable and finite-valued on a set E of finite measure and any \epsilon > 0, f|_{A_\epsilon} is continuous for some closed A_\epsilon \subset E with \mu(E \setminus A_\epsilon) < \epsilon (Lusin’s theorem.)
    Note that this states the restricted function is continuous as a function on A_\epsilon; it does not say that f, as a function on E, is continuous at the points of A_\epsilon.
  3. Every convergent sequence of measurable functions is nearly uniformly convergent, i.e. if (f_n) converges a.e. on a set E of finite measure, then for any \epsilon > 0 it is uniformly convergent on some closed A_\epsilon \subset E with \mu(E \setminus A_\epsilon) < \epsilon (Egorov’s theorem.)

The above are formulated for Lebesgue measure on \mathbb{R}^d. More generally:

  1. applies in any measure space with a measure constructed by Carathéodory extension, with “closed cube” replaced by “element of \mathcal{A}”.
  2. requires the domain and target spaces to be topological spaces for continuity to make sense, and the result further requires the target to be second-countable, and the domain to be Hausdorff and equipped with a Radon measure.
  3. requires the target to be a metric space for the idea of uniform convergence to make sense, and also requires the target to be separable.
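
A standard illustration of the third principle (a sketch; the sequence is our choice of example): take f_n(x) = x^n on E = [0,1] with Lebesgue measure. Then f_n \to 0 at every x < 1, but not uniformly on E, since \sup_{[0,1)} x^n = 1 for every n. Given \epsilon \in (0,1), set A_\epsilon = [0, 1 - \epsilon/2].

```latex
\[
\sup_{x \in A_\epsilon} |f_n(x) - 0| = \left(1 - \tfrac{\epsilon}{2}\right)^n \to 0 \quad (n \to \infty),
\qquad
\mu(E \setminus A_\epsilon) = \tfrac{\epsilon}{2} < \epsilon.
\]
```

Throwing away an arbitrarily thin sliver near 1 is enough to restore uniform convergence.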

Step 3: Integrals

Once we have measurable functions it doesn’t take very long before somebody starts talking about trying to integrate them: measure theory is all about how big things are, and an integral is essentially (some global measure of) “how big a function is.” Okay, enough with this turbo vagueness already; Eli Stein does a much better job in his preface, anyway.

In the below we assume the maps are functions into the reals; more generally the target space may be any separable metric space without too much change to the statements …

Construction of the Lebesgue integral

  1. The characteristic function \chi_E of a measurable set E is declared to have integral equal to the measure of the set: \int_X \chi_E \,d\mu = \mu(E).
  2. Extend the definition of the integral to simple functions, i.e. linear combinations of characteristic functions, by linearity.
  3. Extend the integral to all non-negative measurable functions, by considering any such function f as the limit of an increasing sequence of simple functions \varphi_n, and letting the integral of f be the limit of the integrals of \varphi_n.
  4. Extend the integral to all (measurable) functions by decomposing any such function f into positive and negative parts f = f_+ - f_-.

The same process works, more generally, for any \sigma-finite measure space.
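
Two small computations may make the steps concrete (sketches; the dyadic truncations \varphi_n below are one standard choice of approximating sequence, not necessarily the one a given textbook uses). The first applies step 1 to the rationals in [0,1], a countable set and hence of Lebesgue measure zero. The second gives simple functions increasing to a given non-negative measurable f.

```latex
\[
\begin{aligned}
&\int_{[0,1]} \chi_{\mathbb{Q} \cap [0,1]} \,d\mu = \mu(\mathbb{Q} \cap [0,1]) = 0, \\
&\varphi_n(x) = \min\!\left(n,\ 2^{-n} \lfloor 2^n f(x) \rfloor\right), \qquad
0 \leq \varphi_n \leq \varphi_{n+1} \leq f, \qquad \varphi_n(x) \to f(x).
\end{aligned}
\]
```

Each \varphi_n takes only finitely many values, so it is simple. The first integral is one the Riemann theory cannot handle, since \chi_{\mathbb{Q} \cap [0,1]} is not Riemann integrable.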

Fatou and friends

Very useful for proving results with the Lebesgue integral (starting with how it’s well-defined.)

  1. Fatou’s lemma: for (f_n) a sequence of non-negative measurable functions, \int \liminf_{n \to \infty} f_n \,d\mu \leq \liminf_{n \to \infty} \int f_n \,d\mu.
  2. Monotone convergence: for (f_n) a sequence of non-negative measurable functions with f_n \nearrow f, \lim_{n \to \infty} \int f_n \,d\mu = \int f \,d\mu.
  3. Dominated convergence: for (f_n) a sequence of measurable functions with f_n \to f a.e. and |f_n| \leq g for some integrable g, we have \int |f-f_n| \,d\mu \to 0 and so \int f_n \,d\mu \to \int f\,d\mu as n \to \infty.
  4. Much approximation. Wow. The simple functions, step functions, and continuous functions of compact support are dense in the space of integrable functions.
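
The hypotheses here are not decoration. A standard example (a sketch, on [0,1] with Lebesgue measure) is the sequence of ever taller, ever narrower bumps f_n = n \chi_{(0, 1/n)}, which converges to 0 at every point while each integral equals 1:

```latex
\[
\int f_n \,d\mu = n \cdot \tfrac{1}{n} = 1,
\qquad
\int \liminf_{n \to \infty} f_n \,d\mu = \int 0 \,d\mu = 0
\;<\; 1 = \liminf_{n \to \infty} \int f_n \,d\mu .
\]
```

So the inequality in Fatou’s lemma can be strict, and neither convergence theorem applies: the sequence is not monotone, and any g with |f_n| \leq g for all n satisfies g(x) \geq 1/x - 1 on (0,1), which is not integrable.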

Fubini’s theorem

Given two measures \mu_1 and \mu_2 on X_1 and X_2 (resp.), we can form a product measure \mu = \mu_1 \times \mu_2 by defining the premeasure (not measure!) \mu(A \times B) = \mu_1(A) \mu_2(B) for all \mu_1-measurable A and \mu_2-measurable B, and then extending this to a measure using Carathéodory extension.

Fubini’s theorem then tells us that, for functions integrable with respect to \mu, integrating against the product measure \mu is the same as integrating against each of the factor measures \mu_i in turn (in either order.)
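
Integrability of the function being integrated matters here. As a sketch, take X_1 = X_2 = \{1, 2, 3, \dots\} with counting measure, so that integrals become sums, and consider the array (a standard example, used here purely for illustration) with a_{mn} equal to 1 if m = n, -1 if m = n + 1, and 0 otherwise:

```latex
\[
\sum_{m=1}^\infty \Big( \sum_{n=1}^\infty a_{mn} \Big) = 1 + 0 + 0 + \cdots = 1,
\qquad
\sum_{n=1}^\infty \Big( \sum_{m=1}^\infty a_{mn} \Big) = 0 + 0 + 0 + \cdots = 0,
\qquad
\sum_{m,n} |a_{mn}| = \infty.
\]
```

The two iterated sums disagree, which does not contradict Fubini’s theorem because the array is not integrable against the product measure.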

Whither the Fundamental Theorem of Calculus

The Lebesgue differentiation theorem states that, for any locally integrable f on \mathbb{R}^d, \lim_{m(B) \to 0, B \ni x} \frac{1}{m(B)} \int_B f(y) \,dy = f(x) for almost every x, where B ranges over balls containing x and m denotes Lebesgue measure.

As a corollary: recalling the definition of the derivative, this says taking the Lebesgue integral of any integrable function f and then differentiating will recover the original function f almost everywhere; this is one direction of the Fundamental Theorem.
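
In one dimension the link to the derivative can be written out directly (a sketch; F denotes the indefinite integral of an integrable f on [a,b], notation introduced here, and we take h > 0, the case h < 0 being similar). Apply the theorem with the intervals B = [x, x+h], which contain x and have m(B) = h:

```latex
\[
F(x) = \int_a^x f(y) \,dy, \qquad
\frac{F(x+h) - F(x)}{h} = \frac{1}{h} \int_x^{x+h} f(y) \,dy \;\longrightarrow\; f(x) \quad (h \to 0^+)
\]
```

for almost every x, i.e. F'(x) = f(x) a.e.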

In the opposite direction: if F is absolutely continuous on [a, b], then F' exists a.e. and is integrable, and satisfies F(x) - F(a) = \int_a^x F'(y) \,dy for all x \in [a,b]. Absolute continuity may appear to be an additional hypothesis, but its necessity is clear when we observe that functions that arise as indefinite integrals (i.e. those of the form x \mapsto \int_a^x f(y) \,dy with f an integrable function) are absolutely continuous.
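
The classic example showing that some such hypothesis is needed is the Cantor-Lebesgue function (a standard example, sketched here without proof). It is continuous and non-decreasing on [0,1] with F(0) = 0 and F(1) = 1, yet it is constant on each interval removed in the construction of the Cantor set, so F' = 0 almost everywhere:

```latex
\[
\int_0^1 F'(y) \,dy = 0 \;\neq\; 1 = F(1) - F(0).
\]
```

So F is continuous, and differentiable a.e. with integrable derivative, but it is not absolutely continuous and the formula fails.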