Note that this article assumes that the reader is familiar with metric spaces. At least at the level where open sets, closed sets, and subspaces are used. I am in the process of writing an article to describe the basics of metric spaces as well.

Today, we will cover some of the basics of a concept known as compactness. I recall that when I first learned about compactness, I was reading Jay Cummings’ book, Real Analysis: A Long-Form Textbook (you can find it in the list of Resources Worthy of Study on this site; this book is one of the best). In this book, Cummings calls the definition of a compact set “the greatest definition in all of mathematics”. Of course, this caught my attention and set the stage for my engagement with the concept. Looking back, I was very fortunate for this. If it weren’t for this characterization of compactness as being the greatest definition in math(s), I would have likely thought that this was one silly little thing that we would use once and then disregard in the future. Similar to how we define real numbers as equivalence classes of Cauchy sequences of rational numbers, and then disregard that entirely! This is why I mention this to you, in the hope that you, too, will benefit from knowing how important compactness is in the whole of mathematics. Moreover, we will cover more on compact sets here than in that particular book, so I hope this will be useful to all who read it. We will also cover some material about compact sets that is not covered in the classic text Baby Rudin. Hopefully, this will give you more intuition for the concept!

We will cover a few different characterizations of compactness, starting with the honest topological definition, as one of my professors put it. We give some examples of compact and non-compact subsets of real numbers. We then prove using the honest topological definition that the closed unit interval is compact. We then develop another characterization of compactness that deals with sequences of points. Along the way, we `accidentally’ find a third way to characterize compactness. This culminates with Theorem 4, which states that all three characterizations of compactness are equivalent. We end with a few consequences of compactness to really drive home why we care about it.

  1. The Honest Topological Definition
    1. Open Covers
    2. Topological Compactness
  2. Another Characterization of Compactness
    1. Sequential Compactness
    2. Topological Compactness and Sequential Compactness – The Connection
  3. Compactness Three Ways
  4. Consequences of Compactness
    1. Heini-Borel and Bolzano-Wierstrass Theorems
    2. Uniformly Continuity
  5. Extreme-Value Theorem
  6. See You Next Time!

The Honest Topological Definition

To begin, let’s first describe what an open cover is.

Open Covers

Definition (Open Covers): Let (X,d)(X,d) be a metric space. We call a collection {Ui}i\{U_i\}_{i\in \mathcal{I}} of open subsets (in XX) an open cover of XX if

X=iUiX = \bigcup_{i\in \mathcal{I}}U_i

If there is a finite subset II\subset \mathcal{I} such that {Ui}iI\{U_i\}_{i\in I} covers X,X, then we call {Ui}iI\{U_i\}_{i\in I} a finite subcover.

***Note that the index set \mathcal{I} can be finite, countably infinite, or uncountably infinite. It’s an arbitrary index set.***

At first glance, we might wonder why this is useful at all! It seems, at least to me, like something out of left field. But it turns out that this is precisely what we want to focus on. Let’s consider some examples with familiar metric spaces.

Example 1 ([0,1],||)([0,1], |\cdot|): (Click in the Discovery)

Let’s consider the metric space ([0,1],||).([0,1], |\cdot|). Note, this is the subspace of the real line with the Euclidean metric. We claim that the following is an open cover of [0,1].[0,1].

{[0,1100),(9991000,1]}{(1k,11k):k>2}.\Bigg\{ \left[0,\frac{1}{100}\right) \,,\, \left(\frac{999}{1000},1\right] \Bigg\} \cup \Bigg\{ \left(\frac{1}{k}\;,\;1-\frac{1}{k}\right)\;:\;k\gt2 \Bigg\} .

Indeed, each set is open in [0,1].[0,1]. Furthermore, every x[0,1]x\in [0,1] is contained in one of the intervals. Note that this open cover contains infinitely many open sets.

Next, we can see that this particular cover has a finite subcover,

{[0,1100),(11001,10001001),(9991000,1]}.\Bigg\{\left[0,\frac{1}{100}\right)\;,\; \left(\frac{1}{1001}\,,\,\frac{1000}{1001}\right) \;,\; \left(\frac{999}{1000},1\right] \Bigg\}.

So, as it turns out, we could cover [0,1][0,1] with only four of the open sets.

\blacksquare

Example 2 ([0,1],||)([0,1], |\cdot|): (Click in the Discovery)

Let’s consider again the metric space ([0,1],||).([0,1], |\cdot|). We claim that the following is another open cover of [0,1].[0,1].

{[0,1k2):k}{(12k,1]:k}.\Bigg\{ \left[0,\frac{1}{k^2}\right)\;:\;k\in \N \Bigg\} \cup \Bigg\{ \left(\frac{1}{2k},1\right] \;:\;k\in \N\Bigg\}.

Indeed, each set is open in [0,1].[0,1]. Furthermore, every x[0,1]x\in [0,1] is contained in one of the intervals. Note that this open cover contains infinitely many open sets.

Next, we can see that this particular cover has a finite subcover,

{[0,112),(12,1]}.\Bigg\{\left[0,\frac{\;1\;}{\;1^2}\right) \;,\; \left(\frac{1}{2}\,,\,1\right] \Bigg\} .

So, as it turns out, we could cover [0,1][0,1] with only two of the open sets.

\blacksquare

Example 3 ((0,1),||)((0,1), |\cdot|): (Click in the Discovery)

Let’s mix it up and consider the metric space ((0,1),||).((0,1), |\cdot|).

We claim that the following is an open cover of (0,1).(0,1).

{(1k,1):k}%\bigcup_{} \Bigg\{\left(\frac{1}{k}\;,\;1\right)\;:\;k\in \N\Bigg\}

Indeed, each set is open in (0,1).(0,1). Furthermore, we can see that every x(0,1)x\in (0,1) is contained in one of the intervals.

Note that this open cover contains infinitely many open sets. Furthermore, there is no finite subcover that covers all of (0,1).(0,1). Indeed, first note that

(1k+1,1)(1k,1).\left(\frac{1}{k+1}\;,\;1\right) \subset \left(\frac{1}{k}\;,\;1\right) .

Thus, if there were a finite subcover, then we would be able to cover (0,1)(0,1) with only one interval (1K,1).\left(\frac{1}{K},1\right). Where KK\in \N is the largest index that was contained in the finite subcover.

Thus, we see that some open covers have no finite subcovers.

\blacksquare

Example 4 ((0,1),||)((0,1), |\cdot|): (Click in the Discovery)

Let’s continue with the metric space ((0,1),||).((0,1), |\cdot|).

We claim that the following is an open cover of (0,1).(0,1).

{(0,110)}{(1k,1):k}.\Bigg\{ \left(0,\frac{1}{10}\right)\Bigg\}\cup %\bigcup_{k\in \N} \Bigg\{\left(\frac{1}{k}\;,\;1\right)\;:\;k\in \N\Bigg\}.

Indeed, each set is open in (0,1).(0,1). Furthermore, we can see that every x(0,1)x\in (0,1) is contained in one of the intervals.

Note that this open cover contains infinitely many open sets. Furthermore, there is a finite subcover this time. In particular,

{(0,110),(111,1)}\Bigg\{\left(0\;,\;\frac{1}{10}\right) \;,\; \left(\frac{1}{11}\;,\;1\right)\Bigg\}

covers the open unit interval.

\blacksquare

As the previous examples show, there are open covers with finite subcovers and others without. Furthermore, there are metric spaces, such as ((0,1),||)((0,1), |\cdot|) that have both infinite open covers and finite open covers. This leads to the Honest Topological Definition

Topological Compactness

Definition (Compact Sets): Let (X,d)(X,d) be a metric space. We say that XX is a compact metric space if every open cover of XX contains a finite subcover.

That’s it! The greatest definition in all of mathematics. How wonderfully strange and seemingly insignificant!

First observation: we can see that ((0,1),||)((0,1), |\cdot|) is definitely not compact since we found an open cover of ((0,1),||)((0,1), |\cdot|) that did not have a finite subcover in Example 3.

Second observation: We do not have enough information to deduce that ([0,1],||)([0,1], |\cdot|) is compact. To be compact, we must show that every possible open cover of [0,1][0,1] has a finite subcover. We’ve only shown that the two covers in Examples 1 and 2 have a finite subcover. However, it’s true that [0,1][0,1] is compact.

Theorem 1: The metric space ([0,1],||)([0,1], |\cdot|) is compact.

Proof: (Click in the Discovery)

Key Idea: We proceed by contradiction. So, suppose that we have an open cover of [0,1][0,1] that does not have a finite subcover. Then `cut’ [0,1][0,1] in half into [0,12]\left[0,\frac{1}{2}\right] and [12,1].\left[\frac{1}{2},1\right]. At least one of those halves requires an infinite number of open sets in the cover to cover it. Focusing on that half-sized interval, we cut it in half and deduce that one of those quarter-sized intervals requires an infinite number of the open sets in the cover to cover it. We continue this process indefinitely. From there, we choose a point in each of those intervals and produce a Cauchy sequence that converges. From there, we deduce a contradiction.

Notation: Let ([a,b])=ba\ell([a,b]) = b-a be the length of the interval [a,b].[a,b]. Similarly, for open intervals ((a,b))=ba.\ell((a,b)) = b-a.

Begining of Proof: To begin, let {Ui}i\{U_i\}_{i\in \mathcal{I}} be an arbitrary open cover of [0,1].[0,1]. Without loss of generality, we can suppose that every open set in the open cover is an open interval in [0,1].[0,1].

We aim to show that there is a finite subcover. To this end, assume for the hope of a contradiction that there is no finite number of open sets in {Ui}i\{U_i\}_{i\in \mathcal{I}} that cover [0,1].[0,1]. Denote I0=[0,1],I_0=[0,1], and cut it in half to get the intervals [0,12]\left[0,\frac{1}{2}\right] and [12,1].\left[\frac{1}{2},1\right]. It follows that either [0,12]\left[0,\frac{1}{2}\right] or [12,1]\left[\frac{1}{2},1\right] requires an infinite number of open sets from {Ui}i\{U_i\}_{i\in \mathcal{I}} to cover it. Indeed, if not, then both halves [0,12]\left[0,\frac{1}{2}\right] and [12,1]\left[\frac{1}{2},1\right] need only a finite number of open sets to cover, and hence the entire interval [0,1][0,1] needs only a finite number of open sets to cover it.

Let’s assume, without loss of generality, that the interval I1=[12,1]I_1=\left[\frac{1}{2},1\right] requires an infinite number of open sets from {Ui}i\{U_i\}_{i\in \mathcal{I}} to cover it. Remark: Note that I1I0I_1\subset I_0 and (I1)=12(I0)=12.\ell(I_1)= \frac{1}{2}\ell(I_0)= \frac{1}{2}.

Next, we cut I1=[12,1]I_1=\left[\frac{1}{2},1\right] in half and run the same argument again on [12,34]\left[\frac{1}{2},\frac{3}{4}\right] and [34,1].\left[\frac{3}{4},1\right]. Thus, we deduce that at least one of them requires an infinite number of open sets to cover it. Label this interval I2.I_2. Remark: Note that I2I1I0I_2\subset I_1\subset I_0 and (I2)=12(I1)=14.\ell(I_2)=\frac{1}{2}\ell(I_1)= \frac{1}{4}.

We continue in the manner, cut InI_n in half, and choose the subinterval that requires an infinite number of open sets from our cover to cover it and label it In+1.I_{n+1}. By induction, we see that (In+1)=12n+1.\ell(I_{n+1})= \frac{1}{2^{n+1}}.

Thus, we have a sequence of nested closed intervals In+1In,I_{n+1}\subset I_n, each of which requires an infinite number of open sets to cover and is shrinking in length: (In+1)=12n+1.\ell(I_{n+1})= \frac{1}{2^{n+1}}.

We now show that there is nIn={α}\bigcap_{n}I_n=\{\alpha\} for some α[0,1].\alpha\in [0,1]. Indeed, for each of the intervals, In=[xn,yn],I_n = [x_n,y_n], choose the left endpoint of the interval in order to construct a sequence (xn)n[0,1].(x_n)_{n\in \N}\subset [0,1]. Furthermore, we can see that (xn)n(x_n)_{n} is Cauchy. Indeed, let ε>0.\varepsilon>0. There is some NN\in \N such that 12N<ε.\frac{1}{2^N}<\varepsilon. Consequently, for all m,nm,n\in \N such that m,n>Nm,n>N we have xm,xnINx_m,x_n\in I_N (since InINI_{n}\subset I_N and ImINI_{m}\subset I_N ) and thus |xnxm|<12N<ε.|x_n-x_m|<\frac{1}{2^N}<\varepsilon. Therefore, we know that (xn)n(x_n)_{n} converges to some α[0,1].\alpha\in [0,1]. In particular, αIn\alpha\in I_n for all n.n\in \N. Therefore, αnIn.\alpha\in \bigcap_{n}I_n.

To see that nIn={α},\bigcap_{n}I_n=\{\alpha\}, suppose that there was some βα\beta\neq \alpha such that βnIn.\beta \in \bigcap_{n}I_n. Then, let ε=|αβ|>0.\varepsilon = |\alpha-\beta|>0. However, there is some NN\in \N such that 12N<ε,\frac{1}{2^N}<\varepsilon, and therefore both α\alpha and β\beta we cannot be in IN.I_N. Thus, a contradiction and therefore nIn={α}.\bigcap_{n}I_n=\{\alpha\}.

Since, α[0,1]\alpha\in [0,1] it follows that there is some open set UiαU_{i_{\alpha}} in the cover such that αUiα.\alpha \in U_{i_{\alpha}}.

Let (Uiα)=L>0.\ell(U_{i_{\alpha}}) = L>0. There is some NN\in \N such that 12N<L\frac{1}{2^N}<L and consequently, INUiα.I_N\subset U_{i_{\alpha}}. However, the intervals InI_n were constructed so that each of them required an infinite number of open sets to cover them. However, INI_N only requires one open set, namely UiαU_{i_{\alpha}} to cover it. This is our contradiction.

We conclude that {Ui}i\{U_i\}_{i\in \mathcal{I}} has a finite subcover of [0,1].[0,1]. Hence ([0,1],||)([0,1], |\cdot|) is compact.

\square

Some authors and professors motivate compactness as being the next best thing to being finite. We will see why more clearly later, however let’s see that finite spaces are compact.

Theorem 2: Let (X,d)(X,d) be a finite metric space. That is, |X|<.|X|<\infty. Then (X,d)(X,d) is compact.

Proof: (Click in the Discovery)

Let (X,d)(X,d) be a finite metric space and let {Ui}i\{U_i\}_{i\in \mathcal{I}} be an open cover of X.X. It follows that each xUix.x \in U_{i_x}. Thus, {Uix:xX}\{U_{i_x}\;:\;x\in X\} forms a finite subcover, since XX is a finite set.

\square

Another Characterization of Compactness

The open cover definition can be tricky to apply in many situations. Consequently, we are motivated to find other equivalent properties.

Sequential Compactness

In the first section, we learned about what we’ll briefly call topological compactness. In this section, we will learn about sequential compactness.

Definition (Sequentially Compact) : Let (X,d)(X,d) be a metric space. We say that XX is sequentially compact if every sequence (xn)nX(x_n)_{n\in \N}\subset X contains a convergent subsequence.

For those of you who have taken real analysis, you might recall the Bolzano-Weierstrass theorem, which states, using our new vocabulary, that closed and bounded sets of real numbers are sequentially compact. That is, any bounded sequence of real numbers has a convergent subsequence. The standard proof, and the proof given here, rely on limsups and liminfs. However, there is another proof which I find particularly beautiful. So, I will subject those willing to read the following to a proof of this property. It relies on my favorite theorem from elementary real analysis, the monotone convergence theorem.

Theorem 3: Let a,ba,b\in \R be such that a<b.a<b. Then, [a,b][a,b] is sequentially compact.

I.e. every sequence (xn)n[a,b](x_n)_{n\in \N}\subset [a,b] contains a convergent subsequence.

I.e. (part two) Every bounded sequence of real numbers contains a convergent subsequence.

Proof: (Click in the Discovery)

Let (xn)n(x_n)_{n\in \N} be a sequence of real numbers that is bounded below by aa and bounded above by b.b. I.e. (xn)n[a,b].(x_n)_{n\in \N}\subset [a,b]. We claim that (xn)n(x_n)_{n\in \N} contains a monotone subsequence. Note, this proves the theorem since, by the monotone convergence theorem, a bounded monotone sequence converges.

Terminology: Before we begin, we call an element xkx_{k} from our sequence a tiptop element or point if xmxkx_{m}\leq x_k for all m>k.m>k. That is, every point further into the sequence is less than or equal to xk.x_{k}.

There are two cases we need to consider (i) there are an infinite number of tiptop points, or (ii) there are only a finite number of tiptop points.

In case (i), the sequence of tiptop points forms a monotone decreasing subsequence. Thus, we are done.

In case (ii), let xNx_{N} be the last tiptop point. It follows that for all xkx_{k} with kN,k\geq N, there is some xmx_{m} such that xkxm.x_{k} \leq x_m. I.e., there is a monotone increasing subsequence! Thus, we are done again!

\square

Corollary: The metric space ([0,1],||)([0,1], |\cdot|) is sequentially compact.

Topological Compactness and Sequential Compactness – The Connection

As you might have guessed based on the fact that topological compactness and sequential compactness both use the word compact(ness) they are connected. In fact, they are equivalent! This is the content of the next theorem.

Theorem 4: Let (X,d)(X,d) be a metric space. Then,

XX is compact \iff XX is sequentially compact.

Note: No one uses the phrase `topologically compact’. They only say compact, and from now on, we will do the same.

I find this result extremely beautiful.

Proof of Theorem 4 Forward Direction (CompactSequentiallyCompact)(\mathrm{Compact}\implies \mathrm{Sequentially}\;\mathrm{Compact}): (Click in the Discovery)

Let (X,d)(X,d) be a metric space.

Forward: XX is compact \implies XX is sequentially compact.

Let XX be compact and let (xn)nX(x_n)_{n\in \N}\subset X be a sequence of elements from X.X. We aim to show that there is a subsequence (xnk)k(x_{n_k})_k that converges to some x0X.x_0\in X. To this end, consider the following claim:

Claim 1: There is some point x0Xx_0\in X such that for all ε>0,\varepsilon>0, the open ball B(x0,ε)B(x_0\,,\,\varepsilon) contains an infinite number of elements from the sequence (xn)n.(x_n)_n. That is, there is an infinite number of nn\in \N such that xnB(x0,ε).x_n\in B(x_0\,,\,\varepsilon).

We continue by contradiction. Suppose that for all xXx\in X there is some ε(x)>0,\varepsilon(x)>0, such that each open ball B(x,ε(x))B(x\,,\,\varepsilon(x)) contains only a finite number of elements from (xn)n.(x_n)_n. Next, note that the collection {B(x,ε(x)):xX}\{B(x\,,\,\varepsilon(x))\;:\; x\in X\} forms an open cover of X.X. It follows by compactness, there is a finite collection of open balls, those open balls {B(p1,ε(p1)),B(p2,ε(p2)),,B(pN,ε(pN))}\{ B(p_1\,,\,\varepsilon(p_1))\;,\; B(p_2\,,\,\varepsilon(p_2))\,,\, \cdots , B(p_N\,,\,\varepsilon(p_N))\} that cover X.X.

Now, we have reached our contradiction. Indeed, since each open ball in the finite cover contains only a finite number of elements from (xn)n(x_n)_{n} we conclude that there are only a finite number of elements (or natural numbers) in (xn)n.(x_n)_{n}. Hence, claim 1 is true.

Now, using claim 1, we construct a convergent subsequence of (xn)n.(x_n)_{n}.

Let εn=2n>0.\varepsilon_n = 2^{-n}>0. Then, let xn1B(x0,ε1).x_{n_1} \in B(x_0,\varepsilon_1). Which exists by claim 1. Next, since there is an infinite number of elements from (xn)n(x_n)_{n} in B(x0,ε2), B(x_0,\varepsilon_2), there is some xn2B(x0,ε2)x_{n_2} \in B(x_0,\varepsilon_2) such that n2>n1.n_2>n_1. We continue inductively. Suppose that we have determined xn2,,xnk,x_{n_2}, \cdots , x_{n_k}, then we let xnk+1B(x0,εk+1)x_{n_{k+1}}\in B(x_0,\varepsilon_{k+1}) such that nk+1>nk.n_{k+1}>n_k.

We claim that the subsequence (xnk)k(x_{n_k})_k converges to x0.x_0. Indeed, let ε>0.\varepsilon>0. It follows that there is some NN\in \N such that εN=2N<ε.\varepsilon_N = 2^{-N}<\varepsilon. Consequently, for all n>Nn>N we have xnB(x0,εN),x_n\in B(x_0,\varepsilon_N),and thus d(x,x0)<ε.d(x,x_0)<\varepsilon.

\square

To avoid useless complications in proving the backward direction of Theorem 4, we first bring into play a definition and two lemmas.

Lemma 1: Let (X,d)(X,d) be a sequentially compact metric space and {Ui}i\{U_i\}_{i\in \mathcal{I}} be an open cover of X.X. Then, there is some ε>0\varepsilon>0 such that for all ε\varepsilon-balls, BB (i.e. open balls with radius ε\varepsilon), there is some open set UiU_i that contains B.B.

Proof of Lemma 1: (Click in the Discovery)

We prove this by contradiction. That is, suppose that for all ε>0,\varepsilon>0, there is some open ball BB with radius ε\varepsilon that is not contained in any single open set Ui.U_i. In particular, for all 1n>0,\frac{1}{n}>0, there is some open ball BnB_n with radius 1n\frac{1}{n} that is not contained in any open set Ui.U_i.

We focus on the centers of the balls BnB_n and deduce a contradiction, so let pnBnp_n\in B_n be the ball’s center.

Consider the sequence of centers (pn)nX.(p_n)_n\subset X. By sequential compactness, there is some convergent subsequence denoted (pnk)k.(p_{n_k})_k. Let pnkpXp_{n_k}\rightarrow p\in X as k.k\rightarrow \infty. Since {Ui}i\{U_i\}_{i\in \mathcal{I}} covers X,X, there is some UipU_{i_p} that contains p.p. Since UipU_{i_p} is open, there is some δ>0\delta>0 such that B(p,δ)Uip.B(p,\delta)\subset U_{i_p}.

Next, since pnkpp_{n_k}\rightarrow p as kk\rightarrow \infty there is some KK\in \N such that for all k>Kk>K we have d(pnk,p)<δ/2.d(p_{n_k},p)<\delta/2. Furthermore, there is some MM\in \N such that for all m>Mm>M we have 1nm<δ/2.\frac{1}{n_m}<\delta/2. Let 𝒦=max{K,M}.\mathcal{K} = \max{\{K,M\}}. We claim that for all k>𝒦k>\mathcal{K} we have BnkB(p,δ).B_{n_k}\subset B(p,\delta). Indeed, let xBnk=B(pnk,1nm).x\in B_{n_k} =B\left(p_{n_k},\frac{1}{n_m}\right). It follows

d(p,x)d(p,pnk)+d(pnk,x)<δ.d(p,x) \leq d(p,p_{n_k})+d(p_{n_k},x) \lt\delta.

Thus, we see that BnkB(p,δ)Uip.B_{n_k}\subset B(p,\delta)\subset U_{i_p}. This contradicts the assumption that BnkB_{n_k} was not contained in any open set in the cover. Thus, proving Lemma 1.

\square

We now introduce a type of boundedness with a flavor similar to that of compactness. In that, it has to do with covering the space.

Definition (Totally Bounded) : Let (X,d)(X,d) be a metric space. We say that XX is totally bounded if for all ε>0,\varepsilon>0,we can cover XX with a finite number of ε\varepsilon-balls (that is, open balls whose radius is equal to ε\varepsilon).

Remark: Note that this simply says, there exists a finite covering of XX with ε\varepsilon-balls. Not that every cover has a finite subcover.

Lemma 2: Let (X,d)(X,d) be a sequentially compact metric space. Then, XX is totally bounded.

Proof of Lemma 2: (Click in the Discovery)

Key Idea: We will show that if XX is not totally bounded, then there would be a sequence that has no convergent subsequence. Thus contradicting that XX is sequentially compact.

We proceed by contradiction. Suppose that there were some ε>0\varepsilon>0 so that there is no finite subcover of {B(p,ε):pX}.\{B(p\,,\,\varepsilon)\;:\; p\in X\}. Then, we claim:

Claim: There is a sequence of points p1,p2,Xp_1,p_2,\cdots \in X such that d(pi,pj)εd(p_i,p_j)\geq \varepsilon for all ij.i\neq j.

To see this, we construct p1,p2,Xp_1,p_2,\cdots \in X inductively.

Choose any p1X.p_1\in X. Suppose that we have chosen p2,,pnX.p_2,\cdots , p_n\in X. It follows that there is some pn+1p_{n+1} such that d(pn+1,pi)εd(p_{n+1},p_i)\geq \varepsilon for all 1in.1\leq i\leq n. If not, then for all pn+1Xp_{n+1}\in X there is some 1in1\leq i\leq n such that d(pn+1,pi)<ε.d(p_{n+1},p_i)< \varepsilon. However, this implies pn+1B(pi,ε),p_{n+1}\in B(p_i,\varepsilon), and hence Xi=1nB(pi,ε).X\subset \bigcup_{i=1}^n B(p_i,\varepsilon). This contradicts the assumption that there was no finite subcover of {B(p,ε):pX}.\{B(p\,,\,\varepsilon)\;:\; p\in X\}. Hence, we can find some pn+1p_{n+1} such that d(pn+1,pi)εd(p_{n+1},p_i)\geq \varepsilon for all 1in.1\leq i\leq n. Thus our claim holds.

By our claim, the sequence (pn)nX(p_n)_n\in X is such that d(pi,pj)εd(p_i,p_j)\geq \varepsilon for all ij.i\neq j. Thus, (pn)n(p_n)_n has no convergent subsequence since no subsequence can be Cauchy. Indeed, since d(pi,pj)εd(p_i,p_j)\geq \varepsilon for all ij,i\neq j, the sequence never gets arbitrarily close to itself. This contradicts the fact that XX is sequentially compact. Thus, we must have a finite subcover of ε\varepsilon-balls: {B(p1,ε),,B(pn,ε)},\{B(p_1\,,\,\varepsilon)\;,\; \cdots, B(p_n\,,\,\varepsilon)\}, that covers X.X.

\square

We are now ready to finish the proof of Theorem 4.

Proof of Theorem 4 Backward Direction (SequentiallyCompactCompact)(\mathrm{Sequentially}\;\mathrm{Compact}\implies \mathrm{Compact}): (Click in the Discovery)

Backward: XX is sequentially compact\impliesXX is compact.

Let XX be sequentially compact and let {Ui}i\{U_i\}_{i\in \mathcal{I}} be an open cover of X.X. We aim to show that there is a finite subcover. Next, let ε>0\varepsilon>0 be such that Lemma 1 holds. That is, every ε\varepsilon-ball {B(p,ε):pX}\{B(p\,,\,\varepsilon)\;:\; p\in X\} is contained in some open set Uip{Ui}iU_{i_p}\in \{U_i\}_{i\in \mathcal{I}} in the open cover. I.e., for all pX,p\in X, there is some Uip{Ui}iU_{i_p}\in \{U_i\}_{i\in \mathcal{I}} such that B(p,ε)Uip.B(p\,,\,\varepsilon)\subset U_{i_p}.

By Lemma 2, there is a finite subcover of the cover of ε\varepsilon-balls, {B(p1,ε),,B(pn,ε)}\{B(p_1\,,\,\varepsilon)\;,\; \cdots, B(p_n\,,\,\varepsilon)\} that covers X.X. However, this implies that there is finite subcover {Ui1,,Uin}\{U_{i_1},\cdots, U_{i_n}\} covers X.X. Indeed, since every B(pk,ε)Uipk.B(p_k\,,\,\varepsilon)\subset U_{i_{p_k}}.Thus concluding the proof.

\square

Compactness Three Ways

What a wonderful result Theorem 4 is. It gives us a topological and a sequential way to prove compactness. And, as we can see in the proof for Theorem 1 and Theorem 3, one method might be way easier to prove than the other.

In the proof of Theorem 4, more precisely, the backward direction, we introduced the notion of total boundedness. Furthermore, we proved in Lemma 2 that sequential compactness implies total boundedness. This might make us wonder whether or not total boundedness is equivalent to compactness (equivalently, sequential compactness) as well. Unfortunately, this is not the case. Although it is almost the case! We must make one more assumption on XX for total boundedness to imply compactness. This leads us to our main theorem:

Theorem 5: Let (X,d)(X,d) be a metric space. Then the following are equivalent.

  1. XX is compact;
  2. XX is sequentially compact;
  3. XX is totally bounded and complete.

Remark: Recall that a metric space (X,d)(X,d) is complete if every Cauchy sequence (xn)nX(x_n)_n\subset X converges to some xX.x\in X.

Remark: We’ve already seen that (1) iff (2). Furthermore, we’ve seen that sequential compactness implies total boundedness. Thus, to prove Theorem 5, we need only show that sequential compactness implies completeness and that (3) implies (2). Indeed, then we will have shown: (1)\iff(2)\iff(3).

Proof : (Click in the Discovery)

\bigstar The rest of (2) \implies(3)

Let (X,d)(X,d) be a sequentially compact metric space. We aim to prove that XX is totally bounded and complete. By Lemma 2 XX is totally bounded. To prove completeness, consider some Cauchy sequence (xn)nX,(x_n)_{n\in \N}\subset X, we need to show that xnxx_n \rightarrow x for some xX.x\in X.

To begin, by sequential compactness, there is some convergent subsequence (xnk)k(x_{n_k})_{k} with limit xX.x\in X. We claim that xnxx_n \rightarrow x too.

Let (xn)n(x_n)_n is Cauchy, there is some NN\in \N such that for all n,m>Nn,m>N we have d(xn,xm)<ε/2.d(x_n,x_m)<\varepsilon/2. Furthermore, there is some KK\in \N such that for all k>Kk>K we have d(xnk,x)<ε/2.d(x_{n_k},x)<\varepsilon/2. Consequently, for all n>max{nK,N}n>\max{\{n_K,N\}} we have

d(xn,x)d(xn,xnK)+d(xnK,x)<ε.d(x_n,x) \leq d(x_n,x_{n_K}) +d(x_{n_K},x) \lt \varepsilon.

\bigstar Now we aim to show (3) \implies(2)

To this end, let XX be totally bounded and complete, and let (xn)nX.(x_n)_{n\in \N}\subset X. We will show that there is a convergent subsequence of (xn)n.(x_n)_{n}. Here’s the key idea. Let εn=1/n>0.\varepsilon_n = 1/n>0. We cover XX with finitely many ε1\varepsilon_1-balls. We can do this since XX is totally bounded. It follows that there is an open ball that contains an infinite number of elements from (xn)n.(x_n)_{n}. Now ignore all other elements from the original sequence. Cover XX with finitely many ε2\varepsilon_2-balls. Note, from the elements remaining from the original sequence, there is one open ε2\varepsilon_2-ball that contains an infinite number of those! We continue this process and produce a subsequence that is Cauchy. By completeness, it converges. We will make this precise, and in turn change some of the details for convenience.

This type of proof is called a diagonal proof and is always accompanied with a picture that looks like the following.

We will construct the subsequence as follows. First, let (xn(1))n(x_n^{(1)})_{n} be our original sequence (xn)n.(x_n)_{n}. For m2,m\geq 2,we let (xn(m))n(x_n^{(m)})_{n} have the following properties:

  • (xn(m))n(x_n^{(m)})_{n} is a subsequence of (xn(m1))n(x_n^{(m-1)})_{n} and
  • (xn(m))n(x_n^{(m)})_{n} is contained in a ball with radius εm=1/m.\varepsilon_m=1/m.

We use induction. We alrady have (xn(1))n=(xn)n.(x_n^{(1)})_{n} =(x_n)_n. Let m2m\geq 2 and suppose that we have defined (xn(m))n(x_n^{(m)})_{n} for k<m.k<m. Cover XX with finitely many εm\varepsilon_m-balls. It follows that there are an infinite number of elements from (xn(m1))n(x_n^{(m-1)})_{n} in at least one of the εm\varepsilon_m-balls. Denote this particular ball Bm.B_m. We then let x1(m)x_1^{(m)} be the first element from (xn(m1))n(x_n^{(m-1)})_{n} that is in Bm.B_m. Then we let the x2(m)x_2^{(m)} be the second element from (xn(m1))n(x_n^{(m-1)})_{n}that is in Bm.B_m. Etc.

We now construct a diagonal sequence from the (xn(m))n.(x_n^{(m)})_{n}. This is where the picture comes in. Let (an)n(a_n)_{n} be the sequence such that an=xn(n).a_n=x_n^{(n)}. We can see that (an)n(a_n)_{n} is indeed a subsequence of (xn)n.(x_n)_{n}. Furthermore, we see that (an)n(a_n)_{n} is Cauchy. Indeed, let ε>0.\varepsilon>0. There is some NN\in \N such that εn=1/n<ε/2.\varepsilon_n = 1/n<\varepsilon/2. Thus, for all m,n>N,m,n>N, we have (an)n>N(a_n)_{n>N} being a subsequence of (xn(N))n.(x_n^{(N)})_{n}. Thus, d(an,am)<2εn<ε.d(a_n,a_m)<2\varepsilon_n<\varepsilon. This concludes the proof.

\square

Consequences of Compactness

Heini-Borel and Bolzano-Wierstrass Theorems

An easy corollary of Theorem 5 is the following deep result we tend to learn in real analysis.

Theorem 6 (Heini-Borel and Bolzano-Wierstrass Theorems): Let EE be a subset of .\R. Then the following are equivalent.

  1. EE is compact;
  2. EE is sequentially compact;
  3. EE is closed and bounded.

Uniformly Continuity

Another great theorem is the generalization of the following: If f:[a,b]f:[a,b]\rightarrow \R is continuous, then ff is uniformly continuous. But first, recall what it means for a function to be uniformly continuous in general metric spaces.

Definition (Uniformly Continuous) : Let (X,dX)(X,d_X) and (Y,dY)(Y,d_Y) be metric spaces. We say that f:XYf:X\rightarrow Y is uniformly continuous on XX if for all ε>0,\varepsilon>0, there exists a δ>0\delta>0 such that dY(f(a),f(b))<εd_Y(f(a),f(b))<\varepsilon whenever d(a,b)<δd(a,b)<\delta where a,bX.a,b\in X.1

Let’s show that continuous functions on compact sets are uniformly continuous.

Theorem 6: Let (X,dX)(X,d_X) and (Y,dY)(Y,d_Y) be metric spaces. If (X,dX)(X,d_X) is compact and f:XYf:X\rightarrow Y is continuous, then ff is uniformly continuous.

Proof : (Click in the Discovery)

Let (X,dX)(X,d_X) be a compact metric space and let f:XYf:X\rightarrow Y be continuous.

Let ε>0.\varepsilon>0. We aim to find some δ>0\delta>0 such that for all a,bXa,b\in X where dX(a,b)<δd_X(a,b)<\delta we have dY(f(a),f(b)<ε.d_Y(f(a),f(b)<\varepsilon.

To begin, since ff is continuous on X,X, for all pXp\in X there is some δp>0\delta_p>0 so that dX(p,q)<δpd_X(p,q)<\delta_p implies dY(f(p),f(q))<ε/2.d_Y(f(p),f(q))<\varepsilon/2. Note that δp\delta_p depends on the point pXp\in X chosen. Next, we cover XX with δp/2\delta_p/2-balls. I.e., we’re using the open cover, {BX(p,δp/2)}pX\{B_X(p,\delta_p/2) \}_{p\in X} where BX(p,δp/2)={qX:dX(p,q)<δp/2}B_X(p,\delta_p/2) = \{q\in X\;:\;d_X(p,q)<\delta_p/2\} is an open ball in (X,dX).(X,d_X). Since (X,dX)(X,d_X) is compact, there is a finite subcover denoted {BX(p1,δp1/2),,BX(pn,δpn/2)}.\{B_X(p_1,\delta_{p_1}/2),\cdots, B_X(p_n,\delta_{p_n}/2) \}.

Let δ:=12min{δpi:1in}>0.\delta:=\frac{1}{2}\min{\{\delta_{p_i}\;:\;1\leq i\leq n\}}>0. We claim this does the job. Indeed, for all a,bXa,b\in X where dX(a,b)<δd_X(a,b)<\delta we have a,bB(pi,δpi/2)a,b\in B(p_i,\delta_{p_i}/2) for some 1in.1\leq i\leq n. Indeed, note that aB(pi,δpi/2)a\in B(p_i,\delta_{p_i}/2) for some 1in1\leq i\leq n since the open balls cover X.X. Furthermore, we have dX(b,pi)dX(a,b)+dX(a,pi)<δ+δpi2δpi.d_X(b,p_i)\leq d_X(a,b)+d_X(a,p_i) <\delta + \frac{\delta_{p_i}}{2}\leq\delta_{p_i}. Hence a,bB(pi,δpi/2)a,b\in B(p_i,\delta_{p_i}/2) as claimed.

Finally, by definition of δ,\delta, dX(a,b)<δd_X(a,b)<\delta implies

dY(f(a),f(b))dY(f(a),f(pi))+dY(f(pi),f(b))<ε2+ε2=ε.d_Y(f(a),f(b)) \leq d_Y(f(a),f(p_i)) +d_Y(f(p_i),f(b)) \lt \frac{\varepsilon}{2}+ \frac{\varepsilon}{2}= \varepsilon.

Thus concluding the proof.

\square

The reason why compactness was crucial was so that we could set δ:=12min{δpi:1in}\delta:=\frac{1}{2}\min{\{\delta_{p_i}\;:\;1\leq i\leq n\}} and have it be positive. This is because the minimum of a finite collection of positive numbers is still positive. If we didn’t have compactness and still tried to run the same argument, we’d have to have δ:=12inf{δp:pX}.\delta:=\frac{1}{2}\inf{\{\delta_{p}\;:\;p\in X\}}. There is no reason why this has to be non-zero. That’s the key.

Extreme-Value Theorem

Recall the Extreme Value Theorem from real analysis. We now give a more general form of it using compact metric spaces.

Theorem 7 (Extreme-Value Theorem): Let (X,d)(X,d) and (,||)(\R,|\cdot|) be metric spaces. If (X,dX)(X,d_X) is compact and f:Xf:X\rightarrow \R is continuous, then ff attains its maximum and minimum. That is, there are a,bXa,b\in X such that

f(a)=inf{f(x):xX}andf(b)=sup{f(x):xX}.f(a) = \inf\{f(x)\;:\;x\in X\} \;\;\;\mathrm{and}\;\;\;f(b) = \sup\{f(x)\;:\;x\in X\}.

However, to prove this theorem, we will actually prove another, more general theorem.

Theorem 8 (The continuous image of a comact set is compact): Let (X,dX)(X,d_X) and (Y,dY)(Y,d_Y) be metric spaces. If (X,dX)(X,d_X) is compact and f:XYf:X\rightarrow Y is continuous, then f(X)={yY:f(x)=y,forsomexX}f(X) = \{y\in Y\;:\;f(x) = y,\;\mathrm{for}\;\mathrm{some}\;x\in X\} is compact.

In this proof, we will need to use an equivalent definition of continuity to the standard εδ\varepsilon-\delta definition that relies on open sets. It goes as follows:

Definition/Theorem (Continuous) : Let (X,dX)(X,d_X) and (Y,dY)(Y,d_Y) be metric spaces. We say that f:XYf:X\rightarrow Y is continuous on XX if for all open sets UY,U\subset Y, the preimage

f1(U)={xX:f(x)U}Xf^{-1}(U) = \{x\in X\;:\;f(x)\in U\}\subset X

is open.

I apologize for skipping a proof that shows that the above characterization continuity is the same as the standard εδ\varepsilon-\delta characterization. This article is long, and so some stuff must get cut that’s not about compactness. To get started, use the fact that we can express the εδ\varepsilon-\delta definition in terms of open εδ\varepsilon-\delta-balls. This might be a fun problem for you to try and work out on your own!

Proof of Theorem 8 : (Click in the Discovery)

Let (X,dX)(X,d_X) be a compact metric space and let f:XYf:X\rightarrow Y be continuous. Since we aim to show that f(X)f(X) is compact, we let {Vi}i\{V_i\}_{i\in \mathcal{I}} be an open cover of f(X).f(X). Note, ViY.V_i\subset Y. We need to find a finite subcover.

First, since ff is continuous, the preimages of the open sets in the cover are open. Thus, {f1(Vi)}i\{f^{-1}(V_i)\}_{i\in \mathcal{I}} is an open collection of open sets in X.X. We claim that {f1(Vi)}i\{f^{-1}(V_i)\}_{i\in \mathcal{I}} is an open cover of X.X. Indeed, let xXx\in X and observe that f(x)f(X).f(x)\in f(X). Thus, f(x)Vi0f(x)\in V_{i_0} for some i0i_0 hence, xf1(Vi0).x\in f^{-1}(V_{i_0}). Consequently, Xif1(Vi).X\subset \bigcup_{i\in \mathcal{I}}f^{-1}(V_i).

Next, by the compactness of X,X, there is a finite subcover {f1(Vik)}k=1n\{f^{-1}(V_{i_k})\}_{k=1}^n that covers X.X. We claim that the corresponding {Vik}k=1n\{V_{i_k}\}_{k=1}^n cover f(X).f(X). Indeed, let f(x)f(X).f(x)\in f(X). It follows that xf1(Vik)x\in f^{-1}(V_{i_k}) and thus f(x)Vik.f(x)\in V_{i_k}.

Thus concluding the proof.

\square

Note that the extreme value theorem follows directly since compact sets of real numbers are closed and bounded.

See You Next Time!

We have covered a lot of material today. Hopefully, you have found it interesting, helpful, or useful. Or all three!

As stated earlier, some people motivate compactness as being the next best thing to being finite. The reason is that finite metric spaces always have finite subcovers, they always have convergent subsequences, etc. This follows from our main theorem, Theorem 5.

There is more to be said about compact sets; for instance, we didn’t mention anything about metric spaces of continuous functions. This leads to a discussion that culminated in the Arzelà–Ascoli theorem. However, that is a story for another day! In the mean

Since math(s) is not a spectator sport, I will leave a few exercises for you to struggle with and practice! You’re welcome! As always, I’m not great at ending these articles, so we will just end it now with the exercises. Leave your answers below! Oh, if you find any errors or typos, please leave a comment so that they can be fixed!

  1. Show that closed subsets of compact sets are compact.
  2. Show that finite metric spaces are sequentially compact using only the definition of sequential compactness. That is, without using any of the theorems here!
  3. Prove that the εδ\varepsilon-\delta definition of continuity is equivalent to the open sets definition of continuity.
  4. Here’s a classic: Let (X,d)(X,d) be a metric space. A collection of closed sets {Eα}α𝒜\{E_{\alpha}\}_{\alpha\in \mathcal{A}} in XX has the finite intersection property if for all finite subsets A𝒜,A\subset \mathcal{A}, the intersection αAEα\bigcap_{\alpha\in A}E_{\alpha} is non-empty. Here’s the question: prove that the following are equivalent
    • (X,d)(X,d) is compact,
    • (X,d)(X,d) has the following property: If the collection of closed subsets {Eα}α𝒜\{E_{\alpha}\}_{\alpha\in \mathcal{A}} in XX that has the finite intersection property, then
α𝒜Eα.\bigcap_{\alpha\in \mathcal{A}}E_{\alpha}\neq \emptyset.

***Note this intersection is over all of α𝒜.\alpha\in \mathcal{A}.***

Be Kind. Be Curious. Be Compassionate. Be Creative.

And Have Fun!


Footnote:

  1. Note the difference between being continuous and uniformly continuous. It’s easier to see when written in symbols:
    Continuous on XX: If aX,ε>0,δ(a)>0:bwhered(a,b)<δd(f(a),f(b))<ε.\forall a\in X,\,\forall \varepsilon>0,\; \exists \delta(a)>0 \;:\; \forall b\;\mathrm{where}\;d(a,b)<\delta \implies d(f(a),f(b))<\varepsilon.
    Uniformly Continuous on XX: If ε>0,δ>0:a,bXwhered(a,b)<δd(f(a),f(b))<ε.\forall \varepsilon>0,\; \exists \delta>0 \;:\; \forall a,b\in X\;\mathrm{where}\;d(a,b)<\delta \implies d(f(a),f(b))<\varepsilon.
    The key difference is the fact that δ\delta is dependent on aa when the function is only continuous. However, δ\delta is independent of aa when the function is uniformly continuous.
    ↩︎

Leave a comment