Cantor-Schröder-Bernstein theorem

by Patrick Stevens Aug 6 2016 updated Aug 27 2016

This theorem tells us that comparing sizes of sets makes sense: if one set is smaller than another, and the other is smaller than the one, then they are the same size.

[summary: The Cantor-Schröder-Bernstein theorem tells us that if one set is smaller-than-or-equal-to another, and the other is smaller-than-or-equal-to the one, then they are the same size. That is, "comparison of set sizes" behaves itself, analogously to the fact in the natural numbers that it can't be the case that both $~$1 < 2$~$ and $~$2<1$~$.]

[summary(Technical): The Cantor-Schröder-Bernstein theorem states that the [class_set_theory class] of cardinals forms a total order. (Though not a totally ordered set, because [cardinals_form_a_proper_class there is no set of all cardinals]).]

Recall that we say a set $~$A$~$ is smaller than or equal to a set $~$B$~$ if there is an injection from $~$A$~$ to $~$B$~$. The set $~$A$~$ has the same size as $~$B$~$ if there is a bijection between $~$A$~$ and $~$B$~$.

The Cantor-Schröder-Bernstein theorem states that if $~$A$~$ is smaller than or equal to $~$B$~$, and $~$B$~$ is smaller than or equal to $~$A$~$, then $~$A$~$ and $~$B$~$ have the same size. This tells us that the arithmetic of cardinals is well-behaved in that it behaves like a total order. It is similar to the arithmetic of the natural numbers in that it can never be the case that simultaneously $~$a < b$~$ and $~$b < a$~$.


There are several proofs, some concrete and some less so.

Concrete proof

A clear explanation of the intuition of this proof has been written by Richard Evan Schwartz; see page 61 of the linked PDF, or search on the word "dog" (which appears in the first page of the explanation). [comment: The intuition is Math0 suitable.]

Let $~$f: A \to B$~$ and $~$g: B \to A$~$ be injections; we will define a bijective function $~$h: A \to B$~$.

Because $~$f$~$ is injective, if $~$f$~$ ever hits $~$b$~$ (that is, if there is $~$a \in A$~$ such that $~$f(a) = b$~$) then it is possible to define $~$f^{-1}(b)$~$ to be the unique $~$a \in A$~$ such that $~$f(a) = b$~$; similarly with $~$g$~$. The slogan is "if $~$f^{-1}(a)$~$ exists, then it is well-defined: there is no leeway about which element we might choose to be $~$f^{-1}(a)$~$".

Fix some $~$a \in A$~$, and consider the sequence $$~$\dots, f^{-1}(g^{-1}(a)), g^{-1}(a), a, f(a), g(f(a)), \dots$~$$ Now, this sequence might not extend infinitely to the left; it may not even get past $~$a$~$ to the left (if $~$g^{-1}(a)$~$ doesn't exist, for instance). %%note:On the other hand, perhaps the sequence does extend infinitely to the left.%% Also, the sequence might duplicate elements: it might be the case that $~$gfgf(a) = a$~$, for instance. %%note:And maybe there is a repeat somewhere to the left, too.%%

Similarly, we can fix some $~$b \in B$~$, and consider $$~$\dots g^{-1} f^{-1}(b), f^{-1}(b), b, g(b), f(g(b)), \dots$~$$

Every element of $~$A$~$, and every element of $~$B$~$, appears in one of these chains. Moreover, if $~$a \in A$~$ appears in two different chains, then in fact the two chains are the same %%note:Though maybe we're looking at a different place on the same chain. If we compare the $~$g^{-1} f^{-1}(b)$~$-chain with the $~$b$~$-chain, we see they're the same chain viewed two different ways: one viewing is two places offset from the other viewing.%%, because each element of the chain specifies and is specified by the previous element in the chain (if it exists) and each element of the chain specifies and is specified by the next element of the chain.

So every element of $~$A$~$ and every element of $~$B$~$ is in exactly one chain. Now, it turns out that there are exactly four distinct "types" of chain.

Have we actually made a bijection? Certainly our function is well-defined: every element of $~$A$~$ appears in exactly one chain, and we've specified where every element of $~$A$~$ in any chain goes, so we've specified where every element of $~$A$~$ goes.

Our function is surjective, because every element of $~$B$~$ is in a chain; if $~$b \in B$~$ has an element $~$a$~$ of $~$A$~$ before it in its chain, then we specified that $~$h$~$ takes $~$a$~$ to $~$b$~$, while if $~$b \in B$~$ is at the leftmost end of its chain, we specified that $~$h$~$ takes $~$g(b)$~$ (that is, the following element in the chain) to $~$b$~$.

Our function is injective. Since the chains don't overlap, and the first three cases of "what a chain might look like" have no repeats at all, the only possible way an element of $~$B$~$ can be hit twice by $~$h$~$ is if that element lies in one of the cyclical chains. But then to elements of $~$A$~$ in that chain, $~$h$~$ assigns the following element of $~$B$~$; so $~$b \in B$~$ is hit only by the preceding element of $~$A$~$, which is the same in all cases because the chain is a cycle. %%note:The picture in the linked intuitive document makes this much clearer.%%

Proof from the [knaster_tarski_theorem Knaster-Tarski theorem]

This proof is very quick, but almost completely opaque. It relies on the [knaster_tarski_theorem Knaster-Tarski fixed point theorem], which states that if $~$X$~$ is a [complete_poset complete poset] and $~$f: X \to X$~$ is [order_preserving_map order-preserving], then $~$f$~$ has a [-fixed_point] (i.e. $~$x$~$ such that $~$f(x) = x$~$).

Let $~$f: A \to B$~$ and $~$g: B \to A$~$ be injective.

We are looking for a [set_partition partition] $~$P \cup Q$~$ of $~$A$~$, and a partition $~$R \cup S$~$ of $~$B$~$, such that $~$f$~$ is injective from $~$P$~$ to $~$R$~$, and $~$g$~$ is injective from $~$S$~$ to $~$Q$~$. (Then we can just define our bijection $~$A \to B$~$ by "do $~$f$~$ on $~$P$~$, and do $~$g^{-1}$~$ on $~$Q$~$".)

Now, the function $~$P \mapsto A \setminus g(B \setminus f(P))$~$ is order-preserving from the Power set $~$\mathcal{P}(A)$~$ to $~$\mathcal{P}(A)$~$ (ordered by inclusion), because there is an even number of complements.

But $~$\mathcal{P}(A)$~$ is complete as a poset ([power_set_poset_is_complete proof]), so by Knaster-Tarski there is a set $~$P$~$ such that $~$P = A \setminus g(B \setminus f(P))$~$.

This yields our partition as required. [todo: spell this out]