What do you need more monkeys to do: (a) guarantee the writing of all of Shakespeare's plays, or (b) be able to sink an infinite number of basketball shots in a row? OK, I realize that this is entirely inconsequential, but it actually came up a couple of days ago in what would otherwise have been fairly ordinary coffeehouse conversation, so let me bring you up to speed.

The anchor point is the notion that by having an infinite number of monkeys, each of them sitting in front of a typewriter, randomly typing away, you could guarantee that one of them would surely generate a perfect typescript of Hamlet. Or Macbeth. On the other hand, you'd also guarantee that one of them would generate a "perfect" version of Astrology for Dummies.

What this is really about (since few of us are likely to corral together an infinite number of monkeys) is the so-called cardinality of possible books of arbitrary (but finite) length. Now what's cardinality? The cardinality of a finite set is simply the number of things in the set. So, for example, the cardinality of the U.S. Supreme Court justices is nine, usually. The cardinality of the English alphabet is 26. And the cardinality of the sand grains on the Earth is some almost unimaginably large number. But it's still finite.

Infinite sets are a whole 'nother kettle of fish. Maybe the simplest example of an infinite set is ℕ, the set of natural numbers: 0, 1, 2, ... We use the ellipsis (...) to indicate that the natural numbers go on, forever, without end. There is no last number; in other words, infinity is not really a number in the usual sense. Nonetheless, we might say that the cardinality of ℕ is infinity, which is conventionally denoted ∞.

But in so doing, we would be ambiguous, for as it turns out, there are different varieties of infinity. The infinity of ℕ is the smallest possible infinity, but there are larger infinities. That sounds kind of paradoxical: How could a set go on longer than forever?

Well, let's see if we can construct an infinity that's larger than the cardinality of ℕ. The first thing we might do is add some more numbers to ℕ and see if that yields a set with larger cardinality: we might add in all the negative whole numbers, to get ℤ, the set of all integers. Shouldn't ℤ, which is (naively) almost twice as big as ℕ, have nearly twice as large a cardinality?

No, and here we run into one of the fundamental differences between finite sets and infinite sets. Suppose we divide ℕ into two mutually distinct subsets: O (1, 3, 5, ...) and E (0, 2, 4, ...). Intuitively, both O and E are infinite sets. But if ℕ is the union—the sum set, so to speak—of O and E, is ℕ then doubly infinite?

Mathematicians decided that was too much. So cardinality is defined, less intuitively but more consistently, as follows. We say that the cardinality of the English alphabet is 26, because there are 26 letters in the alphabet. Another way of saying the same thing is that the letters of the alphabet can be placed into a one-to-one correspondence with the set of numbers from 1 through 26: 1-A, 2-B, 3-C, and so on, up to 26-Z. You can try a similar exercise with the U.S. Supreme Court justices.

If we define the notion of cardinality this way, then it follows that two sets have the same cardinality if there exists a one-to-one correspondence between the sets. Somewhat amazingly, then, the set of odd numbers O has exactly the same cardinality as ℕ, because one can define a one-to-one correspondence that matches each number in ℕ with a number in O, and vice versa: 0-1, 1-3, 2-5, 3-7, ..., in each case pairing a number n from ℕ with the number 2n+1 from O. It doesn't matter that one can define a correspondence in which the two sets don't match one-to-one; all that matters it that there exists at least one correspondence where they do match.

Pretty clearly, we can do the same thing with E, matching n from ℕ with 2n from E. So all three sets—ℕ, O, and E—have the same cardinality, even though O and E combine to make up ℕ. The question then arises: Are there infinite sets that can't be matched up one-to-one with ℕ, no matter how you try? We can certainly do that for ℤ, matching up all odd numbers m in ℕ with (-1-m)/2 from ℤ, and all even numbers n in ℕ with n/2 from ℤ.

Well then, what about ℚ, the set of rational numbers—all possible fractions involving only whole numbers in the numerator and denominator? Surely that is a bigger set. But as it turns out, ℚ also has the same cardinality as ℕ, even though there are an infinite number of possible numerators and an infinite number of denominators. This state of affairs has led people to write such semi-sensical equations as

∞ + ∞ = ∞

since O and E combine to make ℕ, and

∞ × ∞ = ∞

since all the infinite pairings of ℕ make up ℚ. (By the way, in case you're wondering, ℕ stands for Natural Numbers, of course; ℤ stands for Zahlen, the German word for number; and ℚ stands for Quotient.)

All right, what about ℝ, the set of real numbers? Can that set be placed into a one-to-one correspondence with ℕ? Based on the way things have been going, you might suppose that they could, but in 1891, the German mathematician Georg Cantor (1845-1918) showed that in fact they could not, that ℝ was a strictly larger set than ℕ.

His argument was clever one, employing proof by contradiction. Suppose, Cantor said, that you could find such a one-to-one correspondence. You could write out a catalogue of real numbers then, as follows:

1 - 0.14159265...

2 - 0.71828182...

3 - 0.41421356...

and so forth. Now, suppose you construct a new number g, using the following process: The first digit of g will be the first digit of the first number in your catalogue, plus one; the second digit of g will be the second digit of the second number, plus one; the third digit of g will be the third digit of the third number, plus one; and so on. We could read out g along a diagonal in our catalogue of real numbers, like this:

1 - 0.24159265...

2 - 0.72828182...

3 - 0.41521356...

So g would be the number 0.225... This number g has an amazing property—it cannot appear anywhere in our catalogue of real numbers. Why not? Because it differs from the first number at the first digit, it differs from the second number at the second digit, it differs from the third number at the third digit, ... in short, it differs from every single number in the catalogue.

We have a contradiction: Either g is not a real number, or our catalogue is not complete as we thought it was. Well, g is clearly a real number, so the problem must lie with the other part—our catalogue is not complete. After all, we only assumed we could create such a catalogue. Since it seems we cannot, no one-to-one correspondence exists between ℝ and ℕ.

You might think that there's a simple way around this, if we simply add g to our catalogue, or rearrange it in some way. But Cantor's diagonalization argument, as it is usually called, would apply just as well to this new catalogue. No matter what catalogue you attempt to compile and amend, there's no way to avoid the construction of a real number that's nowhere in the list. Those two sets fundamentally have different cardinalities. And because of that, we can't use the single symbol ∞ to denote their cardinalities. Instead, mathematicians use the aleph-numbers: The cardinality of ℕ is ℵ0 (pronounced "aleph-null"), and under certain commonly held assumptions, that of ℝ is ℵ1 (pronounced "aleph-one").

So what about all those scripts for Shakespeare? Each of them can clearly be entered into a computer document, which is represented by a finite string of digits in the computer. We can therefore place the set of possible scripts into a one-to-one correspondence with the integers in ℕ, meaning that the set of scripts has cardinality ℵ0, so ℵ0 monkeys would be enough for at least one monkey to write any given script. (In fact, ℵ0 monkeys would write that script.)

But what about the infinite string of makes in a basketball game? These are infinitely long strings of basketball shots (each one with ℵ0 shots), so there would be a one-to-one correspondence between those strings and infinitely long sequences of digits—i.e., ℝ, the reals. So it would take ℵ1 monkeys to guarantee that at least one monkey would shoot any given sequence (in particular, the one sequence consisting of all makes).

I don't even want to know about the bananas.

The anchor point is the notion that by having an infinite number of monkeys, each of them sitting in front of a typewriter, randomly typing away, you could guarantee that one of them would surely generate a perfect typescript of Hamlet. Or Macbeth. On the other hand, you'd also guarantee that one of them would generate a "perfect" version of Astrology for Dummies.

What this is really about (since few of us are likely to corral together an infinite number of monkeys) is the so-called cardinality of possible books of arbitrary (but finite) length. Now what's cardinality? The cardinality of a finite set is simply the number of things in the set. So, for example, the cardinality of the U.S. Supreme Court justices is nine, usually. The cardinality of the English alphabet is 26. And the cardinality of the sand grains on the Earth is some almost unimaginably large number. But it's still finite.

Infinite sets are a whole 'nother kettle of fish. Maybe the simplest example of an infinite set is ℕ, the set of natural numbers: 0, 1, 2, ... We use the ellipsis (...) to indicate that the natural numbers go on, forever, without end. There is no last number; in other words, infinity is not really a number in the usual sense. Nonetheless, we might say that the cardinality of ℕ is infinity, which is conventionally denoted ∞.

But in so doing, we would be ambiguous, for as it turns out, there are different varieties of infinity. The infinity of ℕ is the smallest possible infinity, but there are larger infinities. That sounds kind of paradoxical: How could a set go on longer than forever?

Well, let's see if we can construct an infinity that's larger than the cardinality of ℕ. The first thing we might do is add some more numbers to ℕ and see if that yields a set with larger cardinality: we might add in all the negative whole numbers, to get ℤ, the set of all integers. Shouldn't ℤ, which is (naively) almost twice as big as ℕ, have nearly twice as large a cardinality?

No, and here we run into one of the fundamental differences between finite sets and infinite sets. Suppose we divide ℕ into two mutually distinct subsets: O (1, 3, 5, ...) and E (0, 2, 4, ...). Intuitively, both O and E are infinite sets. But if ℕ is the union—the sum set, so to speak—of O and E, is ℕ then doubly infinite?

Mathematicians decided that was too much. So cardinality is defined, less intuitively but more consistently, as follows. We say that the cardinality of the English alphabet is 26, because there are 26 letters in the alphabet. Another way of saying the same thing is that the letters of the alphabet can be placed into a one-to-one correspondence with the set of numbers from 1 through 26: 1-A, 2-B, 3-C, and so on, up to 26-Z. You can try a similar exercise with the U.S. Supreme Court justices.

If we define the notion of cardinality this way, then it follows that two sets have the same cardinality if there exists a one-to-one correspondence between the sets. Somewhat amazingly, then, the set of odd numbers O has exactly the same cardinality as ℕ, because one can define a one-to-one correspondence that matches each number in ℕ with a number in O, and vice versa: 0-1, 1-3, 2-5, 3-7, ..., in each case pairing a number n from ℕ with the number 2n+1 from O. It doesn't matter that one can define a correspondence in which the two sets don't match one-to-one; all that matters it that there exists at least one correspondence where they do match.

Pretty clearly, we can do the same thing with E, matching n from ℕ with 2n from E. So all three sets—ℕ, O, and E—have the same cardinality, even though O and E combine to make up ℕ. The question then arises: Are there infinite sets that can't be matched up one-to-one with ℕ, no matter how you try? We can certainly do that for ℤ, matching up all odd numbers m in ℕ with (-1-m)/2 from ℤ, and all even numbers n in ℕ with n/2 from ℤ.

Well then, what about ℚ, the set of rational numbers—all possible fractions involving only whole numbers in the numerator and denominator? Surely that is a bigger set. But as it turns out, ℚ also has the same cardinality as ℕ, even though there are an infinite number of possible numerators and an infinite number of denominators. This state of affairs has led people to write such semi-sensical equations as

∞ + ∞ = ∞

since O and E combine to make ℕ, and

∞ × ∞ = ∞

since all the infinite pairings of ℕ make up ℚ. (By the way, in case you're wondering, ℕ stands for Natural Numbers, of course; ℤ stands for Zahlen, the German word for number; and ℚ stands for Quotient.)

All right, what about ℝ, the set of real numbers? Can that set be placed into a one-to-one correspondence with ℕ? Based on the way things have been going, you might suppose that they could, but in 1891, the German mathematician Georg Cantor (1845-1918) showed that in fact they could not, that ℝ was a strictly larger set than ℕ.

His argument was clever one, employing proof by contradiction. Suppose, Cantor said, that you could find such a one-to-one correspondence. You could write out a catalogue of real numbers then, as follows:

1 - 0.14159265...

2 - 0.71828182...

3 - 0.41421356...

and so forth. Now, suppose you construct a new number g, using the following process: The first digit of g will be the first digit of the first number in your catalogue, plus one; the second digit of g will be the second digit of the second number, plus one; the third digit of g will be the third digit of the third number, plus one; and so on. We could read out g along a diagonal in our catalogue of real numbers, like this:

1 - 0.24159265...

2 - 0.72828182...

3 - 0.41521356...

So g would be the number 0.225... This number g has an amazing property—it cannot appear anywhere in our catalogue of real numbers. Why not? Because it differs from the first number at the first digit, it differs from the second number at the second digit, it differs from the third number at the third digit, ... in short, it differs from every single number in the catalogue.

We have a contradiction: Either g is not a real number, or our catalogue is not complete as we thought it was. Well, g is clearly a real number, so the problem must lie with the other part—our catalogue is not complete. After all, we only assumed we could create such a catalogue. Since it seems we cannot, no one-to-one correspondence exists between ℝ and ℕ.

You might think that there's a simple way around this, if we simply add g to our catalogue, or rearrange it in some way. But Cantor's diagonalization argument, as it is usually called, would apply just as well to this new catalogue. No matter what catalogue you attempt to compile and amend, there's no way to avoid the construction of a real number that's nowhere in the list. Those two sets fundamentally have different cardinalities. And because of that, we can't use the single symbol ∞ to denote their cardinalities. Instead, mathematicians use the aleph-numbers: The cardinality of ℕ is ℵ0 (pronounced "aleph-null"), and under certain commonly held assumptions, that of ℝ is ℵ1 (pronounced "aleph-one").

So what about all those scripts for Shakespeare? Each of them can clearly be entered into a computer document, which is represented by a finite string of digits in the computer. We can therefore place the set of possible scripts into a one-to-one correspondence with the integers in ℕ, meaning that the set of scripts has cardinality ℵ0, so ℵ0 monkeys would be enough for at least one monkey to write any given script. (In fact, ℵ0 monkeys would write that script.)

But what about the infinite string of makes in a basketball game? These are infinitely long strings of basketball shots (each one with ℵ0 shots), so there would be a one-to-one correspondence between those strings and infinitely long sequences of digits—i.e., ℝ, the reals. So it would take ℵ1 monkeys to guarantee that at least one monkey would shoot any given sequence (in particular, the one sequence consisting of all makes).

I don't even want to know about the bananas.

I think of it this way: any set that can be mapped back and forth to the real number interval [0, 1] is uncountable (\aleph_1). The set of all possible shot sequences can be mapped onto [0, 1] by writing down every sequence as a string of 1's and 0's (i.e. all misses is 0000.... all makes is 1111....) and then multiplying the first digit by 1/2, the second by 1/4, the third by 1/8, and so on. In this mapping every real number [0, 1] has its own unique shot sequence that corresponds to it, and since there are uncountably many numbers in [0, 1] there also must be uncountably many shot sequences.

ReplyDeleteBy the way, the first time I came across this problem it was tempting to try and map all shot sequences to the natural numbers by multiplying the first digit by 1, the second by 2, the third by 4, etc. The problem, of course, is that sequences like 1111... would produce infinity, which doesn't belong to the natural numbers.