Monday, February 14, 2011

Matching Up in Hyperspace (or, Thirty Dancing)

Maybe it's because I've been writing about basketball a lot, but I thought today I'd do something a little different before continuing on, as promised, with a second game theory post.

A while ago, I remember reading an analogy about why it is that oil and water don't mix. (I don't remember where I read it, though, so if you recognize it, please tell me.) Is it that water molecules only "like" water molecules, and oil molecules only "like" oil molecules? Not at all—they all like water molecules!

A water molecule is often drawn as H-O-H, but that drawing is a bit misleading. The hydrogen atoms are actually attached at an angle, as below.



This one looks a bit like a Japanese cartoon character, if you ask me. At any rate, this asymmetry, top to bottom (as drawn here), means that we can speak of an oxygen end (the bottom) and a hydrogen end (the top). What's more, because of the way that electrons are arranged in each atom, the oxygen atom tends to draw electrons away from the hydrogen atoms. The oxygen end, so to speak, has more electrons hanging around it than the hydrogen end. Since electrons are negatively charged, the water molecule has a positive pole (the hydrogen end) and a negative pole (the oxygen end), and we say that water is a polar molecule.

Water molecules attract each other because they are polar. The positively charged hydrogen end of one attracts the negatively charged oxygen end of another. In steam, the gaseous form, this is almost impossible to make out, because the molecules are too far apart and energetic, bouncing around far too wildly to show any mutual attraction. However, in ice, the solid form, the attraction is much more obvious.



It's a bit hard to tell which hydrogen atoms are associated with which oxygen atom, but that's because in ice, the bonds are a bit confused. Even so, it's clear that we don't have water molecules bonding together oxygen-to-oxygen, or hydrogen-to-hydrogen. They only attach oxygen-to-hydrogen (in the hexagonal arrangement that yields those lovely snowflakes), because the molecules are polar that way. That's the way water molecules "like" each other.

Liquid water is intermediate between ice and steam. The molecules aren't fixed in place to each other as they are in ice, but neither are they bouncing wildly as they are in steam. Instead, they wander amongst each other, like people milling about in a crowd. And as they wander around, they stick to each other a bit, on account of their polarity. They attach and cohere, which makes water bead up, among other things.
What about oil molecules? Oil molecules tend to be symmetric in such a way that there is no clear polar end as there is in water. As a result, they are much less polar than water molecules. Nonetheless, being weakly polar (under appropriate circumstances), they "like" polar molecules such as water. So why don't they attach to the water molecules?

The reason is that there is only so much room for molecules to attract each other. And here's where the analogy I mentioned earlier comes into play. You often find, at a school, that the most popular kids date other most popular kids (when they date), and the least popular kids date other least popular kids (again, when they date). Why is that? Is it that the least popular kids aren't attracted to the most popular kids? Well, it might sometimes be because of that, but often, they are attracted to the most popular kids; that is, after all, part of what makes someone most popular.

What gets in the way, however, is that the most popular kids, like most others perhaps, are also attracted to the most popular kids, and since such pairings satisfy both attractions, they get paired first. Then the next most popular kids pair up with each other, and get paired next. And so on down the line. Or so the story goes.

Of course, it isn't quite that neat and clean with kids, but it is a reasonable approximation of what happens when you combine oil and water. They don't mix because the most popular water molecules hook up with other most popular water molecules, while the least popular oil molecules are left hooking up with each other.

So much for oil and water. But now let's go back to that analogy, which as it so happens is what I really wanted to talk about. (The rest of that science was just for show?!) It doesn't ring true because we all know couples where we think, "Wow, she paired up with him?" How does that happen? It happens because people aren't one-dimensional.

Suppose all people were one-dimensional. Then you could rate each person with a number x—say, from 0 to 100. (I hate it when things are rated from 1 to 100. What's middle-of-the-road on such ratings? 50.5?) In such a case, if you have two 100's, wouldn't they choose each other above all others? You couldn't easily see a 100 pairing with a 25, if there's another 100 to choose from. Under such circumstances, the nth highest-rated male would always match up with the nth highest-rated female. Just like the kids at our hypothetical school.

Note: For reasons I despise (expositional convenience, basically), I'm writing this out heterosexually. Let it be clear that this isn't mandated in any way, and I'm aware of that. This treatment unfortunately makes it easiest for me to separate out two groups and draw what amounts to a bipartite graph between them. Sorry!

We might say, callously, that only one pair of people would say they feel completely satisfied with the pairing; everyone else is "envious" in the sense that there's someone else with whom they would rather have paired up. That's inevitable with one-dimensional people.

So let's give people another dimension: Let them now be rated with two numbers (x, y). Now, there is no universal and complete ordering on people. We might agree that if someone has both numbers higher than someone else, they are more appealing, but there is no universally accepted way to compare two people with one number higher and one number lower. This is akin to the problem with PER. It's entirely possible that everyone could be envy-free.

Here's what I mean. Suppose you have three males and three females. The three males are (60, 30), (50, 50), and (30, 60). So are the three females. Now there's no way you can say that the (60, 30) male is inherently superior to the (50, 50) male, or vice versa. The same is true of any other two males, or any two females. To decide amongst the alternatives, one needs a discriminating function of some sort. Let's say your function is 2x+2y. Then you would rank your three choices 180, 200, and 180, and you would choose the (50, 50) over either the (60, 30) or the (30, 60). If, on the other hand, your function was 3x+y, you would rank your choices 210, 200, and 150, and you'd choose the (60, 30) over the other two. Finally, if your function was x+3y, you'd pick the (30, 60) first. So it's possible for each of the alternatives to be first in someone's eyes.
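Here's that arithmetic as a snippet of Python, if you want to play with other discriminating functions (the numbers are just the toy ratings from above):

```python
# The three candidates from the example, rated in two dimensions.
candidates = [(60, 30), (50, 50), (30, 60)]

def best(wx, wy):
    """Pick the candidate that maximizes the linear function wx*x + wy*y."""
    return max(candidates, key=lambda c: wx * c[0] + wy * c[1])

assert best(2, 2) == (50, 50)  # scores 180, 200, 180
assert best(3, 1) == (60, 30)  # scores 210, 200, 150
assert best(1, 3) == (30, 60)  # scores 150, 200, 210
```

Any weighting of the two dimensions picks out one of the three, and each of the three gets picked by some weighting.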

Of course, to be a completely satisfactory pairing, both sides of the pairing must feel they got the best catch. But consider the (60, 30) male. Being a high-x kind of guy, he naturally values x more than y, perhaps, and his discriminating function will reflect that. (Some people, all they care about is x.) He might be exactly the sort of guy with a function like 3x+y, and would therefore pick the (60, 30) female. She, thinking likewise, would pick the (60, 30) male back. Likewise, the (50, 50) people might pair up with each other as mutually optimal choices, and the (30, 60) people too. It doesn't have to match that way, of course; it just has to match one-to-one. Maybe the (60, 30)'s love the (30, 60)'s, for instance, and vice versa.

On the other hand, this matching leaves someone who's (40, 40) out in the cold, because no discriminating function will rate them ahead of everybody else. Whoever matched up with them would always be upset that they didn't at least match up with ol' (50, 50).

It boils down to who's on the Pareto front. The Pareto front is made up of everyone who isn't universally worse than some other option. An illustration of this in two dimensions should hopefully make it clear:

Everyone on the Pareto front could be someone's optimal choice; everyone else would be a consolation prize. It's possible that everyone would be on the front, but it's unlikely, given a random selection of people.

Let's not be too hasty, though. There's an interesting dependency between dimensionality and being on the front. In one dimension, exactly one person is on the front (barring ties); everyone else is beneath him or her. In two dimensions, it's a bit more complex, but suppose you had a hundred people, evenly spread out between (0, 0) and (100, 100). On average, maybe five people would be on the front. (The actual average is the sum 1 + 1/2 + 1/3 + ... + 1/100, which comes to about 5.2.)

Now let's increase it to three dimensions. If you have a hundred people spread out between (0, 0, 0) and (100, 100, 100), on average about 14 people would be on the front. All 14 could be the optimal choice for some prospective mate. As the number of dimensions goes up (and the number of possible discriminating functions, too!), the percentage of people on the front also goes up. With four dimensions, the average number of people on the front is 28; with five, it's 44; with six, 59—more than half! Ten dimensions are sufficient to push it up to 94, and by the time you have, oh, let's say thirty dimensions, the odds are about ten million to one in favor of every last person being on the front. Remember, it isn't necessary to have a highest value in any of the dimensions to be on the front; all you need is to not be lower than anyone else in all of the dimensions. As the number of dimensions goes up, it becomes awfully unlikely that you'll be lower than anyone else in every single dimension. We can have an entirely envy-free matching, all with the help of increased dimensionality.
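If you don't want to take my word for those averages, you can estimate them yourself with a quick simulation. Here's a sketch in Python; the counts are Monte Carlo estimates, so expect them to wobble a bit around the averages quoted above:

```python
import random

def pareto_front_size(points):
    """Count the points that no other point beats in every dimension."""
    def beaten(p, q):
        return p != q and all(a <= b for a, b in zip(p, q))
    return sum(1 for p in points if not any(beaten(p, q) for q in points))

random.seed(2011)
n, trials = 100, 50
for d in (1, 2, 3, 5):
    avg = sum(
        pareto_front_size([tuple(random.random() for _ in range(d))
                           for _ in range(n)])
        for _ in range(trials)) / trials
    print(f"{d} dimensions: about {avg:.1f} people on the front")
```

In one dimension, the front is always a single person (barring ties, which have probability zero here); in two, the average should land near that harmonic sum of about 5.2; and it climbs steadily from there as the dimensions pile up.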

OK, this may seem completely crazy, and I wouldn't blame you for calling shenanigans. Who would actually go and rank people using a set of thirty numbers? But this is exactly what one of those on-line dating sites advertises it does. Well, not exactly; it actually claims to use 29 dimensions. Why 29? I would imagine because it sounds somewhat more scientific than thirty. But beyond that, I think that they use as many as 29 because it makes it almost inevitable that you'll be on the front, that there'll be someone who you find optimal (or very nearly so), for whom you will likewise be optimal (or very nearly so). And although I think that's partly a marketing gimmick, I think there's some truth to it, too; if there weren't, the human race would have died out long ago.


I mean, how else does Ric Ocasek land Paulina Porizkova? For real, I mean!

Tuesday, February 8, 2011

A (Kind of) Gentle Introduction to Game Theory

Prefatory to more basketball talk, I want to take a bit of time out to describe what I think is a rather elegant area of mathematics: game theory. Even the name is elegant—simple and to-the-point. As its name implies, game theory is the study of game strategies and tactics from a mathematical point of view. Rather than describe its foundations and move on from there, as a textbook would, I'm going to leap right in and use game theory in a couple of simple situations, which I hope will be a less obscure way of conveying what it's all about.

Suppose that David and Joshua are playing a friendly game of Global Thermonuclear War. At some point, both players have to decide whether to launch an attack or not. If David attacks and Joshua does not, then David wins and earns 5 points (this is a game, after all) and Joshua earns 1 point for being peaceable. Conversely, if Joshua attacks and David does not, Joshua earns 5 points and David earns just 1 point. If both attack, the Earth is rendered a wasteland and neither side earns any points; if both sides do not attack, everybody wins and both sides earn 6 points. The foregoing can be summarized in tabular form, as below.

David's payoffs are shown in blue, Joshua's in red. Let's run through a simple game-theoretical analysis of GTW. If we focus on just the blue numbers (David's payoffs), we see that if Joshua attacks, David's best strategy is to stand down (1 > 0). If Joshua stands down, David's best strategy is, again, to stand down (6 > 5). No matter what Joshua does, in short, David should stand down.

Moving over to the red numbers, we come to a similar conclusion for Joshua's strategy: No matter what David does, Joshua is better off standing down (either 1 > 0, or 6 > 5). As a result, both sides stand down; the only way to win is, indeed, not to play. Whew!
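For the programmatically inclined, the same dominance check can be written out in a few lines of Python (the payoffs are the ones given above):

```python
ATTACK, DOWN = "attack", "stand down"

# payoff[(david_move, joshua_move)] = (david_points, joshua_points)
payoff = {
    (ATTACK, ATTACK): (0, 0), (ATTACK, DOWN): (5, 1),
    (DOWN, ATTACK): (1, 5), (DOWN, DOWN): (6, 6),
}

def davids_best(joshua_move):
    return max((ATTACK, DOWN), key=lambda d: payoff[(d, joshua_move)][0])

def joshuas_best(david_move):
    return max((ATTACK, DOWN), key=lambda j: payoff[(david_move, j)][1])

# Standing down is best for each player no matter what the other does.
assert davids_best(ATTACK) == DOWN and davids_best(DOWN) == DOWN
assert joshuas_best(ATTACK) == DOWN and joshuas_best(DOWN) == DOWN
```

When a single choice is best regardless of what the opponent does, as "stand down" is here, it's called a dominant strategy.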

But maybe we shouldn't all relax quite yet. Suppose that we were to adjust the payoff matrix (which is what we call that table up there). Heads of state often get a little nationalistic, and they may well decide that a world without the enemy is better after all, even if we do have to suffer from a little radioactive fallout. At the same time, perhaps it is better to go out fighting and take out the enemy, even as we ourselves are getting wiped out. Then, possibly, the payoff matrix would look like this:

The numbers have changed only a little, but the conclusion is quite different: This time, no matter what Joshua does, David is better off attacking, and no matter what David does, Joshua is better off attacking. The upshot is that both sides end up attacking and wiping each other off the map. The rest of you, I hope you look forward to serving your cockroach overlords...

In the two examples above, the eventual solution has the property that the strategy for both sides was the best they could do, no matter what the opponent did. Such a solution is called a Nash equilibrium, after John Forbes Nash, who won a Nobel Prize in economics for his work in such games. (Yes, the Beautiful Mind guy.) In fact, in each case, the winning strategy was a single choice: either "always attack" or "always stand down."

That is not always the case. Consider a rather less violent game of football (well, somewhat less violent). On a crucial third down, the Steelers can choose to run the ball, or throw the ball; the Packers, on the other hand, can choose to defend the pass or defend the run.

Here's how we might model things. We'll let the payoff be simply the chance, the probability, that the Steelers make a first down. We'll also say that if the Steelers pass, they make a first down 60 percent of the time when the Packers defend the run, but only 20 percent of the time when they defend the pass. If the Steelers run, they make a first down 50 percent of the time when the Packers defend the pass, but only 30 percent of the time when they defend the run. Here's the payoff table:

(We've included the Packers' payoffs in red, although they can be derived from the Steelers' payoff by subtracting from 100 percent.) This time, matters are not as clear cut: For obvious reasons, the Steelers' best option depends on what the Packers do, and the Packers' best option depends on what the Steelers do. But, for our purposes, they both have to show their hands at the same time. What does game theory have to say about this kind of situation?

To figure that out, we'll have to consider a new kind of strategy, called a mixed strategy. A mixed strategy (as opposed to the pure strategies we considered above) is simply one that chooses each of the options with a certain probability. For instance, one possible mixed strategy the Steelers could employ is to run the ball half the time, and pass it the other half. Similarly, the Packers could defend the pass 60 percent of the time, and defend the run 40 percent of the time. There are an infinite number of different mixed strategies both teams could employ. How do we figure out what mixed strategies are actually the best for each side?

Here's where game theory gets a bit hairy (hence the "kind of" in the title). Essentially, what the Steelers want to do is to make their strategy "resistant" against the Packers, in the sense that no matter what the Packers do, they can't damage the Steelers' chances of making their first down. And the Packers want to set their strategy so that no matter what the Steelers do, they can't improve their chances of making the first down. Such a situation, where neither side can do any better without the other side changing what they do, is also a Nash equilibrium. The brilliant thing that Nash did, which earned him that Nobel Prize, was to show that in such games, there is always a set of mixed or pure strategies that yields a Nash equilibrium.

What follows is pretty heavy mathematical stuff. If you don't want me to go all calculus upside your head, feel free to skip it and go to the conclusion in bold, below. Here's what we do. We characterize the Steelers' strategy by p, the probability that the Steelers pass the ball, and the Packers' strategy by q, the probability that they defend the pass. From the Steelers' perspective, there are four distinct possibilities:
  1. Steelers pass (p), Packers defend the pass (q): Payoff is 20 percent.
  2. Steelers pass (p), Packers defend the run (1-q): Payoff is 60 percent.
  3. Steelers run (1-p), Packers defend the pass (q): Payoff is 50 percent.
  4. Steelers run (1-p), Packers defend the run (1-q): Payoff is 30 percent.
Putting it all together, we get an expression for the Steelers' payoff S:

S = 0.2 pq + 0.6 p(1-q) + 0.5 (1-p)q + 0.3 (1-p)(1-q)

S = 0.3 + 0.3 p + 0.2 q - 0.6 pq

We want to find the value of p that makes the partial derivative of S with respect to q equal to 0. That is, we need the value of p that makes it utterly irrelevant what the Packers do with their q (as it were).

∂S/∂q = 0.2 - 0.6 p = 0

which happens when p = 1/3.
We can write a similar expression for the Packers' payoff P from their side of the payoff matrix:

P = 0.8 pq + 0.4 p(1-q) + 0.5 (1-p)q + 0.7 (1-p)(1-q)

P = 0.7 - 0.3 p - 0.2 q + 0.6 pq

and then the optimal strategy for the Packers is dictated by

∂P/∂p = -0.3 + 0.6 q = 0

which happens when q = 1/2. So the Nash equilibrium happens when the Steelers pass the ball 1/3 of the time and run the ball 2/3 of the time, and the Packers defend the run 1/2 of the time, and defend the pass 1/2 of the time. Under these strategies, the Steelers make their first down 40 percent of the time, and nothing the Steelers do on their own can increase it, and nothing the Packers do on their own can decrease it (given the payoff table). That's what makes it a Nash equilibrium.
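If you'd like to check the arithmetic, here's a quick Python sketch of the payoff expression S from above, using exact fractions:

```python
from fractions import Fraction as F

# The Steelers' first-down probability from the derivation above:
# S(p, q) = 0.3 + 0.3p + 0.2q - 0.6pq, with p = P(pass) and q = P(defend pass).
def S(p, q):
    return F(3, 10) + F(3, 10) * p + F(2, 10) * q - F(6, 10) * p * q

p_star, q_star = F(1, 3), F(1, 2)  # the equilibrium strategies derived above

# At equilibrium the Steelers convert 40 percent of the time...
assert S(p_star, q_star) == F(2, 5)

# ...and neither side can budge that number on its own: S is flat in q
# once p = 1/3, and flat in p once q = 1/2.
assert all(S(p_star, q) == F(2, 5) for q in (F(0), F(1, 4), F(1)))
assert all(S(p, q_star) == F(2, 5) for p in (F(0), F(3, 4), F(1)))
```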

Notice that none of the pure strategies work as well as the mixed strategies do. If the Steelers always ran the ball on third down, and the Packers knew that, they would just defend the run and limit the Steelers to making first downs 30 percent of the time. It's even worse if the Steelers passed all the time; they'd make a first down only 20 percent of the time. Conversely, if the Packers always defended the run, and the Steelers knew that, they'd just pass all the time and make their first down with 60 percent efficiency. And so on.

The salient thing to take from all this, though, which I'll get into in my next post, is that although, at the Nash equilibrium, the Steelers' odds of making the first down don't depend on the Packers' strategy, their best strategy does depend on how good the Packers are at defending the various options (which is represented by the payoff matrix). Although that is hardly earth-shattering in this particular case, we'll see that it has interesting repercussions when trying to rate individual player achievement.

Friday, January 28, 2011

How to Be Wrong, With Statistics!

Please, just stop it. You're hurting me.

Anyone who understands statistics at all cannot dispute that Kobe Bryant does not perform well statistically, in the clutch. But anyone who understands statistics well cannot dispute that the current statistics are woefully under-equipped to discern who is the clutchiest player in the league.

Look: Nothing happens in a vacuum. We look at crunch-time statistics because it's the most exciting part of the game, when it happens. But it's only one way to condition a play.

What do I mean by condition? I mean "to restrict the characteristics of." With respect to comparing players on their clutchiosity, the objective should be to condition the crunch-time plays sufficiently that we are comparing apples to apples, and oranges to oranges. And here, as with many other aspects of basketball, we simply don't have the statistics to do it at our disposal.

For instance, suppose that we wish to compare two players, A and B. Suppose that A's offensive efficiency (points per possession) is greater than B's, with less than 24 seconds on the clock and the team tied or down no more than three points. Does that mean that A is clutchier than B?

Not at all. If B has stiffs for teammates, compared to A, then he's likely going to be faced with tighter individual defense than A, and likely earn a lower offensive efficiency than A. That's a couple of instances of "likely" in there, but the point doesn't have to be ironclad; it just has to be plausible, even probable. We just don't know enough to conclude with anything approaching certainty that A is clutchier, because we haven't conditioned on the teammates. (Or the defense, for that matter.)

Observe that this is mostly independent of what statistic you use to measure clutchiness. Suppose, instead, that you decide to use win probability increment. A player's ability to increase his team's likelihood of winning is still going to be affected by his teammates: If he passes, they will have a lower probability of scoring; if he doesn't, the defense can afford to defend him more tightly.

Of course, maybe you're OK with this kind of quality vacillating with things like which teammates a player has. But personally, I think such a measure has a certain ephemeral aspect that we don't usually associate with clutchiness.

The problem is, how can you possibly condition on the kind of teammates that a player has? Players don't change teammates the way they change their clothes (or at least they shouldn't). So what do you do?

Here's my gentle suggestion: Stop trying to answer these abstract questions statistically. I've been using outlandish forms of the word "clutch" to underscore this, in case you haven't noticed, but my point is serious. Use statistics to answer the questions they can. As the field advances, we'll be able to answer more of these questions, but in the meantime, use the same method we've been using all along: subjective observation. Western civilization didn't break down before we had PER. Nothing hinges on who people outside the game think is clutch. And mostly, stop pretending to any degree of certainty in the matter, just because a number is attached to it.

EDIT: Since I'm a fan of Kobe Bryant, one might reasonably wonder whether or not I've got a built-in bias against crunch-time statistics, since almost all of them (except perhaps a raw count of shots made in crunch time, as opposed to efficiency) point to quite a few players as being superior in the clutch. Obviously, I can't deny said bias. Quite possibly I would not be making these same arguments, or making them with quite the same degree of vehemence, if those statistics showed Bryant in a better light.

That being said, however, I don't think the question of using statistics to examine clutchitude should be predicated on how well they accord with conventional wisdom (where Bryant is, indeed, king of clutch). In my opinion, there are quite compelling fundamental arguments that straightforward linear classifiers such as PER or offensive efficiency or wins produced, conditioned on crunch time or not, are simply not reliable indicators of individual performance, and those arguments would remain valid regardless of whether I espoused them, or of whom they revealed to be the top performers, in crunch time or in the game overall.

Wednesday, January 5, 2011

Voter Mixing Equals Criterion Mixing

I'm going to talk about basketball and probability again. Wasn't that obvious from the title of this post?

It's apparently never too early to talk about the MVP award for the NBA. We're coming up on the halfway point of the season, and writers have been tracking the MVP candidates for, oh, about half a season. Nobody takes them seriously until about now, though.

One side effect of the question being taken seriously is that some wag will point out that the MVP is not—and has never been—defined precisely. In fact, I can't find anywhere where it's been defined at all by the NBA, precisely or otherwise. That leaves the voters (sportswriters and broadcasters, mostly, plus a single vote from NBA fans collectively) to make up their own definition, a situation that said wag invariably finds ludicrous.

Well, here's one wag that finds this situation perfectly acceptable. Desirable, even.

Listen: There is no way that everybody will ever agree on a single criterion for being the "most valuable player." Most valuable to whom? The team? The league? The fans? Himself? (I can think of a few players who certainly aim to be most valuable to themselves.) And what kind of value? Wins? Titles? Highlights? Basketball is entertainment, after all. There are just too many different ways to evaluate players.

Instead, we might imagine that some writers would get together at some point and define MVP as a mixture of criteria. For instance, the title of MVP could be based in equal parts—or unequal parts, for that matter—on individual output, contributions to team success, and entertainment value.

Except, I'd argue that that is exactly what we've been doing for all these years. We have all these voters, all of whom have differing ideas of what the MVP does (or should) stand for. Some people think it should be based on individual statistics (Hollinger's Player Efficiency Rating, or PER, is a current favorite). Some people think it should be based, at least in part, on team success, so team wins are an input to the decision (a 50-win minimum is a popular threshold). Still others dispense with explicit criteria altogether and vote based on reputation or flash.

Well, if exactly the same number of voters take each of those different perspectives on MVP, then we will have an MVP based in equal parts on individual output, contributions to team success, and entertainment value. And if more voters lean on individual output than on entertainment value, then the MVP make-up will show that same leaning. Voter mixing equals criterion mixing!
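At the level of score voting, at least, the equivalence is easy to demonstrate. Here's a Python sketch; the players, the criterion scores, and the voter counts are all invented for illustration:

```python
# A toy check of "voter mixing equals criterion mixing." All names and
# numbers below are made up.
players = ["Alpha", "Beta", "Gamma"]
criteria = {
    "individual": {"Alpha": 9, "Beta": 7, "Gamma": 5},
    "team":       {"Alpha": 5, "Beta": 8, "Gamma": 7},
    "flash":      {"Alpha": 6, "Beta": 4, "Gamma": 9},
}
voters = {"individual": 50, "team": 30, "flash": 20}

# Every voter casts a score ballot based on their own pet criterion...
tally = {p: 0 for p in players}
for view, count in voters.items():
    for _ in range(count):
        for p in players:
            tally[p] += criteria[view][p]

# ...and the totals come out identical to what a committee would get had it
# agreed up front on a 50/30/20 blend of the three criteria.
blend = {p: sum(voters[v] * criteria[v][p] for v in criteria) for p in players}
assert tally == blend
```

Real MVP ballots are ranked rather than scored, which is part of why the equivalence is harder to pin down formally.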

What's more, this criterion mixing is automatic. No committee needs to be formed, and the exact mixture evolves as the voter population evolves. If someday team success becomes more important to the basketball cognoscenti, then it'll automatically have a larger impact on MVP selection. No redefinition is necessary.

Can this equivalence be demonstrated on any kind of formal level? In something as complex as basketball, my guess is not. But it's close enough, and intuitive enough, that I think it just doesn't make sense to gripe about the MVP lacking a precise definition. As long as each voter comes to their own decision about what it stands for, we'll get the mix that we should.

Wednesday, November 24, 2010

Too Many Damned Monkeys

What do you need more monkeys to do: (a) guarantee the writing of all of Shakespeare's plays, or (b) be able to sink an infinite number of basketball shots in a row? OK, I realize that this is entirely inconsequential, but it actually came up a couple of days ago in what would otherwise have been fairly ordinary coffeehouse conversation, so let me bring you up to speed.

The anchor point is the notion that by having an infinite number of monkeys, each of them sitting in front of a typewriter, randomly typing away, you could guarantee that one of them would surely generate a perfect typescript of Hamlet. Or Macbeth. On the other hand, you'd also guarantee that one of them would generate a "perfect" version of Astrology for Dummies.

What this is really about (since few of us are likely to corral together an infinite number of monkeys) is the so-called cardinality of possible books of arbitrary (but finite) length. Now what's cardinality? The cardinality of a finite set is simply the number of things in the set. So, for example, the cardinality of the U.S. Supreme Court justices is nine, usually. The cardinality of the English alphabet is 26. And the cardinality of the sand grains on the Earth is some almost unimaginably large number. But it's still finite.

Infinite sets are a whole 'nother kettle of fish. Maybe the simplest example of an infinite set is ℕ, the set of natural numbers: 0, 1, 2, ... We use the ellipsis (...) to indicate that the natural numbers go on, forever, without end. There is no last number; in other words, infinity is not really a number in the usual sense. Nonetheless, we might say that the cardinality of ℕ is infinity, which is conventionally denoted ∞.


But in so doing, we would be ambiguous, for as it turns out, there are different varieties of infinity. The infinity of ℕ is the smallest possible infinity, but there are larger infinities. That sounds kind of paradoxical: How could a set go on longer than forever?

Well, let's see if we can construct an infinity that's larger than the cardinality of ℕ. The first thing we might do is add some more numbers to ℕ and see if that yields a set with larger cardinality: we might add in all the negative whole numbers, to get ℤ, the set of all integers. Shouldn't ℤ, which is (naively) almost twice as big as ℕ, have nearly twice as large a cardinality?


No, and here we run into one of the fundamental differences between finite sets and infinite sets. Suppose we divide ℕ into two disjoint subsets: the odd numbers O (1, 3, 5, ...) and the even numbers E (0, 2, 4, ...). Intuitively, both O and E are infinite sets. But if ℕ is the union (the sum set, so to speak) of O and E, is ℕ then doubly infinite?

Mathematicians decided that was too much. So cardinality is defined, less intuitively but more consistently, as follows. We say that the cardinality of the English alphabet is 26, because there are 26 letters in the alphabet. Another way of saying the same thing is that the letters of the alphabet can be placed into a one-to-one correspondence with the set of numbers from 1 through 26: 1-A, 2-B, 3-C, and so on, up to 26-Z. You can try a similar exercise with the U.S. Supreme Court justices.

If we define the notion of cardinality this way, then it follows that two sets have the same cardinality if there exists a one-to-one correspondence between the sets. Somewhat amazingly, then, the set of odd numbers O has exactly the same cardinality as ℕ, because one can define a one-to-one correspondence that matches each number in ℕ with a number in O, and vice versa: 0-1, 1-3, 2-5, 3-7, ..., in each case pairing a number n from ℕ with the number 2n+1 from O. It doesn't matter that one can define a correspondence in which the two sets don't match one-to-one; all that matters is that there exists at least one correspondence where they do match.

Pretty clearly, we can do the same thing with E, matching n from ℕ with 2n from E. So all three sets (ℕ, O, and E) have the same cardinality, even though O and E combine to make up ℕ. The same goes for ℤ: match each odd number m in ℕ with (-1-m)/2 from ℤ, and each even number n in ℕ with n/2 from ℤ. The question then arises: Are there infinite sets that can't be matched up one-to-one with ℕ, no matter how you try?

Well then, what about ℚ, the set of rational numbers (all possible fractions involving only whole numbers in the numerator and denominator)? Surely that is a bigger set. But as it turns out, ℚ also has the same cardinality as ℕ, even though there are an infinite number of possible numerators and an infinite number of possible denominators. This state of affairs has led people to write such semi-sensical equations as

∞ + ∞ = ∞

since O and E combine to make ℕ, and

∞ × ∞ = ∞

since all the infinite pairings of ℕ make up ℚ. (By the way, in case you're wondering, ℕ stands for Natural Numbers, of course; ℤ stands for Zahlen, the German word for numbers; and ℚ stands for Quotient.)
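One way to see why all those pairings still only amount to ℵ₀ is to list the positive fractions by marching along the diagonals of the numerator/denominator grid, skipping any fraction (like 2/4) that duplicates one already listed. A hedged sketch, not the only such enumeration:

```python
from fractions import Fraction

def rationals(limit):
    # enumerate positive rationals diagonal by diagonal: each diagonal
    # holds the fractions p/q with p + q = s, for s = 2, 3, 4, ...
    seen, out = set(), []
    s = 2
    while len(out) < limit:
        for p in range(1, s):
            q = s - p
            f = Fraction(p, q)
            if f not in seen:      # skip duplicates such as 2/4 == 1/2
                seen.add(f)
                out.append(f)
                if len(out) == limit:
                    break
        s += 1
    return out

print(rationals(5))  # [Fraction(1, 1), Fraction(1, 2), Fraction(2, 1), Fraction(1, 3), Fraction(3, 1)]
```

Since every positive rational sits on some diagonal, every one eventually gets a position in the list, and that position is its partner in ℕ.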


All right, what about ℝ, the set of real numbers? Can that set be placed into a one-to-one correspondence with ℕ? Based on the way things have been going, you might suppose that it could, but in 1891, the German mathematician Georg Cantor (1845-1918) showed that in fact it could not: ℝ is a strictly larger set than ℕ.

His argument was a clever one, employing proof by contradiction. Suppose, Cantor said, that you could find such a one-to-one correspondence. You could then write out a catalogue of real numbers, as follows:

1 - 0.14159265...
2 - 0.71828182...
3 - 0.41421356...
 
and so forth. Now, suppose you construct a new number g, using the following process: The first digit of g will be the first digit of the first number in your catalogue, plus one (with 9 wrapping around to 0); the second digit of g will be the second digit of the second number, plus one; the third digit of g will be the third digit of the third number, plus one; and so on. We could read out g along a diagonal in our catalogue of real numbers, like this:
 
1 - 0.24159265...
2 - 0.72828182...
3 - 0.41521356...
 
So g would be the number 0.225... This number g has an amazing property: it cannot appear anywhere in our catalogue of real numbers. Why not? Because it differs from the first number at the first digit, it differs from the second number at the second digit, it differs from the third number at the third digit, ... in short, it differs from every single number in the catalogue.

We have a contradiction: Either g is not a real number, or our catalogue is not as complete as we thought it was. Well, g is clearly a real number, so the problem must lie with the other part: our catalogue is not complete. After all, we only assumed we could create such a catalogue. Since it seems we cannot, no one-to-one correspondence exists between ℝ and ℕ.

You might think that there's a simple way around this, if we simply add g to our catalogue, or rearrange it in some way. But Cantor's diagonalization argument, as it is usually called, would apply just as well to this new catalogue. No matter what catalogue you attempt to compile and amend, there's no way to avoid the construction of a real number that's nowhere in the list. The two sets have fundamentally different cardinalities. And because of that, we can't use the single symbol ∞ to denote their cardinalities. Instead, mathematicians use the aleph numbers: The cardinality of ℕ is ℵ₀ (pronounced "aleph-null"), and under certain commonly held assumptions, that of ℝ is ℵ₁ (pronounced "aleph-one").

So what about all those scripts for Shakespeare? Each of them can clearly be entered into a computer document, which is represented inside the computer by a finite string of digits. We can therefore place the set of possible scripts into a one-to-one correspondence with the integers in ℕ, meaning that the set of scripts has cardinality ℵ₀, so ℵ₀ monkeys would be enough for at least one monkey to write any given script. (In fact, ℵ₀ of the monkeys would write that script.)
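That claim, that any finite document corresponds to a natural number, can be made concrete: read the document's bytes as the digits of one enormous base-256 numeral. A minimal sketch (the function names are mine, not a standard API):

```python
def script_to_number(text):
    # a finite document is a finite string of bytes; read those bytes
    # as the digits of one (possibly enormous) natural number
    return int.from_bytes(text.encode("utf-8"), "big")

def number_to_script(n):
    # invert the encoding, recovering the document from its number
    return n.to_bytes((n.bit_length() + 7) // 8, "big").decode("utf-8")

n = script_to_number("To be, or not to be")
print(number_to_script(n))  # To be, or not to be
```

Since the encoding is reversible, distinct scripts get distinct numbers, which is exactly the one-to-one correspondence with (a subset of) ℕ that the argument needs.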

But what about the infinite string of makes in a basketball game? These are infinitely long strings of basketball shots (each one with ℵ₀ shots), so there would be a one-to-one correspondence between those strings and infinitely long sequences of digits (i.e., ℝ, the reals). So it would take ℵ₁ monkeys to guarantee that at least one monkey would shoot any given sequence (in particular, the one sequence consisting of all makes).
I don't even want to know about the bananas.

Friday, September 17, 2010

An Unusual Series

Which may not be all that interesting to you, unless you're interested in recreational math. For lots of you, that may be sort of an oxymoron. (Although, I'm hoping it's less likely among readers of my blog than it would be among the general population.)

Here's the idea. Start with an integer. Add its digits together. If that sum is even, halve the number (not the sum of digits) to get the next number. If the sum is odd, add one to the number.

For instance, suppose we start with the number 10. Its digits sum to 1+0 = 1, so we add 1 to get 11. Those digits sum to 1+1 = 2, so we halve it to get 11/2 = 5.5. Those digits sum up to 5+5 = 10, so we again halve the number to get 2.75. Those digits sum up to 2+7+5 = 14, so we again halve the number to get 1.375...well, I think you get the idea.
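Because every number in the series has a denominator that is a power of two, the whole thing can be run exactly with Python's Fraction type: to read off the decimal digits of n/2^k, multiply the numerator by 5^k, since n/2^k = (n·5^k)/10^k. A sketch under that observation (names my own):

```python
from fractions import Fraction

def digit_sum(x):
    # the denominator is always a power of 2, so the decimal expansion
    # of x = n / 2**k is finite: its digits are the digits of n * 5**k,
    # because n / 2**k = (n * 5**k) / 10**k
    n, d = x.numerator, x.denominator
    k = d.bit_length() - 1           # d == 2**k
    return sum(int(c) for c in str(n * 5**k))

def step(x):
    # halve the number if its digit sum is even, otherwise add one
    return x / 2 if digit_sum(x) % 2 == 0 else x + 1

def series(start, length):
    xs = [Fraction(start)]
    while len(xs) < length:
        xs.append(step(xs[-1]))
    return xs

print([float(x) for x in series(10, 5)])  # [10.0, 11.0, 5.5, 2.75, 1.375]
```

Working in Fraction rather than floating point sidesteps the roundoff problem entirely, at the cost of numerators and denominators that grow without bound.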

On the other hand, suppose you start out with the number 1. Its one digit sums to 1, so we add 1 to get 2. Its single digit sums to 2, so we halve it to get 1 again. Obviously, this series repeats forever: 1, 2, 1, 2, 1, etc.

The first eight numbers, 1 through 8, all end up at that same repeating sequence. The next number, 9, leads immediately to 10, which starts out as I worked out above, and then goes on indefinitely: Each halving adds one more digit after the decimal point (while adding one leaves the decimal part untouched), so the series never repeats, and it never reaches zero, either.

In my limited trials, every integer I've started out with either ends up with the repeating sequence 1, 2, 1, 2, 1, ..., or else it eventually merges with the same series that you get if you start with 10 (or 9, for that matter). So, two questions for those of you who might like to play with this kind of thing:
  1. Is it true that the series for any integer always either ends with the sequence 1, 2, 1, 2, 1, ..., or else merges with the series that starts with 10?
  2. Consider the series that starts with 10. As we said, it goes on forever, without repeating. What is the average of the numbers in that infinite series?
Neither of these questions can be answered definitively (as far as I can tell) with brute-force computation, although the results might be suggestive. If you do want to try some computations, use an arbitrary-precision package; our friend Bernie has already tried it with ordinary floating-point numbers (eight-byte doubles, I think), and roundoff error quickly rendered everything after about the 15th number invalid.

P.S. Don't ask me how I got started thinking about the series. It's inspired in part by this guy, but I've already forgotten how I decided to think about this variant.

Friday, September 3, 2010

Grasping at Genius

No, this isn't about me trying to become a genius. My aim is a lot more modest: trying to draw a bead on what genius is. Partly this is motivated by my last post about music, but mostly it came out of a discussion I had several years ago with a co-worker over whether athletes could be geniuses at their sport. I thought they could, and he thought not. He conceded that they had some outstanding skill, but felt that it would be demeaning the word "genius" to call it that. I was willing to be a bit more expansive with the term. One does have to be a little careful—probably half the parents out there think their precious little ones are geniuses—but limiting genius to a specified list of fields seemed unnecessarily restrictive to me.

The discussion more or less had to end there because we never really grappled with the larger issue of what genius really is, and without that any debate over whether it means anything in sports is putting the cart before the horse. I want to tackle that now, so I can go back and win the original argument.

First of all—because I'm sick and tired of hearing about it, even now—what is genius not? It is not a high IQ, or intelligence quotient. Lots of folks are intimidated by numbers (especially, but not exclusively, those who do not feel comfortable around them), to the point that any description using them feels more objective and unassailable. Well, they might be that, but what's lost when a number is attached to anything is the process by which that number was derived. If you don't know and understand that process, the number—while not exactly meaningless—is not as reliable as it sounds.

In the case of IQ, the formula is generally straightforward; what's not so clear are the principles on which questions are selected for IQ tests. If you've ever taken one, you know that questions on such tests are fairly narrowly circumscribed: which one of these things doesn't belong, how many blocks are there, numerical or word analogies, etc. The only thing that we can be sure IQ tests measure is how well someone takes IQ tests. Beyond that patently circular assertion, it gets hazy. Does it measure intelligence? How about genius? There are lots of folks who have very high IQs (Marilyn vos Savant—really? that kind of name?—comes to mind) who nonetheless evince no obvious signs of genius. To her credit, vos Savant doesn't make any claims of genius for herself.

If we can't rely on a test to identify genius, we are back to Potter Stewart's famous dictum (in his concurring opinion in Jacobellis v. Ohio regarding hard-core pornography): "I know it when I see it." So where do we see it?

If we start with the so-called hard sciences (physics and chemistry), plus mathematics, I think you'll find little argument that folks like Archimedes, Isaac Newton, Carl Friedrich Gauss, and Albert Einstein were geniuses. Expand that to all of letters and sciences, and you embrace other noted geniuses, such as Charles Darwin, Louis Pasteur, and B.F. Skinner. But maybe these get a little dicier. These are great scientists, to be sure, but what is it about them that promotes them beyond the ordinary rabble?

You might expect that things would get dicier still when we go to the fine arts, but at least in my experience I find less argument about ascribing genius to artists like Leonardo da Vinci (also an engineer), William Shakespeare, and Auguste Rodin. How about musicians? Ludwig van Beethoven, Richard Wagner, and Igor Stravinsky all wear the mantle of genius, and wear it rather comfortably at that. (Yes, I realize these are all dead white dudes. I'll get to that in a moment.)

Let's pause a while and take stock of what we have. Accepting for the sake of discussion that these people are all geniuses, what makes them so? They don't just do what ordinary people in their professions do, only better—although by and large, they do do those things better. They also don't just do what ordinary people can't do—although, again, they do do that, too. What sets them apart is that they do things that ordinary people in their profession could never even conceive of, before the geniuses did. Arthur Schopenhauer put it this way:
"Talent hits a target no one else can hit; genius hits a target no one else can see."
I must emphasize that innovation is a vital part of this. One of Newton's most important contributions to physics was a mathematical demonstration of the law of universal gravitation (the so-called "inverse square law" of gravitation) from Kepler's observations and laws of planetary orbits. That same law is derived countless times over by students in undergraduate physics classes around the world (albeit using analysis, rather than the essentially geometrical means that Newton employed). That doesn't mean that any of them, let alone each of them, is a budding Newton, for likely none of them, plucked at birth and set down in a pre-Newtonian world, could have done what Newton did. Newton's genius lay in blazing the trail that future scientists and students would follow.

In that context, then, let me add a few other names to the list: Charlie Parker, Miles Davis, Herbie Hancock. Jazz is an art form, among others, that combines composition and performance in a single moment, adding for the first time—to my list, anyway—the element of dynamism. (I don't mean to slight other performance geniuses, such as actors and stand-up comedians, but I'm trying to make a point!) Although jazz tunes are composed to a certain extent, a fundamental aspect of jazz performance is improvisation. No two jazz performances are ever exactly the same—not, at any rate, to the extent that classical music performances are alike. The music is constantly written and rewritten by each new performer that approaches it, and each new performer must contend not only with the structure of the music, but with the performers around him or her, in an endeavor that is, in the best of cases, at once collaborative and competitive. And genius denotes the ability, moment to moment, to conceive and perform what others in that situation could not even imagine.

From that point, how far of a step can it be to arrive at sports? I'm going to talk about basketball, because it's the sport with which I'm most familiar, but similar arguments could be made for other sports. (Imagine, for instance, the shots that Tiger Woods can execute that others would never even attempt, or the sudden volley, deft but fierce, of Pete Sampras.) Basketball, like jazz, requires the constant attention of the athlete to the ever-changing state of the game, from the highest level down to the smallest detail, and the ability to respond to that state, all on the spur of the moment. Where's that pick going to be in five seconds? What are the possible tactical options available to me, given the current score and time remaining? Seeing the passing lane halfway down the court is a geometric exercise in negotiating tangled world-lines in the four dimensions of space and time; to actually complete the pass, when everyone else is watching, one must summon the legerdemain of a practiced conjurer.

We think of sports as an essentially physical activity (which is probably why my co-worker could never attach the genius label to an athlete), but in its own way it is as demanding on the intellect as the most abstruse mathematical theorem, and unlike the mathematicians, who can return now and again to their labors when it suits them, the athlete has only the splittiest of split-seconds to act—or else the instant is gone. Who are we to say that genius could not act here, as well as anywhere else?



We may debate whether or not Wilt Chamberlain, Michael Jordan, or Magic Johnson merit the label of genius, whether or not what they do exceeds the conception of their colleagues. But not, in my opinion, whether the question makes sense. Even we non-geniuses can see that, I think.