
Friday, September 20, 2019

Misunderstood Rules in Sports, Part One of a Trillion

Because I apparently don't have enough random crap on my plate, I occasionally participate on Quora.  I'm there as Brian Tung; I'm not hard to find, other than you actually have to want to find me, and so far, that's not a very common thing.

Anyway, I often find myself embroiled in various debates (generally well-mannered, if not always good-natured) about various sports rules.  Most recently, the question was about passes or shots that go over the backboard.  For example, should this shot from 2009 by Kobe Bryant count?


Or how about this one from Jamal Murray, in 2019?


The common feeling is that these should not count, because the ball goes over the backboard, and everyone knows that a ball that goes over the backboard is out of bounds, right?

Right?

Well, it's complicated.  Complicated enough that I'm just going to drop this here for the next time this comes up.  Here's Rule 8, Sections II.a and II.b from the official NBA site:

a. The ball is out-of-bounds when it touches a player who is out-of-bounds or any other person, the floor, or any object on, above or outside of a boundary or the supports or back of the backboard.

This part of the rule is about what the ball touches, not where it goes.  There's a bit of excitement in that it uses the word "above," but in context, I think it's pretty clear that it refers to the ball touching something or someone above the boundary (the out-of-bounds line).

b. Any ball that rebounds or passes directly behind the backboard, in any direction, or enters the cylinder from below is considered out-of-bounds.

This is the relevant part.  Note that it uses the wording "directly behind the backboard."  To me, that means you take the backboard, and project it back away from the court; anytime the ball passes through that imaginary three-dimensional box, it's out of bounds.  It says nothing about the ball passing over the backboard.  If it meant that, I think it would have said that.

In both cases, the ball clearly goes over the backboard, but it never goes directly behind the backboard.  In the case of Kobe's shot, the best angle in this video (pretty poor resolution, but it was the best I could find) is found at about 0:48.  As for Murray's shot, well, read on.

I think the phrase "directly behind" is crucial.  It isn't enough that the ball go behind the plane of the backboard (which is four feet inside the baseline, so that would happen all the time).  It has to go somewhere where, if you were to look from the opposite baseline, you would see the ball through the backboard, not around it.

If you go online, you will see a majority of the web sites that discuss this question insist, quite authoritatively, that such shots are not to be counted.  As irritating as I sometimes find this, it's sort of understandable, because the wording of the rule is a bit terse, and also because the rules vary from governing body to governing body, as well as era to era.  For instance, these shots would be illegal in the NCAA:

Rule 7-1-3.  The ball shall be out of bounds when any part of the ball passes over the backboard from any direction.

This rule is stated again, almost verbatim, as Rule 9-2-2.

On the other hand, they're legal in FIBA:

Rule 23.1.2.  The ball is out-of-bounds when it touches:
  • A player or any other person who is out-of-bounds.
  • The floor or any object above, on or outside the boundary line.
  • The backboard supports, the back of the backboards or any object above the playing court.

So there's some excuse for getting this wrong (plus they eschew the Oxford comma, but that's another blog post for another time).  If that's not enough, the rule in the NBA has changed—see the postscript below.

Fortunately, we have an approved ruling, from none other than Joe Borgia, NBA Senior Vice President of Replay and Referee Operations (I'll bet you already knew that):

 
Jamal Murray's shot is discussed as the third case, at about 1:38 of the video.

"...When you look at this angle, our rule is the ball cannot pass directly behind the backboard.  So when you saw that replay, you saw the ball went up, and it went over, but it never went directly behind it.  Otherwise, we would have seen it through the glass; that would have been illegal.  But up and over is fine, so that is a good basket."

I think that should settle the matter fairly nicely.

---

Here's more from Borgia:

"The old rule stated it was illegal when the ball went over the backboard (either direction). So imagine the backboard extending up to the roof—if the ball bounced off the rim and hit any part of the imaginary backboard a violation was assessed. We had too many game stoppages when the ball bounced over the edge so we changed the rule to say the ball cannot go directly behind the backboard. That is why I said the backboard is now an imaginary ‘tunnel’ that goes back, not up to the roof like in the old rule."

Monday, June 2, 2014

Fine, I'll Take It

So, this happened.  And I have to wonder—are we supposed to be impressed by this fine?  Because I'm pretty sure Phil Jackson isn't.

I don't know if Phil was aware that this was a violation of league rules.  I kind of suspect that he was; it doesn't strike me as the sort of thing he'd do without even considering whether it broke the rules.  I don't say that just because I'm somehow impressed with his knowledge of league restrictions.  I say it because this tampering makes sense strategically.

Listen: The Clippers are going to be sold for somewhere in the neighborhood of $2 billion.  If you didn't hear that correctly, do not pass GO, just return to the beginning of this paragraph.  Two billion dollars.  The Clippers.  I really admire (I won't go so far as to say "love" or even "like") the current incarnation of this team.  They hustle, they want to win, and for once, they have the talent to do it.  They remind me of the Lakers in the late 1990s.  But even the Lakers of the 1990s had some history.  What do the Clippers have?

And yet a former Microsoft CEO, whose previous claim to Internet fame was a clip in which he repeated the word "developers" approximately a zillion times, but who otherwise doesn't actually seem insane, felt the Clippers were worth $2 billion.  (Sorry if this grosses you out.)

Against that backdrop, consider what Phil Jackson has to gain by mentioning Derek Fisher's name in advance of the Thunder's ouster from the Western Conference Finals: Fisher now knows that he's wanted, on the short list for the Knicks job.  Is Fisher the best man for the job?  I don't know.  He has a reputation for clutch (built in part upon this shot), he's earned respect from much of the league outside of Salt Lake City fans, and he's done it with seemingly very little in the way of natural physical gifts.  He's not a preternatural baller the way his longtime backcourt mate Kobe Bryant is.  It's quite conceivable that he could turn out to be a successful NBA coach.  Given the Knicks' recent history, that bar is not set excessively high.  Jackson's words have made it a bit more likely that Fisher will lean toward New York than he would have otherwise.

So let's suppose that the Knicks are currently worth as much as the Clippers are, that their current state of basketball inferiority is compensated for by the fact that they are New Friggin' York.  The team finished with 37 wins this past season, a .451 clip.  How much do you think they'd be worth if they finished at .500 (41 wins)?  How much if they finished at .600 (49 wins)?  I think, conservatively, the team would increase its net value by at least $10 million per additional win to start with, and each successive win would only increase that margin.  And Jackson's supposed to be worried about $25,000?

Admittedly, Jackson doesn't get all of that increase in value.  That's James Dolan's.  Still, Dolan has to pay Jackson, and he'd be a lot happier about paying Jackson if his team were suddenly worth $100 million more.  The more candidates Jackson has to choose from, the more likely it is that the team will make that leap.  That's the real value of the so-called tampering with Derek Fisher: It makes it more likely that Jackson will have him to choose from.  Nothing in his words binds him to choose Fisher at all.  There's very little downside, compared to that negligible $25,000 fine.

So what's it worth, exactly?  I'll take a look at that in a future post, but for now, I'm confident Phil Jackson knows what he's doing.

Wednesday, April 24, 2013

When We Flew

[Another Facebook post cannibalized for National Poetry Month.  At the time I wrote this, Kobe had not yet suffered his season-ending Achilles injury.]

I was watching yet another YouTube clip of Kobe wowing us with his athleticism and wizardry, and I started thinking about how many of the highlights were in another century. Hard as it may seem to believe at the moment, there will come a day when Kobe will no longer be able to dunk. It might not come this decade (hell, if MJ is any indication, it might not come the next, either), but it will come.

Anyway, I started getting a bit depressed about that, and so as if to bring myself out of that funk, I started scribbling some lines. And I found that it actually sort of helped, a little. I hasten to emphasize that all this has nothing at all to do with the fact that a birthday is coming up, or anything like that. That is so a coincidence.

It may read as though it's about other things, and it can be. But I really did write it with basketball in mind.

when we flew

When we flew,
we made legends.
We startled and we stunned,
and foes grasped at us in vain.
Our wings would never tire,
and our lungs never fail.
The world lived a thousand times
and never knew how close it had come,
and all because we flew
     when we flew.

When we flew,
time stood to watch,
then travelled back to watch again,
hardly daring to believe.
Space cleared space for us,
and light held us in her gaze.
The stars shone their mute fanfare
shattering their crystal spheres,
and all because we flew
     when we flew.

Now we stand,
make way while children soar.
We wear our pride like envy,
and dress our unease in longing.
We envision battles we will never fight,
and so we shall never lose.
A thousand times we'll close our eyes and ears
and sip champagne from glass slippers,
and all because we flew
     when we flew.

Copyright © 2012 Brian Tung

Thursday, March 21, 2013

Mad as March


Note: This post has been updated to correct some of the probability figures, and to mention UMBC's defeat of Virginia in the 2018 tournament.
 
Hey, it's March, it's mad, it's March Madness!


Which means that there's math, too.

I was inspired to write math this time (as opposed to all those other times) by a short video in which some mathematics guy explained why filling out a perfect bracket (ignoring the "First Four" and focusing only on the 64 teams in the "real" tournament) is so hard.  According to him, it's because there are 63 games, each one eliminating one of the 64 teams, and each game has to be prognosticated correctly in a perfect bracket.  The number of possible brackets is therefore 2 to the 63rd power, or about 9 times 10 to the 18th power.  And so the odds of filling out a perfect bracket are 1 in that enormous number.  Even if everyone in the whole world filled out a bracket, the odds are still a billion to one against anyone getting it all right.
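That 2-to-the-63rd figure is easy to verify; here's a minimal Python sketch (the language choice is just for illustration):

```python
# Treating each of the 63 games as an unpredictable coin flip gives
# 2**63 possible brackets.
n_brackets = 2 ** 63
print(n_brackets)  # 9223372036854775808, about 9 * 10**18

# Even with one random bracket per person on Earth, the odds against
# anyone's being perfect are still over a billion to one.
world_population = 7_000_000_000
print(n_brackets / world_population)  # roughly 1.3 billion to one
```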

As viewers too numerous to list pointed out (correctly), this line of reasoning is entirely bogus because it assumes that each of the 63-game sequences is equally likely.  Of course they aren't.  Higher seeds are more likely to win their games than lower seeds.  In particular, in the 28 years they've had 64 teams in the tournament, no 16 seed has ever beaten a 1 seed.  That doesn't mean it's impossible, or that it'll never happen, only that it's very unlikely on a game-to-game basis.  Eventually, though, it's inevitable, provided the tournament goes on year after year. (ETA 2024-03-29: Sure enough, it's happened twice in the last several years. In 2018, the 16th seed University of Maryland Baltimore County Retrievers beat the top seed Virginia Cavaliers by twenty, 74–54, and in 2023, the 16th seed Fairleigh Dickinson Knights beat the top seed Purdue Boilermakers by the more sedate score of 63–58. Both Cinderellas went on to lose their second-round matchups, though.)

The upshot is that certain brackets are more likely to be correct than others, and I've even heard tell that people have filled out perfect brackets in the past.  So I wondered to myself: What are the odds of someone filling out a perfect bracket?

To estimate that (because this really isn't something you can determine empirically), I had to construct a model for simulating tournaments.  This has to be done because although there are plenty of statistics for how often a 5 seed beats a 12 seed (because that pairing always meets four times in the first round, once for each of the four regions), there aren't going to be statistics for how often a 5 seed beats a 14 seed, because that could only happen in the regional finals, after both teams have defeated three teams (mostly teams better than they are), which is highly unlikely.  In fact, I'm not sure that it's ever happened.  I needed a model that would be reasonably simple to calibrate, quick to evaluate, and generally applicable to any pair of seeds.

The model I decided upon, fairly quickly, works as follows.  Each team has a certain "strength," which depends only on its seeding.  Then, if teams A and B meet, one with strength SA and the other with strength SB, the probabilities of each of them winning are given by

P(A wins) = SA / (SA+SB)
P(B wins) = SB / (SA+SB)

Fairly straightforward. (ETA 2024-01-21: This turns out to be the Bradley-Terry model. I'm sure it's been independently reinvented many times, since it's such a natural idea.) I calibrated it by finding statistics on how often teams of different seeds made it to various stages of the tournament.  I was a bit surprised, actually, by some of the statistics. I had imagined that the chances of winning the first-round game would decrease very little from the 1 seed down to about the 4 seed or so, and then accelerate quickly down to about the 13 seed, and then decrease very slowly again.  (I was aware that there were some seeds that historically have won more often than you might expect, such as the 12 seed, but I assumed those were statistical anomalies that one could reasonably expect to show up in only thirty years of history.  In particular, I did not want to assume that 12 seeds, for instance, were magically better than 11 seeds, or even 5 seeds.  I assumed seeds properly reflected relative talent.)

But no such pattern appeared.  Instead, the probability of winning the first round seems to decrease fairly steadily from the 1 seed down to the 16 seed.  The 12 seed teams do seem to win slightly more than you might expect, but they win no more often than do 11 seeds.  So here, at least, the 12 seed bump wasn't great.  (EDIT: Ha!  Both 12 seeds playing on the first day of the 2013 tournament won: the Oregon Ducks, and my California Golden Bears.  Further EDIT: And now 12th-seeded Ole Miss.  And 13th-seeded La Salle, for that matter.)

While I was trawling for these statistics, by the way, I also came upon the assertion that the way teams were seeded (and in particular, not re-seeded after each round) placed a penalty on the middle seeds, so that the 12 and 13 seeds, it was claimed, were in some cases likely to advance further than the 8 and 9 seeds, for instance.  I thought it might be interesting to see if that came out of the model.

Anyway, as a result of these statistics, I came up with the following strengths for the 16 seeds:

1, 1/2, 1/3, 1/4, 1/5, 1/6, 1/7, 1/8, 1/9, 1/10, 1/11, 1/13, 1/17, 1/25, 1/41, 1/73

You may notice that these strengths imply that a 16 seed will win the first round 1/74 of the time.  (But they never win, I can hear you saying.  Nonsense.  You're only saying that because no one ever has. ETA: It happened! In 2018, the 16 seed University of Maryland, Baltimore County defeated the 1 seed Virginia. As of this writing, before the 2023 tournament, I think that makes the 16 seed one for 148, or exactly half as often as predicted by my wholly ex recto model.)  As you can see, the top 11 seeds have strengths that decrease harmonically—that is, as the reciprocal of the seed.  After that...well, maybe if you're dorky enough, you'll see what the pattern is.
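Those strengths make the first-round figures a one-line computation; here's a minimal Python sketch (the language choice is mine, purely for illustration):

```python
# Strengths by seed, as listed above: harmonic through the 11 seed,
# then 1/13, 1/17, 1/25, 1/41, 1/73.
strength = {s: 1.0 / d for s, d in zip(
    range(1, 17), [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 17, 25, 41, 73])}

def win_prob(a, b):
    """P(seed a beats seed b) under the strength model."""
    return strength[a] / (strength[a] + strength[b])

# In the first round, seed s plays seed 17 - s.
for s in range(1, 17):
    print(f"{s:2d} seed wins First Round with probability {win_prob(s, 17 - s):.6f}")
```

In particular, `win_prob(1, 16)` comes out to 73/74, matching the "they win 1/74 of the time" remark above.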

To be sure, I don't really know how accurate this model is, largely because (as I mentioned previously) statistics for many of the match-ups just don't exist, or at least don't exist in sufficient quantity.  But since I'm taking the results with a grain of salt (I only need rough order-of-magnitude estimates), it only has to be moderately accurate for it to do what I want.

Anyway, my simulation engine runs through each of the 32,768 possible regional brackets (each of the regions is identically seeded) and determines how likely each bracket is, and how far each seed got in that bracket.
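Enumerating all 32,768 outcomes works, but the same numbers also fall out of a compact dynamic program: track each seed's probability of surviving, and merge adjacent sub-brackets one round at a time. Here's a sketch in Python (the strengths and seeding order are from the post; the code itself is just illustrative):

```python
# Strengths by seed, as given earlier in the post.
strength = {s: 1.0 / d for s, d in zip(
    range(1, 17), [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 17, 25, 41, 73])}

# Standard region bracket order: 1v16, 8v9, 5v12, 4v13, 6v11, 3v14, 7v10, 2v15.
order = [1, 16, 8, 9, 5, 12, 4, 13, 6, 11, 3, 14, 7, 10, 2, 15]

# Each group maps seed -> probability that seed has reached the current round;
# merging two adjacent groups plays out one game between their survivors.
groups = [{s: 1.0} for s in order]
reach = {}
for round_name in ["Second Round", "Sweet Sixteen", "Elite Eight", "Final Four"]:
    next_groups = []
    for a, b in zip(groups[0::2], groups[1::2]):
        merged = {}
        for side, other in ((a, b), (b, a)):
            for s, p_s in side.items():
                # s advances if it got this far and beats whichever
                # opponent emerges from the other sub-bracket.
                merged[s] = p_s * sum(
                    p_t * strength[s] / (strength[s] + strength[t])
                    for t, p_t in other.items())
        next_groups.append(merged)
    groups = next_groups
    reach[round_name] = {s: p for g in groups for s, p in g.items()}

for s in sorted(reach["Sweet Sixteen"]):
    print(f"{s:2d} seed reaches Sweet Sixteen with probability "
          f"{reach['Sweet Sixteen'][s]:.6f}")
```

After the second merge, for example, the 1 seed's survival probability works out to about 0.882, matching the Sweet Sixteen table below.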

The results were sort of interesting.  In the first round, there are no notable oddities, which makes sense because the better teams always play against worse teams, so the higher a team is seeded, the more likely it is to win the first round.

1 seed wins First Round with probability 0.986486
2 seed wins First Round with probability 0.953488
3 seed wins First Round with probability 0.892857
4 seed wins First Round with probability 0.809524
5 seed wins First Round with probability 0.722222
6 seed wins First Round with probability 0.647059
7 seed wins First Round with probability 0.588235
8 seed wins First Round with probability 0.529412
9 seed wins First Round with probability 0.470588
10 seed wins First Round with probability 0.411765
11 seed wins First Round with probability 0.352941
12 seed wins First Round with probability 0.277778
13 seed wins First Round with probability 0.190476
14 seed wins First Round with probability 0.107143
15 seed wins First Round with probability 0.046512
16 seed wins First Round with probability 0.013514

(Remember to take all those digits with a sizable grain of salt.)  The second round is where it gets interesting.  Consider a moderately lower seed, like the 12 seed.  If it happens to win its first round, its second round game will be against either the 4 seed or the 13 seed.  It's likely to be against the 4 seed, but not overwhelmingly so; it will be the 13 seed about 19 percent of the time.  In that latter case, the 12 seed will actually be a slight (57 percent) favorite in the second round.  But even if it meets up against the 4 seed, it will be an underdog, but not a prohibitive one.  In such match-ups, the 12 seed beats the 4 seed about 24 percent of the time.

Compare that to one of the middle seeds (say, the 8 seed).  If it reaches the second round, it has to play against either the 1 seed or the 16 seed.  Not only is the 1 seed overwhelmingly likely to win its first round match-up, but it is also a much stronger opponent for the 8 seed than the 4 seed was for the 12 seed.  As I mentioned above, this happens because the teams are not re-seeded after each round, with the best of the surviving teams playing the worst of the surviving teams, the second best playing the second worst, and so on.  The moral of this story is that as far as the Sweet Sixteen is concerned, the pundits are right: The middle seeds are (slightly) cursed!

1 seed reaches Sweet Sixteen with probability 0.882035
2 seed reaches Sweet Sixteen with probability 0.763414
3 seed reaches Sweet Sixteen with probability 0.632753
4 seed reaches Sweet Sixteen with probability 0.496767
5 seed reaches Sweet Sixteen with probability 0.366148
6 seed reaches Sweet Sixteen with probability 0.248486
7 seed reaches Sweet Sixteen with probability 0.148009
8 seed reaches Sweet Sixteen with probability 0.064476
9 seed reaches Sweet Sixteen with probability 0.052084
10 seed reaches Sweet Sixteen with probability 0.080832
11 seed reaches Sweet Sixteen with probability 0.093788
12 seed reaches Sweet Sixteen with probability 0.082892
13 seed reaches Sweet Sixteen with probability 0.054193
14 seed reaches Sweet Sixteen with probability 0.024973
15 seed reaches Sweet Sixteen with probability 0.007745
16 seed reaches Sweet Sixteen with probability 0.001405

Now that I think about it, I think it's this bump for the 10 through 13 seeds that accounts for their "upset" reputation, more than their success in the first round.

The trend continues, though with decreased intensity, in the third round, in order to reach the Elite Eight (although notice that there's a bump at both the 6 seed and the 10/11 seed)...

1 seed reaches Elite Eight with probability 0.732698
2 seed reaches Elite Eight with probability 0.510341
3 seed reaches Elite Eight with probability 0.302688
4 seed reaches Elite Eight with probability 0.127560
5 seed reaches Elite Eight with probability 0.081095
6 seed reaches Elite Eight with probability 0.081461
7 seed reaches Elite Eight with probability 0.056441
8 seed reaches Elite Eight with probability 0.025441
9 seed reaches Elite Eight with probability 0.019169
10 seed reaches Elite Eight with probability 0.024748
11 seed reaches Elite Eight with probability 0.020596
12 seed reaches Elite Eight with probability 0.009123
13 seed reaches Elite Eight with probability 0.004812
14 seed reaches Elite Eight with probability 0.002918
15 seed reaches Elite Eight with probability 0.000807
16 seed reaches Elite Eight with probability 0.000101

...but interestingly, it's essentially gone by the time one reaches the Final Four.

1 seed reaches Final Four with probability 0.535913
2 seed reaches Final Four with probability 0.222277
3 seed reaches Final Four with probability 0.106313
4 seed reaches Final Four with probability 0.053660
5 seed reaches Final Four with probability 0.030044
6 seed reaches Final Four with probability 0.018613
7 seed reaches Final Four with probability 0.011601
8 seed reaches Final Four with probability 0.006982
9 seed reaches Final Four with probability 0.004848
10 seed reaches Final Four with probability 0.003929
11 seed reaches Final Four with probability 0.003042
12 seed reaches Final Four with probability 0.001761
13 seed reaches Final Four with probability 0.000753
14 seed reaches Final Four with probability 0.000221
15 seed reaches Final Four with probability 0.000039
16 seed reaches Final Four with probability 0.000004

There is a bit of an inflection in the curve at about the 10 seed, but it's still monotonically decreasing across all seeds.

Now, one might think that this is just an artifact of the strengths I chose, somewhat arbitrarily, and that either the trend itself, or its vanishing by the Final Four, might not arise with a different set of strengths.  For what it's worth, I thought that myself, and tried running the simulations with different sets of strengths.

As it turns out, with any reasonable set of strengths that I chose, the trend of the middle seeds being somewhat worse off at the Sweet Sixteen than the moderately lower seeds might arise or not, but if it did, it always disappeared by the Final Four.  I didn't find any set of strengths that retained the trend all the way through to the Final Four.  Heuristically, I think this is because by the time you get to the regional finals, you have to play the best teams in the entire bracket; you can't avoid anyone for sure.  So if your goal is to reach the Final Four, then have no fear: I don't think the lack of re-seeding significantly hurts you if you're in the middle seeds.  Your answer may be different, of course, if your goal is simply to make it to the Sweet Sixteen.

Oh yes, the question that first stimulated this exploration: What are the odds of filling out a perfect bracket?  Well, given the strengths as I initially had them, the most likely bracket is the one where all the favorites win every game.  For each region, this happens about one time in 123; since there are four regions in the entire bracket, the odds on the ultimate "chalk" bracket are that raised to the fourth power, or about one time in 225 million. That leaves the Final Four. Since any seeds can meet there, we'll simplify things and just assume that all three games can go any way, leading to an eight-fold increase in the number of overall tournament brackets, or 1.8 billion.
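The one-in-123 figure can be recomputed directly from the strengths: multiply the favorite's win probability over the fifteen games of an all-chalk region. A Python sketch (illustrative only):

```python
from functools import reduce

# Strengths by seed, as given earlier in the post.
strength = {s: 1.0 / d for s, d in zip(
    range(1, 17), [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 17, 25, 41, 73])}

def favorite_wins(a, b):
    """P(the better seed wins) when seeds a and b meet."""
    hi, lo = min(a, b), max(a, b)
    return strength[hi] / (strength[hi] + strength[lo])

# The pairings that actually occur in an all-chalk region:
chalk_games = [
    (1, 16), (8, 9), (5, 12), (4, 13), (6, 11), (3, 14), (7, 10), (2, 15),
    (1, 8), (4, 5), (3, 6), (2, 7),   # second round
    (1, 4), (2, 3),                   # Sweet Sixteen
    (1, 2),                           # regional final
]
p_region = reduce(lambda p, g: p * favorite_wins(*g), chalk_games, 1.0)
print(f"one chalk region: 1 in {1 / p_region:.0f}")   # about 1 in 123

# Four independent regions, then treat the three Final Four games as
# coin flips (the eight-fold increase in brackets mentioned above).
p_bracket = p_region ** 4 / 8
print(f"full chalk bracket: 1 in {1 / p_bracket:.2e}")   # about 1 in 1.8 billion
```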

Some people, not accustomed to thinking through probability, will point out that no bracket has ever turned out perfectly chalk (which is true) and suggest that a slightly off-chalk bracket is individually more likely (which isn't).  One must not confuse "most likely" with "likely."  The reason that each year the tournament turns out slightly off-chalk is that there are enormously more off-chalk brackets than chalk ones.  (Strictly speaking, of course, there is only one perfectly chalk bracket.)  The fact that these off-chalk brackets are slightly less likely than the pure chalk bracket is more than compensated for by their superior numbers.

Anyway, there are a lot of brackets filled out each year, so given that the odds on the chalkier ones are probably not too much worse than 1 in 1.8 billion, it wouldn't be surprising if somewhere along the way, someone did end up filling out a perfect bracket. As far as we know, however, it hasn't happened.

Note: By the way, if you've only a passing familiarity with sports, you may wonder what I mean by "chalk."  Chalk is a sports betting term that refers to favorites winning; the more favorites win, the chalkier the outcome.  I've heard that it came from betting lines on horse races, which were written up in chalk, but I've no good idea if that's actually so.  Anyone?

EDIT: Here's a good write-up on how the term came to be.

Monday, November 26, 2012

Going Whole Ballhog

If you're one of the tens of readers who follow me, then unless the bottom of your rock doesn't carry ESPN, you've probably heard something about this kid from Grinnell who dropped 138 on a hapless Faith Baptist Bible College basketball team.  Now, granted, this was a Division III basketball game—hardly the acme of organized basketball.  Still, as Kobe Bryant said, "I mean, I don't care what level you're at, scoring 138 points is pretty insane."  Jack Taylor is a household name now, people.

Rather predictably, there was some backlash, with some people claiming that it was rigged, or that it was selfish basketball, or at least not The Way That Basketball Should Be Played (because anything that portentous has to be written upstyle).   I can't say anything as to whether it was rigged, although it didn't look like it to me, and as with any conspiracy theories, it's easy to say something like that when you don't have to offer any proof.  All you have to do is throw out your hands and say, "It's common sense!"

But we can say something about whether it was selfish or bad basketball.  Some folks have taken it upon themselves to make a virtue out of evenly distributed teamwork.  That's fine as a matter of personal opinion, but they make a mistake, I think, who believe that it's an intrinsic virtue of basketball.  It wasn't an intrinsic virtue of basketball when Naismith put up the first peach baskets, and until someone invents a game that makes teamwork an explicit scoring feature, there won't be a sport where it's an intrinsic virtue.  (I also think that some of these folks could benefit from playing with a scoring phenom, just to see what it's like, but that's neither here nor there.)

What makes it a virtue—when it is a virtue—is that it makes a team more efficient, by and large.  On the occasions when a player goes out and consciously attempts to score a bunch, it quite frequently turns out that the other players on the team are more efficient, and thus the team as a whole would have been more efficient if the offense had been more evenly distributed.  This is a basic result from game theory.

But that didn't turn out to be the case here.  Taylor scored 138 out of his team's 179 points.  That's 77 percent.  To get those points, of course, he used up a lot of his team's possessions: 69 percent, according to ESPN.  It is a lot, but it shouldn't overshadow the fact that the rest of his team used up the remaining 31 percent of the possessions and ended up scoring only 23 percent of the points.


Let's see how that stacks up against two other phenomenal scoring performances of the past: Wilt Chamberlain's mythic 100-point night in Hershey, and Kobe's own 81-point barrage at home against the Toronto Raptors.  (Taylor nearly had 81 just in the second half.)  I'm going to ignore claims that the Warriors game was a farce in the second half, or that the Toronto Raptors were a defensive sieve; I'm only interested in the efficiency figures.

Chamberlain's Warriors scored 169 points that night, so Chamberlain scored 59 percent of his team's points, using (again according to ESPN) 47 percent of his team's possessions.  Kobe's Lakers scored 122 points, so he contributed 66 percent of his team's points, while using (ESPN again) just 51 percent of the team's possessions.


One way to look at these feats is to consider how much more efficient the individual players were than the rest of the team.  So, on a percentage basis, Taylor scored 77 percent of the points on 69 percent of the possessions, whereas the rest of the team scored 23 percent of the points on 31 percent of the possessions.  Taylor, therefore, was (77/69) / (23/31) = 1.50 times as efficient as his teammates.  Similarly, Chamberlain was (59/47) / (41/53) = 1.62 times as efficient, and Kobe was (66/51) / (34/49) = 1.87 times as efficient.

However, such a measure can easily be misleading.  If someone plays a single minute, puts up a single three-pointer, and makes it, they might (as a normal example) have 3 percent of the team's points with only 1 percent of its possessions.  By the same metric, such a player would be (3/1) / (97/99) = 3.06 times as efficient as his teammates.  What's missing is some measure of the magnitude of the player's impact.

A more representative measure of the player's efficiency impact can be obtained by considering how efficient the team would have been if the other players had used up all of their team's possessions, at the efficiency they actually exhibited.  For instance, Taylor's teammates used up 31 percent of the possessions and scored 23 percent of the team's points.  If they had continued at that same clip, but used up 100 percent of the possessions, they would have scored about 133 points, roughly 74 percent of what the team actually scored.  To put it another way, the team with Taylor was 31/23 = 1.35 times as efficient as it would have been without him.

Using that as our guideline, the Warriors with Chamberlain were 53/41 = 1.29 times as efficient as they would have been without him, and Kobe's Lakers were 1.44 times as efficient as they would have been without him.

Just as a demonstration of how amazing all of these numbers are, if a team averages a true shooting percentage of 50 percent amongst four players, and the remaining player uses up half the possessions with a true shooting percentage of 70 percent, that team is only 1.20 times as efficient as they would be without that player.  To increase their teams' efficiency as much as they did, these three athletes had to be remarkably efficient and prolific.
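Both measures boil down to arithmetic on the (points share, possessions share) pairs quoted above; here's a minimal Python sketch (figures from the post, code purely illustrative):

```python
def relative_efficiency(pts, poss):
    """How many times as efficient the player was as his teammates."""
    return (pts / poss) / ((1 - pts) / (1 - poss))

def team_multiplier(pts, poss):
    """How many times as efficient the team was, versus the counterfactual
    where the teammates used every possession at their own rate."""
    return (1 - poss) / (1 - pts)

games = {            # (share of team's points, share of team's possessions)
    "Taylor":      (0.77, 0.69),
    "Chamberlain": (0.59, 0.47),
    "Bryant":      (0.66, 0.51),
}
for name, (pts, poss) in games.items():
    print(name, round(relative_efficiency(pts, poss), 2),
          round(team_multiplier(pts, poss), 2))

# The hypothetical above: four players at 50% true shooting, one star
# taking half the possessions at 70%.
team_with_star = 0.5 * 0.70 + 0.5 * 0.50
print(round(team_with_star / 0.50, 2))   # 1.2
```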

Friday, January 27, 2012

Shot Selection and the Secretary Problem

One of my favorite problems in all of recreational mathematics is the so-called secretary problem.  In this problem, you are interviewing a hundred candidates for a secretarial position.  For the purposes of discussion, we'll assume that the various candidates have a precise suitability rating, and of course, you want to maximize this rating for your hire.  Ideally, then, you'd interview all hundred candidates first, get their ratings, and then hire the best one.

Unfortunately, that's not the way things work in this problem.  You only get the candidates one at a time, and you have to decide then and there whether or not to hire them.  Once you've rejected a candidate, they're lost to you forever.  One could, theoretically, lose the best candidate on the very first interview.

The question then is, what is your best strategy, and what is your probability of making the best possible hire using that strategy?

(By the way, if you think this problem is formulated in a politically incorrect way, I first encountered it in a form called the sultan's dowry, in which a suitor for the sultan's daughters had to select the one with the largest dowry.  If he picked the right one, he got to marry her, but if he didn't—well, let's just say an unsuccessful suitor and his head are soon parted.  But it wasn't entirely unproductive; I eventually formulated a variation called the iterated sultan's dowry, in which a second suitor, seeing the first suitor's unsuccessful head roll down the hill, gets to use the information in choosing a prospective mate, and then the third, the fourth, etc.  This variation has an interesting solution which is unfortunately too large to fit into this parenthetical comment.)

It can be shown, fairly easily, that the best strategy must be of the form "Skip the first n candidates, recording their suitability ratings.  Then choose the next candidate whose rating exceeds theirs."  The reason is that as you plow through the candidates, the probability that the best one is yet to come never increases, whereas the probability that you've already encountered the best one never decreases.  So the question reduces to figuring out what the right choice for n is.

Ultimately, following a strategy like this, you could end up choosing no candidate at all if the best candidate is already in the first n, since you've already skipped all of those.  But if you do pick a candidate, it will be number k > n.

For that one to be the best overall, the best must be among the last 100-n.  Furthermore, the best of the first k-1 candidates (that is, the best seen before the ultimate choice) must belong to the first n; otherwise you would have stopped earlier.  Now, let's work out the probability that both of these happen.  In order to do this, we have to break down the possibilities into all the different cases.

The first case is that the best candidate is the very next one—candidate number n+1—which happens with probability 1/100.  You'll choose that one provided that there's no other candidate between the first n and number n+1 that is better than the first n.  Since there are no candidates in between, that probability is 1.  So the incremental probability for this case is 1/100 times 1, or just 1/100.

The second case is that the best candidate is the one after that—candidate number n+2—which again happens with probability 1/100.  You'll choose that one provided that there's no candidate between the first n and number n+2 that is better than the best of the first n.  That will be true provided the best of the first n+1 happens to fall within the first n, so the incremental probability for this case is 1/100 times n/(n+1).

Following the same line of reasoning, the third case—that the best candidate is number n+3—provides an incremental probability of 1/100 times n/(n+2), the fourth case provides an incremental probability of 1/100 times n/(n+3), etc., until the last case—that the best candidate is number 100—provides an incremental probability of 1/100 times n/99.

Putting all these cases together, this strategy "wins" with probability 

n/100 × [1/n + 1/(n+1) + 1/(n+2) + · · · + 1/99]

It can be shown, using relatively straightforward calculus, that this expression reaches a maximum when n = 37, and yields a probability of success of about 0.37.
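If you'd rather skip the calculus, the same optimum falls out of brute force; a quick sketch (the function name win_prob is mine):

```python
# Brute-force the win probability n/100 * [1/n + 1/(n+1) + ... + 1/99]
# from the expression above, for every possible cutoff n.
N = 100

def win_prob(n):
    return (n / N) * sum(1.0 / k for k in range(n, N))

best_n = max(range(1, N), key=win_prob)
print(best_n, round(win_prob(best_n), 3))  # 37 0.371
```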

That's not a coincidence, incidentally.  For large candidate pools (and a hundred candidates qualify as a large pool), of size N, the best strategy is to skip the first n = N/e, where e = 2.71828+ is the base of the natural logarithm, and to take the earliest best candidate thereafter.  The approximate probability of success (that is, choosing the very best candidate of them all) is very close to 1/e = 0.36787+.

For a lot of people (including myself), that's rather stunning.  It implies that even if you have a million candidates, you have a strategy that picks the very best one of them with better than a one-in-three chance.

The reason I'm putting basketball in the mix is that there's a fairly straightforward application to a vital aspect of scoring: shot selection.

Consider: A possession in basketball lasts for anywhere from 0 to 24 seconds (neglecting offensive rebounds).  You can't always guarantee that you'll make the shot; the next best thing (at least before the endgame) is to select the very best shot—that is, the shot that has the best probability of going in (neglecting fouls and three-point shots).

 

In other words, ahem, optimal shot selection.

But you don't always know when that best shot is going to come, especially when you're working out of a halfcourt set.  Is it the very first one?  Is it the next best one?  Maybe the best one will come at least twenty seconds into the possession.  You just don't know.  But maybe, now, you have a rule of thumb for selecting that best shot.  You skip the ones that come in the first 24/e = 9 seconds (approximately), and take the next best one that comes thereafter.

Obviously, this rule makes lots of assumptions, such as (a) the defense is equally tenacious across the entire possession, (b) the offense is equally productive of shot opportunities across the entire possession, (c) the best shot opportunity is equally likely to come at any time during the entire possession, etc.  But to the limited extent that these assumptions are approximately valid, it's not a bad rule of thumb.  It suggests that the Phoenix Suns of the early-to-mid-2000s were a bit hasty.


But not by much.  Just a second or two.
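To see how the rule of thumb fares under exactly those assumptions, here's a Monte Carlo sketch.  The five opportunities per possession and the uniformly distributed shot qualities are my own arbitrary choices, not anything drawn from real play-by-play data:

```python
import math
import random

# Simulate possessions under the assumptions listed above: shot
# opportunities arrive at uniform random times on a 24-second clock, with
# independent, uniformly distributed qualities.  Policy: pass up anything
# in the first 24/e ~ 9 seconds, then take the first opportunity better
# than everything seen so far (or the last one, if the clock runs out).
def run_possession(rng, n_shots=5, clock=24.0):
    cutoff = clock / math.e
    shots = sorted((rng.uniform(0, clock), rng.random()) for _ in range(n_shots))
    best_quality = max(q for _, q in shots)
    best_seen = float("-inf")
    for t, q in shots:
        if t < cutoff:
            best_seen = max(best_seen, q)  # just watching, not shooting
        elif q > best_seen:
            return q == best_quality       # we shoot; was it the best shot?
    return shots[-1][1] == best_quality    # forced heave at the buzzer

rng = random.Random(0)
trials = 100_000
wins = sum(run_possession(rng) for _ in range(trials))
print(wins / trials)  # lands in the neighborhood of 1/e ~ 0.37
```

Even with only five opportunities, the best available shot gets taken roughly 37 percent of the time, close to the 1/e figure from the large-pool analysis.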

Friday, December 9, 2011

Thursday, February 17, 2011

A Little Learning (Game Theory, Part Deux)

Here, as promised, is the dangerous thing.

Suppose you're getting a sequence of playing cards, and you're trying to figure out some statistics for the playing cards. At first, the cards seem utterly random, but after a while, a pattern emerges: There are slightly more black face cards than red ones, and there are slightly more red low-rank cards than black ones. You're a statistician, so you can quantify the bias—measure the correlation coefficient between color and rank, estimate the standard error in the observed proportions, and so forth. There are rigorous rules for computing all these things, and they're quite straightforward to follow.

Except, you're playing gin rummy, and the reason you're receiving a biased sequence of cards is that you're trying to collect particular cards. If you change your collection strategy, you'll affect the bias. You may have followed all the statistical rules, but you've forgotten about the context.

It might seem entirely obvious to you, now that I've told you the whole story, what the mistake is, and how to avoid it, but I contend that a wholly parallel thing is happening in sports statistics. I'm going to talk about basketball again, because I'm most familiar with it, but the issue transcends that individual sport, yes?

I've previously touched upon this, but this time, with the first post on game theory as background, I'm actually going to go through some of the analysis. Again, we won't be able to entirely avoid the math, but I'll try to describe in words what's going on at the same time. If calculus makes you squeamish, feel free to skip the following and move down to the part in bold.

In our simple model, the offense has two basic options: have the perimeter player shoot the ball, or pass it into the post player, and have him shoot the ball. The defense, in turn, can vary its defensive pressure on the two players, and it can do that continuously: It can double team the perimeter player aggressively, double the post player off the ball, or anything in between. We'll use the principles of game theory to figure out where the Nash equilibrium for this situation is.

We'll denote the defensive strategy by b, for on-ball pressure: If b = 1, then all of the pressure is on the perimeter ball-handler; if b = 0, all of it's on the post player. An intermediate value, like b = 1/2, might mean that the defense is equally split between the two of them (man-to-man defense on each), but the exact numbers are not important; the important thing is that the defensive strategy varies smoothly, and its effects on the offensive efficiency also vary smoothly.

Each of the two offensive options has an associated efficiency, which represents how many points on average are scored when that player attempts a shot. We'll call the perimeter player's efficiency r, and the post player's efficiency s. As you might expect, both efficiencies depend on the defensive strategy, so we'll actually be referring to the efficiency functions r(b) and s(b). The perimeter player is less efficient when greater defensive pressure is placed on him, naturally, so r(b) is a decreasing function of b. On the other hand, the post player is more efficient when greater defensive pressure is placed on the perimeter player, so s(b) is an increasing function of b.

Now let's look at this situation from a game theory perspective. Will the Nash equilibrium of this system involve pure strategies, or mixed strategies? (A pure defensive strategy in this instance consists of either b = 0 or b = 1.) Right away, we can eliminate the pure strategies as follows: If the offense funnelled all of its offense through one of those players, and the defense knew it, they would muster all their defensive pressure on that player. On the other hand, if the defense always pressured one of the players, and the offense knew it, they would always have the other player shoot it. Since those two scenarios are incompatible with one another, the Nash equilibrium must involve mixed strategies. Our objective, then, is to figure out what those mixed strategies are.

The offensive mix, or strategy, we'll represent by p, the fraction of the time that the perimeter player shoots the ball. The rest of the time, 1-p, the post player shoots the ball. The overall efficiency of the offense, as a function of both strategies, is then

Q(p, b) = p r(b) + (1-p) s(b)

The objective of the defense, in setting its defensive strategy, will be to ensure that the offense cannot improve its outcome by varying its strategy p. That is, it will set the value b such that the partial derivative of Q with respect to p (not b) is equal to 0:

∂Q/∂p = r(b) - s(b) = 0

which happens when

r(b) = s(b)

in other words, when the efficiencies of the two options are equal. The offense, in setting its strategy p, will aim to zero out the partial derivative of Q with respect to b:

∂Q/∂b = p r'(b) + (1-p) s'(b) = 0

which happens when

p = s'(b) / [s'(b) - r'(b)]

where b is taken to be the point where the two efficiency curves meet, since the offense knows the defense will play there.

But let's not worry about the offensive strategy; the important thing to take away is that at the Nash equilibrium, the defense will adjust its pressure until the efficiencies of the two offensive options are equal. Let's show what that looks like graphically.

We'll see here how game theory tells us what should be common sense: If the current defensive strategy were somewhere else than at the Nash equilibrium—say, if it were further to the left—the offense could improve its outcome by shifting more of its offensive load to the perimeter player, since he's the more efficient option on the left side of the graph. The reverse holds on the right side of the graph. Only at the point where they cross is the offense powerless to improve its situation by changing its offensive mix, which is exactly the outcome the defense wants.
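To put rough numbers on the equilibrium, here's a sketch with invented linear efficiency functions; the coefficients are made up purely for illustration, not fitted to any real player:

```python
# Invented linear efficiency functions: the perimeter player's r(b) falls
# as on-ball pressure b rises, while the post player's s(b) rises with it.
r = lambda b: 1.2 - 0.5 * b    # perimeter efficiency (points per attempt)
s = lambda b: 0.7 + 0.4 * b    # post efficiency
r_prime, s_prime = -0.5, 0.4   # their constant slopes

# Defense equalizes the two options: solve r(b) = s(b).
b_star = (1.2 - 0.7) / (0.4 + 0.5)        # = 5/9, about 0.56
# Offense zeroes out dQ/db: p = s'(b) / (s'(b) - r'(b)).
p_star = s_prime / (s_prime - r_prime)    # = 4/9, about 0.44

print(round(b_star, 2), round(p_star, 2), round(r(b_star), 2))  # 0.56 0.44 0.92
```

At that equilibrium both options are worth about 0.92 points per attempt, and neither side can gain by deviating unilaterally.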

As a corollary, the exact location of the Nash equilibrium depends vitally on the efficiency functions of the offensive components. If, for instance, one of the efficiency functions drops, the observed efficiency of the offense (that is, the efficiency measured by statistics) will also drop. Let's take a look at that graphically:


In this figure, the efficiency function of the post player, represented by s(b), has dropped. This has the effect of sliding the Nash equilibrium point down and to the right, which indicates increased ball pressure and a decrease in the observed efficiency of both the post player and the perimeter player. It's important to recognize that the efficiency function of a player refers to the entire curve, from b = 0 to b = 1, but when we gather basketball statistics, we merely get the observed efficiency, the value of that curve at a single point—the point where the team strategies actually reside (in this case, the Nash equilibrium).

Consider: Why might the efficiency function of the post player drop, as depicted above? It might be because the backup post player came in. It might be because a defensive specialist post player came in. In short, it might be because of a variety of things, none of which have to do with the perimeter player and his efficiency function—and yet the perimeter player's observed efficiency (whether we're talking about PER, or WP48, or whatever) drops as a result.

There's nothing special about the perimeter player in this regard; we would see the same effect on the post player if the perimeter player (or his defender) were swapped out. In general, the observed efficiency of a player goes up or down owing, in part, to the efficiency function of his teammates.

We see here an analogy to the distinction, drawn in economics, between demand and quantity demanded. Suppose we see that sales of a particular brand of cheese spread have dropped over the last quarter. That is to say, the quantity demanded has decreased. Does that necessarily mean that demand itself has dropped? Not necessarily. It could be that a new competing brand of cheese spread has arrived on the market. Or, it could be that production costs of the cheese spread have increased, leading to a corresponding increase in price. Both of these decrease the quantity demanded, but only the former represents a decrease in actual demand. Demand is a function of price; quantity demanded is just a number. If all we measure is quantity demanded, and we ignore the price, we haven't learned all we need to carry on our business. As economists, we would be roundly criticized (and rightly so) for neglecting this critical factor.

We are, in the basketball statistics world (and that of sports statistics in general), at a point where all we measure is the number. We don't, as a rule, measure the function. We apply our statistical rules with rigor and expect our results to acquire the patina of that rigor. But we mustn't be hypnotized by that patina and forget what we are measuring. If our aim is to describe the observed situation, then the number may be all we need. But if our aim is to describe some persistent quality of the situation—as must be the case if we are attempting to (say) compare players, or if we are hoping to optimize strategies—then we are obligated to measure the function. Doing so is very complex indeed for basketball; there are an array of variables to account for, and we have at present only the most rudimentary tools for capturing them. It is OK to punt that problem for now. But in the meantime, we must not delude ourselves into thinking that by measuring that one number, we have all we need to carry on our business.

Friday, January 28, 2011

How to Be Wrong, With Statistics!

Please, just stop it. You're hurting me.

Anyone who understands statistics at all cannot dispute that Kobe Bryant does not perform well statistically, in the clutch. But anyone who understands statistics well cannot dispute that the current statistics are woefully under-equipped to discern who is the clutchiest player in the league.

Look: Nothing happens in a vacuum. We look at crunch-time statistics because it's the most exciting part of the game, when it happens. But it's only one way to condition a play.

What do I mean by condition? I mean "to restrict the characteristics of." With respect to comparing players on their clutchiosity, the objective should be to condition the crunch-time plays sufficiently that we are comparing apples to apples, and oranges to oranges. And here, as with many other aspects of basketball, we simply don't have the statistics at our disposal to do it.

For instance, suppose that we wish to compare two players, A and B. Suppose that A's offensive efficiency (points per possession) is greater than B's, with less than 24 seconds on the clock and the team tied or down no more than three points. Does that mean that A is clutchier than B?

Not at all. If B has stiffs for teammates, compared to A, then he's likely going to be faced with tighter individual defense than A, and likely earn a lower offensive efficiency than A. That's a couple of instances of "likely" in there, but the point doesn't have to be ironclad; it just has to be plausible, even probable. We just don't know enough to conclude with anything approaching certainty that A is clutchier, because we haven't conditioned on the teammates. (Or the defense, for that matter.)

Observe that this is mostly independent of what statistic you use to measure clutchiness. Suppose, instead, that you decide to use win probability increment. A player's ability to increase his team's likelihood of winning is still going to be affected by his teammates: If he passes, they will have a lower probability of scoring; if he doesn't, the defense can afford to defend him more tightly.

Of course, maybe you're OK with this kind of quality vacillating with things like which teammates a player has. But personally, I think such a measure has a certain ephemeral aspect that we don't usually associate with clutchiness.

The problem is, how can you possibly condition on the kind of teammates that a player has? Players don't change teammates the way they change their clothes (or at least they shouldn't). So what do you do?

Here's my gentle suggestion: Stop trying to answer these abstract questions statistically. I've been using outlandish forms of the word "clutch" to underscore this, in case you haven't noticed, but my point is serious. Use statistics to answer the questions they can. As the field advances, we'll be able to answer more of these questions, but in the meantime, use the same method we've been using all along: subjective observation. Western civilization didn't break down before we had PER. Nothing hinges on who people outside the game think is clutch. And mostly, stop pretending to any degree of certainty in the matter, just because a number is attached to it.

EDIT: Since I'm a fan of Kobe Bryant, one might reasonably wonder whether or not I've got a built-in bias against crunch-time statistics, since almost all of them (except perhaps a raw count of shots made in crunch time, as opposed to efficiency) point to quite a few players as being superior in the clutch. Obviously, I can't deny said bias. Quite possibly I would not be making these same arguments, or making them with quite the same degree of vehemence, if those statistics showed Bryant in a better light.

That being said, however, I don't think the question of using statistics to examine clutchitude should be predicated on how well they accord with conventional wisdom (where Bryant is, indeed, king of clutch). In my opinion, there are quite compelling fundamental arguments that straightforward linear classifiers such as PER or offensive efficiency or wins produced, conditioned on crunch time or not, are simply not reliable indicators of individual performance, and those arguments would remain valid regardless of whether I espoused them, or of whom they revealed to be the top performers, in crunch time or in the game overall.

Wednesday, January 5, 2011

Voter Mixing Equals Criterion Mixing

I'm going to talk about basketball and probability again. Wasn't that obvious from the title of this post?

It's apparently never too early to talk about the MVP award for the NBA. We're coming up on the halfway point of the season, and writers have been tracking the MVP candidates for, oh, about half a season. Nobody takes them seriously until about now, though.

One side effect of the question being taken seriously is that some wag will point out that the MVP is not—and has never been—defined precisely. In fact, I can't find anywhere where it's been defined at all by the NBA, precisely or otherwise. That leaves the voters (sportswriters and broadcasters, mostly, plus a single vote from NBA fans collectively) to make up their own definition, a situation that said wag invariably finds ludicrous.

Well, here's one wag that finds this situation perfectly acceptable. Desirable, even.

Listen: There is no way that everybody will ever agree on a single criterion for being the "most valuable player." Most valuable to whom? The team? The league? The fans? Himself? (I can think of a few players who certainly aim to be most valuable to themselves.) And what kind of value? Wins? Titles? Highlights? Basketball is entertainment, after all. There are just too many different ways to evaluate players.

Instead, we might imagine that some writers would get together at some point and define MVP as a mixture of criteria. For instance, the title of MVP could be based in equal parts—or unequal parts, for that matter—on individual output, contributions to team success, and entertainment value.

Except, I'd argue that that is exactly what we've been doing for all these years. We have all these voters, all of whom have differing ideas of what the MVP does (or should) stand for. Some people think it should be based on individual statistics (Hollinger's Player Efficiency Rating, or PER, is a current favorite). Some people think it should be based, at least in part, on team success, so team wins are an input to the decision (a 50-win minimum is a popular threshold). Still others dispense with explicit criteria altogether and vote based on reputation or flash.

Well, if exactly the same number of voters take each of those different perspectives on MVP, then we will have an MVP based in equal parts on individual output, contributions to team success, and entertainment value. And if more voters lean on individual output than on entertainment value, then the MVP make-up will show that same leaning. Voter mixing equals criterion mixing!

What's more, this criterion mixing is automatic. No committee needs to be formed, and the exact mixture evolves as the voter population evolves. If someday team success becomes more important to the basketball cognoscenti, then it'll automatically have a larger impact on MVP selection. No redefinition is necessary.
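Here's a toy sketch of that claim, using score-style voting rather than real MVP ballots; the players, the criterion ratings, and the 50/30/20 voter split are all invented:

```python
# Invented example: each voter scores every player by exactly one criterion.
criteria = {
    "individual":    {"A": 9.0, "B": 7.0, "C": 5.0},
    "team_success":  {"A": 6.0, "B": 8.0, "C": 7.0},
    "entertainment": {"A": 7.0, "B": 6.0, "C": 9.0},
}
voters = {"individual": 50, "team_success": 30, "entertainment": 20}
total = sum(voters.values())

# Aggregate tally over all 100 voters.
tally = {p: sum(voters[c] * criteria[c][p] for c in voters) for p in "ABC"}

# The same numbers, computed directly as a 50/30/20 mixture of the criteria.
mix = {p: sum((voters[c] / total) * criteria[c][p] for c in voters) for p in "ABC"}

print(all(abs(tally[p] / total - mix[p]) < 1e-9 for p in "ABC"))  # True
```

The per-voter tally is, term for term, the 0.5/0.3/0.2 weighted mixture of the three criteria: the voter split is the criterion mixture.  (Real MVP ballots are rank-ordered rather than scored, which is part of why a fully formal version of the equivalence is elusive.)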

Can this equivalence be demonstrated on any kind of formal level? In something as complex as basketball, my guess is not. But it's close enough, and intuitive enough, that I think it just doesn't make sense to gripe about the MVP lacking a precise definition. As long as each voter comes to their own decision about what it stands for, we'll get the mix that we should.

Wednesday, November 24, 2010

Too Many Damned Monkeys

What do you need more monkeys to do: (a) guarantee the writing of all of Shakespeare's plays, or (b) be able to sink an infinite number of basketball shots in a row? OK, I realize that this is entirely inconsequential, but it actually came up a couple of days ago in what would otherwise have been fairly ordinary coffeehouse conversation, so let me bring you up to speed.

The anchor point is the notion that by having an infinite number of monkeys, each of them sitting in front of a typewriter, randomly typing away, you could guarantee that one of them would surely generate a perfect typescript of Hamlet. Or Macbeth. On the other hand, you'd also guarantee that one of them would generate a "perfect" version of Astrology for Dummies.

What this is really about (since few of us are likely to corral together an infinite number of monkeys) is the so-called cardinality of possible books of arbitrary (but finite) length. Now what's cardinality? The cardinality of a finite set is simply the number of things in the set. So, for example, the cardinality of the U.S. Supreme Court justices is nine, usually. The cardinality of the English alphabet is 26. And the cardinality of the sand grains on the Earth is some almost unimaginably large number. But it's still finite.

Infinite sets are a whole 'nother kettle of fish. Maybe the simplest example of an infinite set is ℕ, the set of natural numbers: 0, 1, 2, ... We use the ellipsis (...) to indicate that the natural numbers go on, forever, without end. There is no last number; in other words, infinity is not really a number in the usual sense. Nonetheless, we might say that the cardinality of ℕ is infinity, which is conventionally denoted ∞.


But in so doing, we would be ambiguous, for as it turns out, there are different varieties of infinity. The infinity of ℕ is the smallest possible infinity, but there are larger infinities. That sounds kind of paradoxical: How could a set go on longer than forever?

Well, let's see if we can construct an infinity that's larger than the cardinality of ℕ. The first thing we might do is add some more numbers to ℕ and see if that yields a set with larger cardinality: we might add in all the negative whole numbers, to get ℤ, the set of all integers. Shouldn't ℤ, which is (naively) almost twice as big as ℕ, have nearly twice as large a cardinality?


No, and here we run into one of the fundamental differences between finite sets and infinite sets. Suppose we divide ℕ into two disjoint subsets: the odds O (1, 3, 5, ...) and the evens E (0, 2, 4, ...). Intuitively, both O and E are infinite sets. But if ℕ is the union (the sum set, so to speak) of O and E, is ℕ then doubly infinite?

Mathematicians decided that was too much. So cardinality is defined, less intuitively but more consistently, as follows. We say that the cardinality of the English alphabet is 26, because there are 26 letters in the alphabet. Another way of saying the same thing is that the letters of the alphabet can be placed into a one-to-one correspondence with the set of numbers from 1 through 26: 1-A, 2-B, 3-C, and so on, up to 26-Z. You can try a similar exercise with the U.S. Supreme Court justices.

If we define the notion of cardinality this way, then it follows that two sets have the same cardinality if there exists a one-to-one correspondence between the sets. Somewhat amazingly, then, the set of odd numbers O has exactly the same cardinality as ℕ, because one can define a one-to-one correspondence that matches each number in ℕ with a number in O, and vice versa: 0-1, 1-3, 2-5, 3-7, ..., in each case pairing a number n from ℕ with the number 2n+1 from O. It doesn't matter that one can define a correspondence in which the two sets don't match one-to-one; all that matters is that there exists at least one correspondence where they do match.

Pretty clearly, we can do the same thing with E, matching n from ℕ with 2n from E. So all three sets (ℕ, O, and E) have the same cardinality, even though O and E combine to make up ℕ. The question then arises: Are there infinite sets that can't be matched up one-to-one with ℕ, no matter how you try? It can't be ℤ, at any rate: we can match up all odd numbers m in ℕ with (-1-m)/2 from ℤ, and all even numbers n in ℕ with n/2 from ℤ.
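That pairing is concrete enough to sketch (a toy check of the ℕ-to-ℤ correspondence just described):

```python
# Toy check of the pairing of ℕ with ℤ described above:
# odd m -> (-1-m)/2, even n -> n/2.
def nat_to_int(n):
    return n // 2 if n % 2 == 0 else (-1 - n) // 2

print([nat_to_int(n) for n in range(7)])  # [0, -1, 1, -2, 2, -3, 3]
```

Every integer, positive or negative, shows up exactly once, which is all a one-to-one correspondence requires.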

Well then, what about ℚ, the set of rational numbers (all possible fractions involving only whole numbers in the numerator and denominator)? Surely that is a bigger set. But as it turns out, ℚ also has the same cardinality as ℕ, even though there are an infinite number of possible numerators and an infinite number of possible denominators. This state of affairs has led people to write such semi-sensical equations as

∞ + ∞ = ∞

since O and E combine to make ℕ, and

∞ × ∞ = ∞

since all the infinite pairings of ℕ make up ℚ. (By the way, in case you're wondering, ℕ stands for Natural Numbers, of course; ℤ stands for Zahlen, the German word for number; and ℚ stands for Quotient.)


All right, what about ℝ, the set of real numbers? Can that set be placed into a one-to-one correspondence with ℕ? Based on the way things have been going, you might suppose that they could, but in 1891, the German mathematician Georg Cantor (1845-1918) showed that in fact they could not, that ℝ was a strictly larger set than ℕ.

His argument was a clever one, employing proof by contradiction. Suppose, Cantor said, that you could find such a one-to-one correspondence. You could then write out a catalogue of real numbers, as follows:

1 - 0.14159265...
2 - 0.71828182...
3 - 0.41421356...
 
and so forth. Now, suppose you construct a new number g, using the following process: The first digit of g will be the first digit of the first number in your catalogue, plus one (with a 9 wrapping around to 0); the second digit of g will be the second digit of the second number, plus one; the third digit of g will be the third digit of the third number, plus one; and so on. We could read out g along a diagonal in our catalogue of real numbers, like this:
 
1 - 0.24159265...
2 - 0.72828182...
3 - 0.41521356...
 
So g would be the number 0.225... This number g has an amazing property: it cannot appear anywhere in our catalogue of real numbers. Why not? Because it differs from the first number at the first digit, it differs from the second number at the second digit, it differs from the third number at the third digit, ... in short, it differs from every single number in the catalogue.
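The construction is easy to mimic on a finite catalogue; this toy sketch (not the proof itself, which needs the full infinite catalogue) uses the three entries above:

```python
# Toy version of the diagonal construction: given a finite catalogue of
# decimal expansions, build a g that differs from the k-th entry at its
# k-th digit (adding one, with 9 wrapping around to 0).
def diagonal(catalogue):
    digits = []
    for k, number in enumerate(catalogue):
        d = int(number[2 + k])            # k-th digit after the "0."
        digits.append(str((d + 1) % 10))
    return "0." + "".join(digits)

g = diagonal(["0.14159265", "0.71828182", "0.41421356"])
print(g)  # 0.225
```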

We have a contradiction: Either g is not a real number, or our catalogue is not as complete as we thought it was. Well, g is clearly a real number, so the problem must lie with the other part: our catalogue is not complete. After all, we only assumed we could create such a catalogue. Since it seems we cannot, no one-to-one correspondence exists between ℝ and ℕ.

You might think that there's a simple way around this, if we simply add g to our catalogue, or rearrange it in some way. But Cantor's diagonalization argument, as it is usually called, would apply just as well to this new catalogue. No matter what catalogue you attempt to compile and amend, there's no way to avoid the construction of a real number that's nowhere in the list. The two sets fundamentally have different cardinalities, and because of that, we can't use the single symbol ∞ to denote both. Instead, mathematicians use the aleph numbers: The cardinality of ℕ is ℵ0 (pronounced "aleph-null"), and under certain commonly held assumptions, that of ℝ is ℵ1 (pronounced "aleph-one").

So what about all those scripts for Shakespeare? Each of them can clearly be entered into a computer document, which is represented by a finite string of digits in the computer. We can therefore place the set of possible scripts into a one-to-one correspondence with the integers in ℕ, meaning that the set of scripts has cardinality ℵ0, so ℵ0 monkeys would be enough for at least one monkey to write any given script. (In fact, ℵ0 monkeys would write that script.)

But what about the infinite strings of makes and misses in a basketball game? These are infinitely long strings of basketball shots (each one with ℵ0 shots in it), so there would be a one-to-one correspondence between those strings and infinitely long sequences of digits, i.e., ℝ, the reals. So it would take ℵ1 monkeys to guarantee that at least one monkey would shoot any given sequence (in particular, the one sequence consisting of all makes).

I don't even want to know about the bananas.